The Benchling Warehouse is a database solution that tracks and allows you to view assay data, registry entities, and sample data using SQL queries. For an overview of the warehouse, check out the Warehouse Overview or Warehouse Schemas ERD.

Note: This article does not cover the API, which can be used to write data to Benchling.  To access the API, please refer to Write Data via the API [Enterprise].

Configure Your Warehouse

Set up warehouse credentials

Note: Enablement of Warehouse Credential access is a feature that should have been discussed in your implementation scope. Please contact your Customer Success representative to determine if access is included in your package, and to access the Warehouse on your account.

Warehouse credentials are used to access data uploaded to the Benchling warehouse using third party tools such as Spotfire, Tableau etc.

Warehouse credentials are obtained via the Account Settings page on Benchling. From the settings page, users generate new credentials by clicking on “Create credentials” under the Warehouse credentials section.  If you do not see this option, contact your Benchling Customer Success Manager to enable access on your account.

Note: Be sure to copy the password and store it in a safe location (like your password manager) when generating credentials, since you will not be able to access these again. Benchling does not recommend hard-coding these credentials into scripts.

Connect with a tool that supports PostgreSQL

See below for instructions on configuring:

  1. The psql command-line tool and other libpq-based code libraries

  2. JDBC / Java

  3. Third party tools and SaaS


Connect Using psql

For general help with psql, see also the official documentation.

NOTE: It is very important to configure SSL / TLS correctly since psql and libpq-based clients like Python's psycopg2 won't use it by default!

The instructions here set the per-user default and will also work with Python's psycopg2 in addition to psql. If the instructions below don't work, or you need to connect to many PostgreSQL servers, you will need to refer to the official psql and libpq documentation.

On Mac OS and Linux:

Set up the ~/.postgresql/root.crt file. Open Terminal.app and run the following commands (please note this assumes you're using bash as your shell):

# Download the Aurora RDS root certificate to the default location.
curl https://s3.amazonaws.com/rds-downloads/rds-ca-2019-root.pem \
--create-dirs -o ~/.postgresql/root.crt

# Set environment variables to require full verification by default
# and make these settings apply to new shells. Use .zshrc for ZSH.
export PGSSLMODE=verify-ca
echo 'export PGSSLMODE=verify-ca' >> ~/.bashrc

On Windows:

Set up the %APPDATA%\postgresql\root.crt file:

  1. Download the CA certificate file to your Downloads folder and open that folder. You should see a new file named named rds-ca-2019-root.pem

  2. Open a second windows file explorer window (shortcut: Ctrl+N or go File > New Window) and type %APPDATA% in its address bar

  3. If you see a folder called postgresql here, great! But if not, create a new folder named exactly that. Open this folder.

    1. If you can't see or make this folder, contact your IT team

  4. Drag and drop the downloaded rds-ca-2019-root.pem from the Downloads folder to the postgresql folder

  5. Rename the rds-ca-2019-root.pem file to root.crt to set it as the default (recommended)

To connect to your Warehouse, run:

psql 'postgresql://*yourusername*@postgres-warehouse.*yourdeployname*.benchling.com:5432/warehouse?sslmode=verify-ca'

This should prompt for your password. Note: replace *yourusername* and *yourdeployname* with your individual settings.

Connect Using JDBC

Follow the same setup for psql above.


The connection string will also work for Java JDBC connections. See the PostgreSQL JDBC Connector SSL documentation for more info on the SSL / TLS settings. You will want to use the verify-ca mode. You may need to import or otherwise specify the location of the rds-ca-2019-root.pem file if it isn't in the default location.

Connect Using a Third-Party Tool or SaaS

Note: Many tools will require you to enable SSL or TLS to connect -- and it's super important that you do enable it! If you need a Root Certificate Authority ("Root CA") file, you should supply this one to your tool.

To connect to the Warehouse, use the following parameters in your data visualization tool:

  • Host Name: postgres-warehouse.yourdeployname.benchling.com. Be sure to replace yourdeployname with your enterprise specific domain (that means the part before your enterprise's benchling.com URL)

  • Port: 5432 (this is the default)

  • Database: warehouse

  • Username: the username you generated above

  • Password: the password you generated above

  • Enable SSL or TLS mode - this is very important for your data security

    • If possible, you should use the verify-ca SSL / TLS connection mode (it might be called "check hostname" or "check server certificate").

    • This will sometimes require you to download then upload this CA certificate file. If that doesn't work, please ask support to provide you a certificate specific to your Warehouse instance.

    • If the tool also asks for a client certificate, either generate your own or you can use these pre-generated .key and .crt files.

For example:


Reading data from the Warehouse

Connect your tool of choice to the Postgres warehouse (see above). For each run/result, you'll see:

  1. Raw tables (e.g. my_assay_result$raw): contain both reviewed and unreviewed data

  2. View (e.g. my_assay_result): contains only valid, reviewed data

Generally, you'll want to query the view when you want reviewed data. Query the raw table if you want to see all data, including unreviewed data. Note that each run/result is uploaded to a project, and you'll only be able to see runs/results in projects you can access. See our Getting Started doc for more information.

You will need permissions to view and edit the data you are trying to access.  This means that a user will only see data in the warehouse that that user has permission to see in the Benchling UI.  This will mean that different users may see a different number of results for certain queries and may not be able to view a linked item (shown as [No access]) in a linked field.

Permission to view runs and results is derived from the project in which they are placed. Project for a result will be specified through:

  1. If manually entered, the permissions for the assay data is the same as the permissions of the entry in which it is recorded

  2. If uploaded via a script, the project id can be specified at time of upload.

  3. When a run is included for review in a Notebook entry, we move the runs and results to the project for the runs/results


FAQ

Error trying to connect to database 'warehouse'

Two potential causes:

1) Your client or tool may not be connecting using SSL / TLS.  

To fix this, check that you followed the SSL / TLS setup instructions above. In particular, check that you have sslmode=verify-ca appended to the connection URL.

2) Bash is interpreting the username as a variable.  

To fix, make sure that you're using the correct connection string:

  • Use psql -U 'u$benchlingsupport' (with single quotes) instead of psql -U u$benchlingsupport -- be sure to replace benchlingsupport with your user ID

Error: psql: server certificate for "....rds.amazonaws.com" (and 2 other names) does not match host name postgres-warehouse.yourdeployname.benchling.com

You've tried to connect with the verify-full SSL mode. Either use the verify-ca mode or change the hostname you connect with to the *.rds.amazonaws.com one shown in the error message. The CA verification process ensures that this is a secure connection either way.

Did this answer your question?