Weekly AWS S3 backup service

Raksha

Benchling can configure weekly exports of a Benchling tenant's data to an AWS S3 bucket owned by a customer. Once configured, all of the customer's Benchling data is copied to the designated S3 bucket every Saturday. Benchling is not responsible for maintaining this data within AWS.

Set up weekly S3 backups

To request weekly S3 backups, contact your Benchling Customer Success representative or our Support team at support@benchling.com, and ensure you:

  1. Create an AWS account.
  2. Create the buckets to store backups.
  3. Share the AWS account ID and the name of your S3 bucket with your representative.
  4. Grant Benchling's AWS account access to your bucket.

Tip: Your AWS account ID is a 12-digit number, for example 371589656068. You can find it at the top of your console homepage.

Recommendations for creating your AWS account and buckets

When creating your AWS account and buckets, we recommend:

  • Naming the bucket yourname-benchling-backups or something similar. Note that AWS bucket names are global across all AWS customers and regions.
  • Placing it in a region close to you, for improved download times. The most popular regions to use are:
    • us-west-2 (Oregon USA)
    • us-east-1 (Virginia USA)
    • eu-central-1 (Frankfurt)

Grant Benchling's AWS account access

To grant Benchling's AWS account access, attach the following policy to the destination bucket, replacing YOUR_BUCKET_NAME with the destination bucket's name. Note that bucket policies must be valid JSON, which does not support comments, so delete or uncomment the optional action lines before saving.

You can attach policies in the S3 management console by opening the Permissions tab of your bucket and selecting Bucket Policy.

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AllowBenchlingAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::371589656068:root"
      },
      "Action": [
        "s3:PutObject"
        // The following may be required for audit, DB, and S3 backups:
        // "s3:GetObject",
        // "s3:AbortMultipartUpload",
        // "s3:GetObjectTagging",
        // "s3:PutObjectTagging",
        // "s3:ListMultipartUploadParts"
      ],
      "Resource": [
        "arn:aws:s3:::YOUR_BUCKET_NAME",
        "arn:aws:s3:::YOUR_BUCKET_NAME/*"
      ]
    }
  ]
}
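If you generate the policy programmatically, the optional actions are easier to toggle with a flag than with comments. The helper below is an illustrative sketch (make_bucket_policy and extra_actions are our own names, not Benchling or AWS tooling) that emits a valid, comment-free version of the policy above:

```python
import json

BENCHLING_ACCOUNT_ID = "371589656068"  # Benchling's AWS account, from the policy above

def make_bucket_policy(bucket_name: str, extra_actions: bool = False) -> str:
    """Build the bucket policy shown above as valid JSON."""
    actions = ["s3:PutObject"]
    if extra_actions:  # may be required for audit, DB, and S3 backups
        actions += [
            "s3:GetObject",
            "s3:AbortMultipartUpload",
            "s3:GetObjectTagging",
            "s3:PutObjectTagging",
            "s3:ListMultipartUploadParts",
        ]
    policy = {
        "Version": "2008-10-17",
        "Statement": [{
            "Sid": "AllowBenchlingAccess",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{BENCHLING_ACCOUNT_ID}:root"},
            "Action": actions,
            "Resource": [
                f"arn:aws:s3:::{bucket_name}",
                f"arn:aws:s3:::{bucket_name}/*",
            ],
        }],
    }
    return json.dumps(policy, indent=2)

print(make_bucket_policy("yourname-benchling-backups"))
```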

After sharing your bucket name and AWS account ID, we'll configure the backup system to write to the bucket. Backups happen once a week, on Saturday.

What data is included in the export?

S3 URL structure

s3://your-bucket/backup-path/human/YYYY/MM/DD/project_id/file_id/index.extension
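Assuming the key layout above, a downstream script can recover the export date and project from each object key. The parse_backup_key helper below is purely illustrative; the segment names follow the URL structure shown, not an official Benchling API:

```python
def parse_backup_key(key: str) -> dict:
    """Split an exported object key into the segments of the layout above:
    backup-path/human/YYYY/MM/DD/project_id/file_id/filename."""
    backup_path, marker, year, month, day, project_id, file_id, filename = key.rsplit("/", 7)
    if marker != "human":
        raise ValueError(f"unexpected key layout: {key!r}")
    return {
        "backup_path": backup_path,
        "date": f"{year}-{month}-{day}",
        "project_id": project_id,
        "file_id": file_id,
        "filename": filename,
    }

info = parse_backup_key("my-backups/human/2024/01/13/prj_abc/fil_123/index.pdf")
print(info["date"], info["project_id"])  # 2024-01-13 prj_abc
```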

Data included

Projects

  • Each project will be exported as a prefix (folder) in AWS
  • Virtual projects or folders where only entities are stored are also exported

Entities

  • Metadata fields exported as key/value pairs in a .txt file
  • Description exported as a PDF
  • Archived entities included
  • DNA sequences exported as GenBank file (.gb)
  • AA sequences exported as a FASTA file (.fasta)

Notebook Entries

  • Exported as either HTML or PDF (see Export options)
  • Attached protocols and sequences included within the .pdf
  • Review history embedded in the entry if that configuration is enabled
  • Metadata exported as either a separate .csv file or within the entry PDF
  • Note: Metadata ONLY includes metadata tracked in Entry Schema fields
  • Attached files (e.g. images, Excel files) exported in the same folder as separate files
  • Archived entries included with [Archived] added to the title

Data NOT included

Projects and Folders

  • Empty and archived projects
  • Folder hierarchies within projects

Entities

  • Non-field metadata (author, folder, etc.)
  • Attachment files in metadata fields
  • DNA Sequences: Visualizations (linear map, plasmid map, etc.)
  • AA Sequences: Biochemical properties

Notebook Entries

  • Entry metadata such as EXP ID, Date Created, and Author Name; this data must be queried with a custom Insights query
  • All schema configurations
  • Drop-down menu options
  • Entry templates
  • Workflows, Inventory, Requests, Insights, and Results that were not captured in Notebook entries

Export options

Let your Benchling representative know your preferences for the following export options, including the file type for entries and any additional information to include:

  • Notebook entries can be exported as either .html or .pdf
  • Notebook entry audit logs can be included as a .csv file with the entry export
  • For .pdf entries, file attachments in each entry can be included inline within each entry .pdf, in addition to being exported separately
  • For .pdf entries, the sequence (GenBank text) of @-mentioned DNA sequences can be embedded within the entry .pdf
  • An entry's schema fields can be included as a separate .csv file with the entry export
  • An entry's schema fields can be included within the entry .pdf as a metadata page 
  • A summary table of the entry's review history can be included at the top of each entry file 
    • The comments associated with each review state change can be included in the review history table
  • The attachments submitted in any structured results tables can be included as separate files in the export

How Benchling protects customer data

Benchling is designed with high availability in mind and data is stored with high redundancy. Our AWS-hosted databases are synchronously replicated across datacenters and files are stored in Amazon S3, which stores files across multiple datacenters and is designed for 99.999999999% durability (11 9s). All data is stored encrypted at rest.

As an additional measure, Benchling performs several layers of backups to ensure data recovery if necessary. Database changelogs are captured along with daily snapshots to allow Benchling engineers to perform point-in-time recovery for any time in the last 35 days. Weekly snapshots are also captured and stored cross region in the event of large scale disasters.

Access to customer data is strictly controlled and audited. Your data is stored on production networks separate from Benchling employees, and access to customer data requires administrative approval and is granted only on an as-needed, temporary basis.
