Benchling supports amino acid (AA) sequences, which represent protein chains. It provides tools for creating, sharing, and analyzing AA sequences, as well as predicting their structures. This guide will help you handle your AA sequence data effectively, from import to analysis.
Create single AA sequences
Benchling supports several ways to import AA sequences. You can upload files in a supported format (FASTA, GenPept, UniProt XML, or Vector NTI .pa4) or import from a database using accession numbers. Upload GenPept or UniProt XML files if you want to keep track of annotations. These methods are recommended for importing a small number of AA sequences.
Import AA sequences into Benchling:
- Click Global Create in the main Benchling toolbar, navigate to AA Sequence, and choose Import AA sequences from the dropdown menu
- Select the folder and project on Benchling where you would like your AA sequence to be saved
- Drag and drop a file or click the Choose files button to locate the file on your computer
- To upload AA sequences directly from an external database, navigate to the Import From Database tab and type in the accession number of the AA sequence
Create multiple AA sequences from spreadsheet
Benchling lets you create and register many amino acid sequences at once, which is particularly useful for larger datasets. Upload files, as described above, when your goal is to upload many sequences and register them later. However, if you want to bring in registry metadata and fields as you create the sequence entities, use the spreadsheet upload; you can then use “reimporting” to bring in the sequence features (for example, annotations) in bulk.
Prepare a spreadsheet for AA import
For a successful import, the spreadsheet must contain the following columns:
- Name (Required) - The values in this column must exactly match the names of your AA sequence files from SnapGene, Geneious, or other software
- Amino Acids (Required) - If you plan to import annotations and other sequence features, keep this blank for now; the sequence maps can be reimported in a later step
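If you are building the spreadsheet from a folder of exported sequence files, a short script can generate the two required columns. The Python sketch below is one possible approach, not a Benchling tool: the function name `build_import_csv`, the example file names, and the assumption that the Name value is the file name without its extension are all illustrative. Set `keep_extension=True` if your names include the extension.

```python
import csv
import io
from pathlib import Path


def build_import_csv(file_names, sequences=None, keep_extension=False):
    """Build CSV text with the two required columns, 'Name' and 'Amino Acids'.

    file_names: paths of the sequence files you plan to reimport later.
    sequences:  optional {name: amino_acid_string}; names that are missing
                get a blank 'Amino Acids' cell, as suggested for reimport.
    """
    sequences = sequences or {}
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Name", "Amino Acids"])
    for file_name in file_names:
        path = Path(file_name)
        # Assumption: the entity name is the file name without its extension.
        name = path.name if keep_extension else path.stem
        writer.writerow([name, sequences.get(name, "")])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    print(build_import_csv(["exports/mAb_heavy.gp", "exports/mAb_light.gp"]))
```

The resulting text can be saved as a .csv file or pasted into the raw text box during import.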
- Click Global Create, navigate to AA Sequence, and select Import AA sequences from spreadsheet from the menu
- Select a project folder where you want the sequences to be imported and click Next
- Add your spreadsheet. You can either import the full spreadsheet or copy and paste the fields from your spreadsheet template into the raw text box. Make sure that you include the column names
- Once this is done, select Next
- In the next menu, make sure that the column names pulled from your spreadsheet match the column type fields. Once they're matched, press Next
- Benchling will now check for errors. If no errors occur, press Import
- If you encounter any errors, click Back to return to the previous page and make changes
- Your entities will now be viewable in the Registry
Reimport
Reimporting is useful when you want to bring in sequence features (such as annotations) for sequences that have already been imported and registered in Benchling.
- Within the Registry, click the local create button and select Reimport AA Sequences
- In this menu, drag and drop the sequence files. Each file name must be identical to the name of the corresponding entity in Benchling so that the file can be matched with the entities you created earlier
- Once you add all the files, select Reimport
- Each sequence file should contain only one sequence, as Benchling will generate an error if multiple sequences are found in a single file
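If your sequences are stored together in a multi-record FASTA file, you can split them into one file per sequence before reimporting. The following Python sketch shows one way to do this. The helper names `parse_fasta` and `split_fasta` are hypothetical, and the sketch assumes that the first word of each FASTA header is the name of the matching Benchling entity.

```python
from pathlib import Path


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_fasta(text, out_dir):
    """Write one single-record FASTA file per sequence and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for header, sequence in parse_fasta(text):
        if not header:
            raise ValueError("FASTA record has an empty header")
        # Assumption: the first word of the header is the entity name.
        name = header.split()[0]
        path = out_dir / f"{name}.fasta"
        path.write_text(f">{header}\n{sequence}\n")
        written.append(path)
    return written
```

For example, `split_fasta(Path("all_chains.fasta").read_text(), "single_records")` would write one file per record into a `single_records` folder; both names are placeholders.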
Now that you have imported your sequences, you can analyze their properties.
Analyze AA sequence properties
Benchling provides tools to analyze various properties of your AA sequences, aiding in characterization and research.
To analyze AA sequence properties:
- Open the AA sequence of interest
- Navigate to the Biochemical Properties tab within the sequence viewer
- View the automatically calculated characteristics
- If you highlight a specific portion of the sequence on the sequence map, this tab shows the calculated information for just that portion
Flexible visualizations on the sequence map can also provide additional information. You can change the coloring scheme of the AA sequence and toggle on secondary graphs, such as hydrophobicity and isoelectric point.
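To illustrate what a hydrophobicity graph represents, the Python sketch below computes a sliding-window hydropathy profile. It uses the published Kyte-Doolittle scale and a window of 9 residues as assumptions; this guide does not state which scale or window Benchling uses, so the values may differ from what you see on the sequence map. The function names are illustrative.

```python
# Kyte-Doolittle hydropathy values; positive means more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of every full window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def gravy(sequence):
    """Return the grand average of hydropathy over the whole sequence."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)
```

Plotting the list returned by `hydropathy_profile` against the window start position gives a curve comparable in spirit to a hydrophobicity graph; residues outside the 20 standard amino acids raise a `KeyError` in this sketch.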
To access these display features, click the Gear icon in the upper-right corner of the sequence map:
- Toggle on the features you’d like displayed
- Some visualization settings come with secondary settings as indicated by a gear icon within the toggle menu
- To change the coloring scheme, click the secondary Gear icon next to coloring scheme. A new window will pop up where you can use the dropdown to choose the coloring scheme you would like
- The default coloring scheme in Benchling is RasMol. Benchling supports four coloring schemes:
  - RasMol: This coloring scheme is traditional and groups amino acids by property. In general, polar residues are brighter colors, while non-polar residues appear more subdued
  - Hydrophobicity: The residues are colored on a spectrum from red to blue, where red means hydrophobic and blue means hydrophilic
  - Polarity: Colors indicate specific polarities:
    - Red: polar acidic
    - Yellow: non-polar
    - Green: polar uncharged
    - Blue: polar basic
  - None: You can also turn off the coloring scheme and have the residues displayed in black and white
- Click Save to apply the new coloring scheme to the sequence
Create back translations
If you need a DNA sequence for an AA sequence, Benchling can generate a back-translated sequence that matches your desired constraints. To create a back translation, follow the steps below:
- Select the region of the AA sequence you wish to back translate and right click on the highlighted region
- From the menu that opens, click Back translate. This opens the Codon optimization tab
- If you don’t need to set any parameters for optimization, click Preview optimization
- To save the sequence, click Save as new sequence
- Select the project and folder that you want to save the sequence in and click Select
For more information about the parameters you can set in the Codon optimization tab and how to set them, review the linked article.
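As a rough illustration of what back translation does, the Python sketch below maps each residue to a single fixed codon from the standard genetic code. This is not Benchling's codon optimization: the codon chosen for each residue in `FIXED_CODON` is an arbitrary valid choice made for this example, and no organism-specific codon usage or other constraints are applied.

```python
# One valid codon per residue under the standard genetic code.
# The specific choice per residue is arbitrary and for illustration only.
FIXED_CODON = {
    "A": "GCC", "R": "CGT", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TAA",  # stop
}


def back_translate(protein):
    """Return a DNA sequence that encodes the given AA sequence."""
    protein = protein.upper()
    unknown = sorted(set(protein) - set(FIXED_CODON))
    if unknown:
        raise ValueError(f"unsupported residues: {', '.join(unknown)}")
    return "".join(FIXED_CODON[residue] for residue in protein)


if __name__ == "__main__":
    print(back_translate("MKT"))  # ATGAAAACC
```

Because most amino acids are encoded by several codons, many different DNA sequences encode the same protein; choosing among them according to your constraints is the job of the Codon optimization tab.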
Share and export AA sequences
Export AA sequences
You can export AA sequences from Benchling for external analysis or record-keeping.
To export an individual AA sequence:
- Open the AA sequence file of interest in Benchling
- Click on the Information icon in the molecular biology toolbar
- Under "Export Data," choose whether you would like to export the AA sequence in the FASTA (.fasta) or GenPept (.gp) format
- Click Download
Share sequences
Collaborate effectively by sharing sequence files with colleagues or external partners directly from Benchling.
To share a sequence file:
- Click on the sequence file that you’d like to share, click the Share button on the top right corner, and copy the Read-only access link.
- Send the link to the person you wish to share this sequence with.
- Anyone with the link will now be able to view your sequence (provided they have an account on the tenant), but will not be able to edit or analyze the sequence
- You can also turn off link sharing by clicking the Turn off link sharing button.
Predict AA sequence structures with AlphaFold2, Chai-1, and Boltz-2
Benchling has pre-built integrations with multiple structure prediction models, including AlphaFold2, Chai-1, and Boltz-2, which provide valuable insights into the function and behavior of proteins. Scientists can use these tools in their scientific workflows within the Molecular Biology application. To have 3D Structure Prediction enabled on your tenant, reach out to Benchling support at support@benchling.com.
To predict an AA sequence structure:
- Open the AA sequence for which you want to predict the structure
- Navigate to the 3D Structure tab of the AA Sequence
- Choose the model you would like to use: AlphaFold2, Chai-1, or Boltz-2
- Multiple sequence alignments (MSAs) increase accuracy by providing evolutionary context to the structure prediction model, but add time to the prediction. AlphaFold2 uses an MSA. Support for MSAs with Chai-1 and Boltz-2 is coming soon.
- If you choose Chai-1 or Boltz-2, an option to choose additional AA sequences will appear. This allows for multimer predictions of protein-protein complexes, antibodies, and more. Use the selector to make copies of your current protein or add any other AA sequence saved on Benchling.
- Click Predict 3D structure
- The prediction will take some time to run
After you submit your request, Benchling processes the prediction and sends you an email notification, with a link to the prediction in Benchling, when it is complete. If a prediction fails, you will be notified by email as well.
To view and interact with the predicted structure:
- Navigate back to the 3D Structure tab
- If you highlight a region of the AA sequence, you can see the corresponding region highlighted in lime green in the structure
- You can also select Download to download the PDB file or click Share to generate a share link to your AA Sequence and its predicted structure
- If you change the residues in your AA sequence, your structure will be out of sync and a yellow banner will appear
- To replace the existing structure with a new predicted structure, click Redo structure prediction. Redoing the prediction discards the out-of-sync structure, so we recommend that you download the original structure first if you’d like to compare versions
Interact with predictions
Structures are rendered in Mol* Viewer; see the Mol* documentation to learn more about interacting with the structure. Benchling returns only the top-ranked structure (“rank 0”) predicted by the model.
Predictions are returned with pLDDT scores, which convey the model’s confidence. Regions of the structure are color-coded according to the model confidence; Benchling displays the same scoring and coloring as the AlphaFold database.
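If you want to inspect confidence outside Benchling, you can read per-residue scores from a downloaded PDB file. The Python sketch below assumes the file follows the AlphaFold convention of storing pLDDT in the B-factor column of each ATOM record; this guide does not confirm that layout for Benchling downloads, so check your file first. The function names are illustrative, and the bands follow the AlphaFold database thresholds (90, 70, and 50), with boundary values assigned to the lower band as a choice made for this example.

```python
def plddt_by_residue(pdb_text):
    """Return {(chain, residue_number): pLDDT} from alpha-carbon ATOM records.

    Assumption: pLDDT is stored in the B-factor field (columns 61-66).
    """
    scores = {}
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name 13-16, chain 22, residue 23-26.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            residue_number = int(line[22:26])
            scores[(chain, residue_number)] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Map a pLDDT score to a confidence band."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "high"
    if plddt > 50:
        return "low"
    return "very low"


def low_confidence_residues(pdb_text, threshold=70.0):
    """Return residue keys whose pLDDT is at or below the threshold."""
    return [
        key for key, score in plddt_by_residue(pdb_text).items()
        if score <= threshold
    ]
```

Calling `low_confidence_residues` on the text of a downloaded file gives a quick list of regions to treat with caution when interpreting the structure.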
3D Structure Prediction access
- All customers may enable the AlphaFold feature on one tenant
- All users on a tenant where AlphaFold is enabled will have the ability to request AlphaFold predictions
- AlphaFold is not available for Benchling Validated Cloud
- You may only request structures if you have WRITE access to the sequence
Limitations on predictions
- Benchling supports predictions of AA sequences up to 1500 residues
- If your request exceeds 1500 residues, a caution banner will display
- A Benchling tenant will be limited to 100 predictions per month
- A single Academic user is limited to 100 predictions per year
- Note: only verified academic users (users whose accounts are associated with institution email addresses and whose email addresses have been verified) will be able to access AlphaFold
- On the 3D structure tab, users can see how many prediction requests they or their tenant have remaining
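Because each tenant has a limited number of predictions, it can help to check sequence lengths before submitting a request. The Python sketch below flags sequences that exceed the 1500-residue limit stated above. The function name and the example data are hypothetical, and the sketch checks each sequence individually; this guide does not say how the limit applies to multimer predictions.

```python
MAX_RESIDUES = 1500  # limit stated in this guide


def over_limit(sequences, max_residues=MAX_RESIDUES):
    """Return the names of sequences longer than the residue limit.

    sequences: {name: amino_acid_string}. Only letters are counted, so
    whitespace and stop symbols such as '*' are ignored.
    """
    return [
        name
        for name, residues in sequences.items()
        if sum(ch.isalpha() for ch in residues) > max_residues
    ]


if __name__ == "__main__":
    # Hypothetical sequences, used only for illustration.
    candidates = {"short_chain": "M" * 1500, "long_chain": "M" * 1501}
    print(over_limit(candidates))
```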
Generate AA Sequences with BoltzGen
BoltzGen is a new generative model for designing proteins and peptides of any modality to bind a wide range of biomolecular targets. Currently, an early preview of BoltzGen is available on AA sequences, using your AA sequence as a target. To have Generative Models enabled on your tenant, reach out to Benchling support at support@benchling.com.
BoltzGen in Benchling lets you design and visualize binders of customizable length against any AA Sequence. Soon, we will also support the full suite of functionality, including additional target modalities and the ability to save generated structures back to Benchling.
- Navigate to any AA sequence, and click the Generative Models tab.
- If you don't see the tab, check the three dot overflow menu, as the tab may be hidden
- Enter the minimum and maximum length of binders you would like to generate.
- Using your current AA sequence as the target, BoltzGen will generate new sequences between the provided min and max lengths.
- Preview the generated structures with the viewer, or click Download to see them all.
Limitations on Generative Models
At this time, the visual output of BoltzGen is limited to the top-ranked binder. Use the Download button to download additional structures in .cif format for visualization. These .cif files can be uploaded to an AA Sequence's Description tab or an Entry for further visualization. Additionally, the current implementation of BoltzGen only supports AA sequences as targets.