Out of the Box Transforms

Kyle
Kyle
  • Updated

Purpose

This article serves to guide you through the different "out of the box" transforms steps within Connect. It will be useful for not just non-ASM Data Connectors, but to facilitate integrations of Benchling with analytical instruments, liquid handlers, general automation needs. (we can even use it on top of ASM connectors)

Note: If the file has already been completely formatted, you do not need any of these steps.

Introduction to Transforms 

In order to transform your file into a format that Benchling can process, you can choose to apply transform steps, before the rest of Connect actions will be performed. Any number of transform steps can be added.  

A Transform will:

  • Take a DataFrame as input
  • Modify the Dataframe to the pre-configured instructions (applicable transform step)
  • Save the final state of the DataFrame to a csv

Note: Transforms can be applied to both input AND output file configurations.

To add a Transform step, click on “Set transforms” under “Step 3: Apply transforms to the output file”, or “Step 4: Apply transforms to the input file” You can then select the type of Connect Transform that you need to apply. The following sections describe the types of Connect Transforms that Benchling offers out of the box.

 

 

  • Note: Transform steps will be executed in order. This means that it is vitally important to think through the order of operations that you need to leverage your Transform steps in Benchling to get the desired columns/values. See example below for a final product of multiple transform steps:

 

  • Note: You can upload your original file into the Preview modal so that you are able to track how each transform formats your file! 

 

1. Add Column

The add column transform step will add a new column to the DataFrame with a constant value. You will be prompted to enter:

  1. Column name - The name of the new column
  2. Column value - The constant value that will be added to every row of the DataFrame

2. Arithmetic

  • The arithmetic transform step will add a column with the result of arithmetic between two columns, or one column and a constant. You will be prompted to enter:
    1. Operator - Select whether to add, subtract, multiply, or divide 2 columns
    2. New column name - The name of the new column to be added to the DataFrame.
    3. Value 1/Value 2 Input Type - Determines which value is coming from an existing column in the DataFrame, or if we are adding a new constant value to calculate against.
      1. If you select Constant, you will be prompted to specify the constant value to be used.
      2. If you select Column, you will be prompted to enter the Column name in the DataFrame that we must map to. 

 

3. Duplicate

  • The duplicate transform step will duplicate a specified column and append it to the end of the DataFrame. You will be prompted to enter:
    1. Column to duplicate - The Column name in the DataFrame that we must map to.
    2. New column name - The name of the new duplicate column that will be added to the DataFrame.

 

4. Extra Footers

The Extra Footers transform step allows you to remove rows from the bottom of the file. To remove Extra footers:

  1. Select “Extra Footers” as the Type
  2. Enter the Number of rows to remove

 

5. Extra Headers

  • Use the Extra Headers transform to remove extra rows above the data header row of your csv, or if there are multiple header rows that should be concatenated. When applying Extra Headers, you will be prompted to enter:
    1. The row number of the column header
    2. The row number of the last row containing the column header. 
      • If the column headers are contained in a single row, this field is not necessary. 
      • This number must be greater than the row number of the column header.
      • If this value is greater than the row number of the column header, then all the values in the rows in between these rows will be concatenated, with a space between each input row value, in the first row of each column outputted by this transform.  

 

6. Filter

  • The Filter transform step is used to discard all rows that do not meet the specified filter criteria. This step is used to remove certain rows out of the file to ensure they are not processed in the Connect run. You will be prompted to enter:
    1. Column name - The column in the DataFrame to filter on
    2. Filter type - How to filter the rows out of the DataFrame.
    3. Input type - Determines if the filter criteria is a constant static value, or dynamically updated via lookup
    4. Value/Lookup - The value/lookup function used to determine the filter criteria

 

 

 

7. Find Replace

  • The Find Replace transform step will search the DataFrame for a specified value, and replace it with a new specified value. You will be prompted to enter:
    1. Find any value that… - This is the filter type used for your specified value.
    2. Find type - This determines if the filter type set in the above field will search for a static value, or a dynamic value via Lookup.
    3. Value - This will be the particular filter criteria that you are searching for in the DataFrame (You can use Regex here as well!)
    4. Within - This will be a comma-separated list of columns in the DataFrame that used for the Find Replace function
    5. Replace type - This determines if we will be replacing the found value with a static value, or a value from a lookup
      1. Note: Lookup can only be used once (either as the find type or the replace type) in a single transform step. This means you cannot currently use 2 lookup functions in the same Find Replace step.
    6. Value - This will be the new value that will be used to replace the found value

 7. Merge

 

  • The Merge transform step merges two or more columns together, with data separated by some specified delimiter. The new column will appear at the same position as the first merged column. You will be prompted to enter:
    1. Columns to merge - This will be a comma-separated list of columns in the DataFrame that you plan to merge
    2. Delimiter - This will be the delimiter used to separate values (such as a comma or a semicolon).
      1. Note: Leaving this field blank will merge the values without a separating character
    3. New column name - This will be the name of the newly merged column

 

8. Melt

  • The Melt transform step unpivots a dataframe from wide to long format. You will be prompted to enter:
    1. Identifier column(s) - This will be a comma-separated list of columns in the DataFrame that identify the variable/value combinations
    2. Value columns - This will be a comma-separated list of columns in the DataFrame that we hope to un-pivot. Leave this field blank to use all columns in the DataFrame.
    3. New variable column name - This will be the new name of the column that will hold your variables.
    4. New value column name - This will be the new name of the column that will hold your values.
  • Since this format can sometimes be a little confusing to understand what it means to “un-pivot” a table, take a look at the example starting table below.
    1. Input Table
Sample ID Glucose Glumat Glutamin NH3
5001 6.12 0.52 1.11 1.56

 

  1. But, maybe we want to produce a table that takes each column and converts the measurement/value to its own row, and converts it into a table such as below:
Sample ID Measurement Result
5001 Glucose 6.12
5001 Glumat 0.52
5001 Glutamin 1.11
5001 NH3 1.56

 

  1. In order to do this, we would write the following Melt transform step for this happen:

Screenshot 2024-08-29 at 3.59.10 PM.png

9. Pivot

  • The Pivot transform step pivots data against a list of index (row) columns, a key column, and a value column. All other columns are removed. You will be prompted to enter:
    1. Index (row) column name(s) - This column will be a comma-separated list to use as the index for the pivot. Each unique value in these columns will be converted into a row.
    2. Key column name - This column will become the column headers for the pivoted table
    3. Value column name - This column will populate the values for each column in the pivoted table
  • Since this format can sometimes be a little confusing to understand what it means to “pivot” a table, take a look at the example starting table below.
    1. Input Table
Sample ID Measurement Result
5001 Glucose 6.12
5001 Glumat 0.52
5001 Glutamin 1.11
5001 NH3 1.56



  1. But, maybe we want to produce a table that consolidates all values for sample ID 5001 and convert it to a single row:

 

Sample ID Glucose Glumat Glutamin NH3
5001 6.12 0.52 1.11 1.56

 

  1. In order to do this, we would write the following Pivot transform step for this happen:

Screenshot 2024-08-29 at 4.01.35 PM.png

 

10. Plate Shape

  • The Plate Shape transform step converts data in a plate shape into a two column format (well coordinates and values). You will be prompted to enter:
    1. Plate schema - This will be the specific plate shape this transform step is looking to use
    2. Index of A1 cell: This is the exact cell in the DataFrame that corresponds to the value of well A1 in the plate
  • For example, if your DataFrame had the following format:

And you wanted the table to be processed into a Benchling-ingestible format like below:

 

  • You would need to use the following schema setup in order to get this layout to work as expected:

Screenshot 2024-08-29 at 4.34.42 PM.png

 

11. Remove Quotes

  • The Remove Quotes transform step will remove quotes that enclose an entire cell. This is the only transform step that will not prompt you to enter any additional fields, and will simply remove the front and end quotes from all cells in the DataFrame table.

 

 

12. Split

  • The Split transform step will split an existing DataFrame column into 2 new columns by a specific delimiting character. You will be prompted to enter:
    1. New column names - These will be the names of the 2 new columns that will be generated
    2. Delimiter - This will be the delimiting character that the existing column will be separated by
    3. Column to split - This will be the name of the existing DataFrame column for this step to parse

 

13. Custom

  • The Custom transform step will leverage a custom-built Benchling application to run any custom transformations required for the DataFrame. You will be prompted to enter:
    1. Benchling App ID - This is the identifier of the custom-built Benchling App used to process the DataFrame

 

  • Note: If you are using a custom transform step in your Connect run, it MUST be the first transform step that is added to the run. It can be followed by any number of the standard Transform steps, but the custom step must be the first.
  • Please refer to this doc for more information on creating Custom Transform steps using Connect and Benchling Apps.

Was this article helpful?

Have more questions? Submit a request