This guide provides an overview of flowcharts within Benchling Workflows. It's primarily intended as a guide to understand our API endpoints, but also provides an overview of flowcharts at Benchling in general since there are a couple core concepts that are important to understand first.
We'll start by defining these core concepts, which will let us build up to how flowcharts work within the Benchling ecosystem.
Workflow task
A workflow task is a concept within Benchling that represents a unit of work. A Benchling task might represent the real-world task of running an assay or checking out some containers.
Each task has fields like an assignee, a scheduled-on date, a status, and configurable "Task fields" (e.g. assay results, which container). A task is schematized - its configurable "Task fields" are based on the schema's configuration. (More on schemas below.)
Workflow output
Optionally, a task may have one or more outputs. This is the data produced by the task. The “Output fields” of this output are also schematized.
Workflow task groups
Tasks are created inside task groups. We might have a task group named "Check out containers", which contains 5 tasks. Each of those tasks has a field denoting the ID of a container.
Said differently, a task group represents the "operation", and each task represents an individual "unit" within that operation.
Instead of having 5 tasks in the example above, we could actually have one task in our task group that just has 5 fields - one for each container. How we choose to model tasks within Benchling is highly dependent on the granularity and auditability required of the workflow.
Schemas
At Benchling, a schematized object has some number of fields, each of which is a standard Benchling type: e.g. a container, entity, string, boolean, number. These types may themselves be schematized (like containers or entities), or they might be primitive (strings, booleans, numbers).
A schema is the configuration of that schematized object. More broadly, a Benchling schema is:
- The specific information for configuring this Benchling object.
- Ex: a container is a Benchling object that is schematized. All containers have a name field, which is a system field and not a custom schema field.
- The custom data types and names of each field for this object
- Ex: we might make a container schema named Refrigerator with a custom string field called Manufacturer. We might make another container schema named Plate with a custom number field called number_of_wells
Once a schema has been configured, we can create multiple instances of that schema. For example, if we had a Benchling entity schema named "Fruit", we might log multiple instances of that schema: apple, orange, pear, etc.
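To make the schema/instance distinction concrete, here is a minimal Python sketch. It is an illustration only, not Benchling's data model or API: `FieldDefinition`, `Schema`, and `instantiate` are hypothetical names, and the `Plate` example reuses the container schema described above.

```python
from dataclasses import dataclass, field


# Hypothetical names for illustration; this is not Benchling's internal model.
@dataclass(frozen=True)
class FieldDefinition:
    name: str
    type: str  # e.g. "string", "number", "boolean", "container", "entity"


@dataclass
class Schema:
    name: str
    fields: list = field(default_factory=list)

    def instantiate(self, name: str, **values) -> dict:
        """Create one instance of this schema, rejecting unknown custom fields."""
        allowed = {f.name for f in self.fields}
        unknown = set(values) - allowed
        if unknown:
            raise ValueError(
                f"Fields not defined on schema {self.name!r}: {sorted(unknown)}"
            )
        # `name` plays the role of a system field; `fields` holds custom fields.
        return {"schema": self.name, "name": name, "fields": values}


plate_schema = Schema("Plate", [FieldDefinition("number_of_wells", "number")])
plate = plate_schema.instantiate("Plate 001", number_of_wells=96)
```

One schema can back any number of instances; each instance may only carry the custom fields its schema defines.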
Workflow Task schemas
A Task schema is the configuration of a task. It includes data like the name of the task, how it is executed, and the data fields for this task. Note that as of this writing, all tasks exist within a task group. Keep this in mind when configuring data fields - there can be multiple instances of these data fields within each task group.
The Task schema has an optional output schema - this is a separate object which represents the task’s output. It also has a set of configurable data fields.
Task execution type
Tasks can be executed in a variety of ways, which is configured as part of its schema. The different task execution types aren't covered in depth here, but here are some examples:
- an Entry execution task is copied into a table within a Benchling entry and executed within that entry
- a Direct execution task can be executed in the UI via a modal
- a Flowchart execution task is a task whose execution happens via a flowchart of other workflow tasks
A workflow task’s schema has different fields depending on the selected execution type.
Flowchart
Often, Benchling users need to represent workflows that are more complicated than single units of work. At a high level, a flowchart represents an overarching workflow composed of multiple units of work (workflow tasks) and how they depend on each other. (In a mathy/formal sense, a flowchart is a directed cyclic graph of nodes - cyclic because we support loops.)
Flowcharts can be configured as part of the Task schema of a flowchart execution task.
An example of a "Purity assay" flowchart linking three tasks together might be:
            (Root Node)
                 |
[Check out container from storage]
                 |
          [Check Purity]
                 |
     <Router: Purity passed>
            /         \
           /           \
         Yes            No
         /               \
  (Output Node)        [Discard]
[]: task node
(): system node
<>: router node
In the above example, we have a flowchart composed of several workflow task nodes in square brackets, along with some non-task nodes. Each task node corresponds to a separate Workflow Task schema.
All flowcharts have a root node. The router and output nodes are added during flowchart configuration.
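As a rough mental model, the example flowchart can be written down as a set of typed nodes plus directed edges. The sketch below is an assumption-laden illustration: the node keys, the `downstream` and `reachable_from` helpers, and the dictionary layout are all made up for this guide and are not how Benchling stores flowcharts.

```python
# Node types mirror the legend above: task [], system (), router <>.
nodes = {
    "root": "system",
    "check_out": "task",
    "check_purity": "task",
    "router": "router",
    "output": "system",
    "discard": "task",
}

# Directed edges, written as (from_node, to_node).
edges = [
    ("root", "check_out"),
    ("check_out", "check_purity"),
    ("check_purity", "router"),
    ("router", "output"),   # "Yes" branch
    ("router", "discard"),  # "No" branch
]


def downstream(node: str) -> list:
    """Return the nodes directly reachable from `node`."""
    return [dst for src, dst in edges if src == node]


def reachable_from(start: str) -> set:
    """Return every node reachable from `start`, including `start` itself.

    The `seen` set keeps the walk finite even if the graph contains loops.
    """
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(downstream(current))
    return seen
```

In this model, every node of a well-formed flowchart should be reachable from the root node.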
To execute a flowchart, we’ll create a new task group for our configured flowchart execution Task schema. Each task within this group will independently progress through the flowchart, creating child tasks as it reaches each task node. We’ll go through this flow end-to-end after covering the rest of our core concepts.
Flowchart Execution Task schema
This is the Workflow Task schema representing our overarching workflow. Setting up a Flowchart Execution Task schema involves configuring the following:
- Task fields: These are the schematized data fields for this task. We can think of this as the input to each flowchart task - in the example above they might be the containers on which we'll perform the assay.
- Output fields: The schematized data fields for the output of each flowchart task. In the above example, that might be the containers which passed the purity test.
- Task schemas: This is a list of constituent tasks that our flowchart links together.
- Transitions: This is how data maps from one schema to another. We might map data from the output schema of one node to the Task schema of another. We could also map directly from the upstream Task schema to the downstream Task schema.
  - Note: When configuring transitions, selecting the flowchart Task schema as a source means we are mapping from the task fields on the flowchart task itself. (We are mapping from the root node.)
  - Selecting the flowchart Task schema as the destination means we are mapping to the flowchart task's output fields. (We are mapping to the output node.)
- Flowchart: Finally, we can configure the graph of nodes that represents the flowchart. The valid edges are determined by the transitions we configured. This is also where we can add additional nodes like:
  - Router nodes: A conditional branch point. A router can evaluate basic logic based on upstream data. Based on the results of these evaluations, one or more downstream edges will be traversed.
  - End nodes: These can only branch off of a router node. They indicate that the overall operation should end here.
Note: configuring a flowchart is not required. If there is no template flowchart, then the user will have to create a flowchart themselves when creating the flowchart task group. It is also possible to allow users to modify a flowchart task group's flowchart after creation.
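A transition, as described in the configuration list above, can be pictured as a field-to-field mapping between two schemas. The sketch below is illustrative only: `apply_transition`, the mapping format, and all field names and values are hypothetical and do not reflect how Benchling stores or evaluates transitions.

```python
def apply_transition(source_fields: dict, mapping: dict) -> dict:
    """Build destination fields from source fields.

    `mapping` is {destination_field_name: source_field_name}.
    """
    missing = [src for src in mapping.values() if src not in source_fields]
    if missing:
        raise KeyError(f"Source is missing mapped fields: {missing}")
    return {dest: source_fields[src] for dest, src in mapping.items()}


# Task fields on the flowchart task itself (the root node); values are made up.
root_task_fields = {"container": "container-1", "lot_number": "lot-A"}

# Hypothetical transition from the root node to a downstream Task schema.
root_to_check_out = {"container_to_check_out": "container", "lot": "lot_number"}

check_out_task_fields = apply_transition(root_task_fields, root_to_check_out)
```

Using the flowchart Task schema as the source corresponds to reading from the root node's task fields, as in this example; using it as the destination would correspond to writing the flowchart task's output fields.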
Flowchart task lineage
A common point of confusion is the instance of a flowchart task vs the flowchart itself. In this document, "flowchart" refers to the configuration of the flowchart, i.e. the nodes and edges, whereas a flowchart task refers to the instance of a task that flows through a flowchart.
A flowchart task is just a workflow task with a Flowchart execution type. It's also called a rootTask in the API.
If you are using the API and you want to know how the execution of a specific flowchart task group is progressing, you’ll want to query the task groups within that flowchart, not the schema of the flowchart itself.
Putting it all together: what is a flowchart?
A flowchart is a graph of nodes that represent a multi-step process. Data starts at the root node, and flows along the configured edges as each task is executed. Every task node represents a potential task group, which is composed of multiple tasks. Which edge is traversed can be determined by routers, which are placed along edges.
Let's run through the above flowchart example and how to execute it:
- We'll start by creating a flowchart task group for the schema named "Purity assay". We'll be testing three containers, so we create three tasks. For each task, we have to fill out two schema fields: the container ID and the lot number.
- After hitting create, we'll see an execution flowchart indicating 3 in-progress tasks at the (Root Node). In the background, Benchling will do some work to create a task group at the first downstream node, titled [Check out container from storage], utilizing the transition (aka data mapping) we configured as part of the flowchart Task schema.
  - In this case, we configured a transition from the (Root Node)'s task fields to the task fields of [Check out container from storage]. We'll create a new task group with three tasks. Once Benchling completes this mapping, the downstream node is clickable and shows three pending tasks.
- We'll execute the [Check out container from storage] tasks. For our purposes, this task has no outputs, but each task has a status which we can mark as completed once we've checked the containers out.
- As soon as the first task within [Check out container from storage] is completed, Benchling will create a new task group at the node [Check Purity] with a new task whose data is populated based on the transition we configured between these two schemas. Because the upstream task has no outputs, we configured a mapping directly from the upstream task fields to the downstream task fields. We'll execute all the tasks within the [Check out container from storage] task group so that we get 3 tasks at the downstream [Check Purity] task group.
- Let's execute [Check Purity] as well. This task does have outputs: a container id, lot number, and purity decimal. After executing these 3 tasks we'll produce 3 outputs.
- In order to keep traversing our flowchart, we must now go through the router. The router checks if the purity is > 0.8. If yes, we will route to the output node; otherwise we will route to the [Discard] task node. Let's say we have two tasks that pass this check and one that doesn't.
  - Starting with the left-hand edge: we configured a mapping from [Check Purity] to (Output Node) that creates outputs for our flowchart task based on the container id, lot number, and purity from the upstream task. We'll create two outputs for the flowchart task.
  - On the right-hand edge, we configured a mapping from [Check Purity] to [Discard] that contains just the container id and lot number. We'll create one task for this task group whose fields contain that data.
- At this point, two tasks in the flowchart are complete. There is one flowchart task in progress because it's still waiting at the [Discard] node.
- Finally, we complete the [Discard] task. It has no outputs, but we'll set the status to complete. All tasks in this flowchart task group are now complete.
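The routing step of this walkthrough can be simulated in a few lines of Python. This is a toy model under stated assumptions, not Benchling's execution engine: the `route` function, the field names, and the purity values are made up, and only the `> 0.8` threshold and the two mappings come from the walkthrough.

```python
PURITY_THRESHOLD = 0.8  # the router condition from the walkthrough


def route(purity: float) -> str:
    """Return the branch a [Check Purity] output takes at the router."""
    return "output" if purity > PURITY_THRESHOLD else "discard"


# Made-up [Check Purity] outputs for the three containers.
check_purity_outputs = [
    {"container_id": "container-1", "lot_number": "lot-A", "purity": 0.95},
    {"container_id": "container-2", "lot_number": "lot-A", "purity": 0.88},
    {"container_id": "container-3", "lot_number": "lot-B", "purity": 0.42},
]

flowchart_outputs = []  # created at (Output Node)
discard_tasks = []      # created at [Discard]

for result in check_purity_outputs:
    if route(result["purity"]) == "output":
        # Left-hand edge: container id, lot number, and purity are all mapped.
        flowchart_outputs.append(dict(result))
    else:
        # Right-hand edge: only container id and lot number are mapped.
        discard_tasks.append(
            {
                "container_id": result["container_id"],
                "lot_number": result["lot_number"],
            }
        )
```

With these values, two flowchart tasks finish at the output node and one waits at [Discard], matching the walkthrough.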
API glossary
This section explains the fields returned by the Benchling API that are relevant to navigating flowcharts. See the API documentation for a full field-level breakdown, and see above for core flowchart concepts.
Workflow tasks
- rootTask: If present, this is the flowchart task, or the task created at the root node when a flowchart task group is created. All other tasks in a flowchart have the same root task and are not themselves flowchart tasks.
- executionFlowchartId: Only present if this is a flowchart task. This is the ID of the flowchart configuration that this flowchart task is associated with. You can query the workflow-flowcharts endpoint with this ID.
- executionType: This is how one actually executes the task.
- sourceTasks: A list of tasks that were used to create this task. For Node3 below, sourceTasks would include information about Node1 and Node2:
  [Node1]   [Node2]
       \     /
       [Node3]
  - This is not a list of upstream nodes to the root of the flowchart.
  - This will only include tasks whose data were used to create this task. If an output for an upstream task is used in the mapping instead, that will be available in sourceOutputs.
- sourceOutputs: See sourceTasks. This is a list of outputs along the immediate upstream edges for this task that were used to create this task.
- nextTasks: The inverse of sourceTasks. This is a list of immediate downstream tasks that were created using the current task.
Workflow outputs
- workflowTask: Info about the task instance that this output is associated with.
- workflowTaskGroup: Info about the task group instance that this output's task is associated with.
- nextTasks: See sourceTasks under Workflow tasks. This is a list of tasks that were created from this output.
Workflow task groups
- nodeConfigId: If this task group is associated with a flowchart task node, the ID of that node config. See Workflow flowcharts below for more info on node configs.
- flowchartTaskGroups: Only present if this is the flowchart task group (the root task group). A list of the other task groups that have been created as part of this flowchart task.
  - Note: task groups are only created when the first task reaches that node.
- workflowTasks: The tasks within this flowchart. Query the Workflow tasks endpoint for more info.
  - Note: this list can change over time as more tasks are added to the group. This can happen if tasks are in the process of getting created, or if a task is manually added ad hoc.
Workflow flowcharts
Note: this is the configuration of a flowchart, not the instance of a flowchart task.
- nodeConfigs: Every node in a flowchart has type-specific data. See the description of a flowchart above for more context. These node types are:
  - ROOT: This is the starting node of the flowchart. The Task schema for the flowchart task is serialized.
  - OUTPUT: If present, this is the output node of the flowchart (the output schema associated with the flowchart task). The output schema is serialized.
  - TASK: Any node in the flowchart that represents a Task schema. The schema of that task is serialized.
  - ROUTER: A router node in the flowchart. When an edge passes through a router, the routerFunctions configured as part of that router are evaluated to determine which edge(s) will actually be traversed. The name of the router, its routerFunction and its edge config ID are serialized, but the actual logic of each router function is not available via the API. The default router function is labeled as such: there is no logic associated with this function, but the edge associated with the default function is traversed if no other router function passes.
  - END: An end node. This signifies that the flowchart task should complete. The name of the end node is serialized.
- edgeConfigs: Some basic info about each edge. It contains a from node config ID and a to node config ID.
- migratedFromFlowchartId: Due to a backend data migration, some older flowcharts were migrated by Benchling to a newer version. This is the ID of the original flowchart. This is surfaced for traceability/audit purposes.
- migratedToFlowchartId: See migratedFromFlowchartId above. This is the ID of the migrated flowchart.
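Given a flowchart payload, nodeConfigs and edgeConfigs together are enough to rebuild the graph. The sketch below operates on a hand-written dictionary, not a real response. The key names `id`, `nodeType`, `fromNodeConfigId`, and `toNodeConfigId` are assumptions about the payload shape and should be checked against the API documentation; `build_adjacency` and `nodes_after` are hypothetical helpers.

```python
from collections import defaultdict


def build_adjacency(flowchart: dict) -> dict:
    """Map each node config id to the ids of its immediate downstream nodes."""
    adjacency = defaultdict(list)
    for edge in flowchart.get("edgeConfigs", []):
        adjacency[edge["fromNodeConfigId"]].append(edge["toNodeConfigId"])
    return dict(adjacency)


def nodes_after(flowchart: dict, node_config_id: str) -> list:
    """Return (id, nodeType) pairs for nodes directly downstream of a node."""
    types = {n["id"]: n["nodeType"] for n in flowchart.get("nodeConfigs", [])}
    downstream = build_adjacency(flowchart).get(node_config_id, [])
    return [(node_id, types.get(node_id)) for node_id in downstream]


# Hand-written example payload; ids and key names are illustrative assumptions.
flowchart = {
    "nodeConfigs": [
        {"id": "node-root", "nodeType": "ROOT"},
        {"id": "node-purity", "nodeType": "TASK"},
        {"id": "node-router", "nodeType": "ROUTER"},
        {"id": "node-output", "nodeType": "OUTPUT"},
        {"id": "node-end", "nodeType": "END"},
    ],
    "edgeConfigs": [
        {"fromNodeConfigId": "node-root", "toNodeConfigId": "node-purity"},
        {"fromNodeConfigId": "node-purity", "toNodeConfigId": "node-router"},
        {"fromNodeConfigId": "node-router", "toNodeConfigId": "node-output"},
        {"fromNodeConfigId": "node-router", "toNodeConfigId": "node-end"},
    ],
}

after_router = nodes_after(flowchart, "node-router")
```

A task group's nodeConfigId can then be matched against these node config IDs to locate that group within the graph.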
Workflow flowchart config versions
Configuring a flowchart as part of flowchart Task schema configuration creates a flowchart config version. Updating that flowchart afterwards creates a new config version for that flowchart.
- templateFlowchart: If this config version has a template flowchart, that is serialized in the same way as flowcharts are serialized in our API. There may be no template flowchart if the flowchart schema was configured with no flowchart. This means that the user builds the flowchart themselves as part of creating the flowchart task group. If there is no template flowchart, the root task has an executionFlowchartId, which can be used to query a flowchart via the /workflow-flowcharts API.
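The fallback described for templateFlowchart can be expressed as a small decision helper. This is a sketch over plain dictionaries under the assumption that a missing template shows up as an absent or null `templateFlowchart` key; `resolve_flowchart` and its return format are hypothetical, and no API call is made here.

```python
def resolve_flowchart(config_version: dict, root_task: dict) -> dict:
    """Decide where the flowchart definition for a flowchart task comes from."""
    template = config_version.get("templateFlowchart")
    if template is not None:
        # The config version already carries the serialized flowchart.
        return {"source": "templateFlowchart", "flowchart": template}
    # No template: the user built the flowchart when creating the task group,
    # so it has to be looked up by the root task's executionFlowchartId.
    return {
        "source": "workflow-flowcharts",
        "flowchartId": root_task.get("executionFlowchartId"),
    }


# Made-up payloads for the two cases.
with_template = resolve_flowchart(
    {"templateFlowchart": {"nodeConfigs": [], "edgeConfigs": []}},
    {"executionFlowchartId": "flowchart-123"},
)
without_template = resolve_flowchart(
    {"templateFlowchart": None},
    {"executionFlowchartId": "flowchart-123"},
)
```

In the second case, the returned ID is what would be passed to the /workflow-flowcharts API.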