How do I aggregate data across different organizations?

Aarthi
Aarthi
  • Updated

Aggregating Data Across Organizations

If you want to aggregate data from different organizations, use the UNION ALL operator, which similar to a JOIN, but unlike JOIN, which is the intersection of two tables on a common key on, UNION ALL is a concatenation of data.

SELECT * FROM org1.entry
UNION ALL
SELECT * FROM org2.entry
UNION ALL
SELECT * FROM org3.entry

This query allows you to concatenate 3 SELECT * into 1. The example above is quite simple, and you can easily use in cases where tables have common columns. This is not a problem for operational tables like entry because they are defined by Benchling, so the column names are consistent for every organization.

This query is a bit more challenging in cases where you want to concatenate tables that don’t have the same columns. For example, 2 organizations might use a result schema, but 1 has a “Mass (kg)" field and the other could have a ”Mass (lbs)" field. SQL will not be able to concatenate the 2 columns into 1 because the column names are not consistent. In this case, if you convert data and rename columns headers, you can successfully concatenate data across organizations:

SELECT 'Mass (kg)' FROM (
SELECT 'Mass (lbs)' * 0.453592 as "Mass (kg)"
FROM org1.assay_result
) as normalized_results
UNION ALL
SELECT 'Mass (kg)' FROM org2.assay_result


Additional resources

https://www.w3schools.com/sql/sql_union.asp
https://www.postgresqltutorial.com/postgresql-union/

Was this article helpful?

Have more questions? Submit a request