Aggregating Data Across Organizations
If you want to aggregate data from different organizations, use the UNION ALL
operator, which similar to a JOIN
, but unlike JOIN
, which is the intersection of two tables on a common key on
, UNION ALL
is a concatenation of data.
SELECT * FROM org1.entry
UNION ALL
SELECT * FROM org2.entry
UNION ALL
SELECT * FROM org3.entry
This query allows you to concatenate 3 SELECT *
into 1. The example above is quite simple, and you can easily use in cases where tables have common columns. This is not a problem for operational tables like entry
because they are defined by Benchling, so the column names are consistent for every organization.
This query is a bit more challenging in cases where you want to concatenate tables that don’t have the same columns. For example, 2 organizations might use a result schema, but 1 has a “Mass (kg)" field and the other could have a ”Mass (lbs)" field. SQL will not be able to concatenate the 2 columns into 1 because the column names are not consistent. In this case, if you convert data and rename columns headers, you can successfully concatenate data across organizations:
SELECT 'Mass (kg)' FROM (
SELECT 'Mass (lbs)' * 0.453592 as "Mass (kg)"
FROM org1.assay_result
) as normalized_results
UNION ALL
SELECT 'Mass (kg)' FROM org2.assay_result
Additional resources
https://www.w3schools.com/sql/sql_union.asp
https://www.postgresqltutorial.com/postgresql-union/