DuckDB Rollout

Raven

Overview 

Benchling has upgraded to a newer version of DuckDB, which powers data transforms within Analysis. A small set of transforms may produce outputs that differ slightly from those of the previous version.

 

What does this mean for you?

The outputs of the following transforms differ slightly from the previous version:

  • Window functions: Computed values from window functions such as Row number, Lead, Lag, First, and Last may differ
  • Aggregate table transforms: Outputs from the Text concat aggregation function may appear in a different order
  • Bin data transforms: When binning by a date column, all buckets are now evenly sized
  • Pivot table transforms: Pivoted column display names now include single quotes around values that contain special characters, whitespace, empty strings, or null
  • Find and replace text transforms: Replacement strings that use lookahead, lookbehind, or a trailing backslash now fail with an error. Additionally, when a replacement string references a capture group that does not exist in the search pattern, the original value is unchanged
  • Computed column SPLIT and SUBSTITUTE functions: When a SPLIT or SUBSTITUTE formula operates on a column containing list-type values, the string representation of individual list elements has changed. Elements that previously appeared with single quotes now appear with double quotes. Empty string elements that were previously omitted from the output are now included and quoted
  • Computed column MODE function: When the MODE function is applied to a boolean column, the return type of the output column has changed from Text to Boolean
  • Computed columns and column format transforms that cast text columns in scientific notation to integers: When text columns contain values in decimal scientific notation, the text values are now cast directly to integers without rounding
  • Computed columns and column format transforms that convert lists to strings: When a list is converted to a string, such as with a SUBSTITUTE function, list items containing empty strings or special characters are now included and quoted in the string
  • Computed columns that divide by zero: Computed columns that divide by zero now evaluate to inf (infinity). In the previous version, they evaluated to null
  • Computed column DATECONVERT function: Milliseconds in dates and times are now left-padded with zeros
  • Display names on transforms that produce duplicate column names: Column display names may appear with a colon and a number (:1) instead of an underscore and a number (_1) to disambiguate columns on transforms that result in duplicate column names
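
As an illustration of the divide-by-zero change above: if you export Analysis results and process them in your own scripts, values that used to be null may now arrive as inf. The following is a minimal Python sketch, not part of Benchling, that maps infinite values back to None. It assumes the exported values have been loaded as Python floats or None, and the function name `inf_to_null` is hypothetical.

```python
import math
from typing import Optional


def inf_to_null(value: Optional[float]) -> Optional[float]:
    """Map positive or negative infinity to None, mimicking the previous null result.

    Assumption: the exported value is a Python float or None.
    """
    if value is not None and math.isinf(value):
        return None
    return value


# Hypothetical exported column: a normal ratio, a division by zero, a missing value.
ratios = [0.5, math.inf, None]
cleaned = [inf_to_null(r) for r in ratios]
print(cleaned)  # [0.5, None, None]
```

Finite values and existing nulls pass through unchanged, so the helper can be applied to a whole column.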

These changes only impact Analyses that specifically depend on the changed outputs of the transforms listed above.
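
If you have results exported before the upgrade, one way to check whether an Analysis depends on a changed output is to compare them with current results. As a minimal sketch for the Text concat case, the Python helper below treats two concatenated strings as equal when they contain the same items in any order. The function name `same_items` and the ", " delimiter are assumptions for illustration. Use whatever delimiter your aggregation actually produces.

```python
def same_items(old: str, new: str, sep: str = ", ") -> bool:
    """Return True when two concatenated strings hold the same items in any order.

    Assumption: items are joined with `sep` and do not themselves contain it.
    """
    return sorted(old.split(sep)) == sorted(new.split(sep))


# Same items, different order: only the order changed.
print(same_items("A1, B2, C3", "C3, A1, B2"))  # True
# Different items: a real difference in content.
print(same_items("A1, B2", "A1, B3"))  # False
```

If the check returns True for every row, only the order differs, and the Analysis is affected only if a later step relies on that order.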

 

Contact Us

If you have any questions or concerns about this transition, please don’t hesitate to contact our support team at support@benchling.com or reach out to your sales representative for more details.
