When fitting a regression line to a set of data, outlier data points can distort the fit and lead to misleading results.
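To use outlier detection on an Insights dashboard block, follow the steps outlined below. Note that outlier detection is subject to error: it can flag valid data points or miss true outliers, so review the flagged points before relying on the results.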
1. Select the “Scatter plot” chart type and specify the appropriate x- and y-axes.
2. Under the “Analysis” tab, choose “4-parameter logistic” as the regression line type.
   Note: Outlier detection is currently available only for logistic regressions, not linear regressions.
3. Select either “Detect” or “Detect and exclude” in the “Outliers” dropdown.
   - Detect: Detects outlier data points and marks them with a red X. Regressions, aggregations, and error values are computed using the full dataset, including any outliers.
   - Detect and exclude: Detects outlier data points and marks them with a semi-transparent red X. These points are excluded from the regression, aggregation, and error bar calculations, which are automatically recalculated once the outliers are excluded.
4. Customize the sensitivity of outlier detection by adjusting the slider below the “Outliers” dropdown. Drag the slider to the left to flag fewer outliers (less aggressive detection) or to the right to flag more outliers (more aggressive detection).
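The difference between the two modes can be sketched in a few lines of Python. This is only an illustration, not the dashboard's actual code: the function name `summarize`, the mode strings, and the choice of standard error of the mean as the error value are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values, is_outlier, mode):
    """Return (mean, standard error) for one group of replicate values.

    `mode` is a hypothetical string: "detect" keeps every point in the
    calculation, while "detect_and_exclude" drops flagged points first.
    """
    if mode == "detect_and_exclude":
        kept = [v for v, flagged in zip(values, is_outlier) if not flagged]
    else:
        kept = list(values)
    if not kept:
        return None, None  # nothing left to aggregate
    # Standard error of the mean; assumed here as the error-bar statistic.
    sem = stdev(kept) / sqrt(len(kept)) if len(kept) > 1 else 0.0
    return mean(kept), sem


values = [10.0, 11.0, 9.0, 30.0]
flags = [False, False, False, True]  # the last point was marked as an outlier

full_mean, full_sem = summarize(values, flags, "detect")
trimmed_mean, trimmed_sem = summarize(values, flags, "detect_and_exclude")
print(full_mean, trimmed_mean)
```

In this made-up dataset the single flagged point pulls the mean from 10.0 up to 15.0 and widens the error value, which is the kind of distortion "Detect and exclude" is meant to remove.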
The outlier detection algorithm for logistic regressions is based on the ROUT method; its details can be found in this paper.
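For intuition, the sketch below shows a heavily simplified, ROUT-inspired check in Python. It is not the product's implementation and not the full ROUT procedure: it assumes the 4-parameter logistic curve has already been fitted (ROUT itself uses a robust fit), and it replaces ROUT's false-discovery-rate test with a plain cutoff on scaled residuals. The names `four_pl`, `flag_outliers`, and `cutoff` are hypothetical, and the robust scale estimate (the 68.27th percentile of absolute residuals, scaled by N/(N-K)) follows the ROUT description as best understood here.

```python
import math


def four_pl(x, bottom, top, ec50, hill):
    """One common 4-parameter logistic parameterization (requires x > 0)."""
    return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)


def percentile(sorted_vals, p):
    """Linearly interpolated percentile (p in 0..100) of a sorted list."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    rank = (p / 100.0) * (len(sorted_vals) - 1)
    lo, hi = math.floor(rank), math.ceil(rank)
    return sorted_vals[lo] + (rank - lo) * (sorted_vals[hi] - sorted_vals[lo])


def flag_outliers(xs, ys, params, cutoff=3.0):
    """Flag points whose residual is large relative to a robust spread.

    `params` are assumed, already-fitted (bottom, top, ec50, hill) values.
    `cutoff` stands in for the sensitivity slider: a lower value flags
    more points. This mapping is an assumption for illustration only.
    """
    n, k = len(xs), len(params)
    if n <= k:
        raise ValueError("need more data points than curve parameters")
    residuals = [y - four_pl(x, *params) for x, y in zip(xs, ys)]
    # Robust standard deviation of the residuals: 68.27th percentile of
    # the absolute residuals, scaled by n / (n - k).
    rsdr = percentile(sorted(abs(r) for r in residuals), 68.27) * n / (n - k)
    if rsdr == 0:
        return [False] * n
    return [abs(r) / rsdr > cutoff for r in residuals]


params = (0.0, 100.0, 10.0, 1.0)  # made-up fitted parameters
xs = [1, 2, 5, 10, 20, 50, 100, 200]
noise = [0.5, -0.4, 0.3, -0.6, 25.0, 0.2, -0.3, 0.4]  # one gross error
ys = [four_pl(x, *params) + e for x, e in zip(xs, noise)]

flags = flag_outliers(xs, ys, params)
print(flags)
```

Because the spread is estimated from a percentile rather than from the ordinary standard deviation, the single gross error at x = 20 does not inflate the scale enough to hide itself, which is the main idea the ROUT method builds on.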