Merged
4 changes: 2 additions & 2 deletions docs/cloud/general/security-and-privacy.mdx
@@ -47,9 +47,9 @@ The collected information is detailed in the table below.

You can see all the data that Elementary collects and stores in your local Elementary schema.

In general, Elementary does not collect any raw data. The only exception is the failed rows sample (stored in table `test_results_samples`) which can be disabled.
In general, Elementary does not collect any raw data. The only exception is the failed rows sample (stored in table `test_result_rows`) which can be disabled.
This is an opt-out feature that shows a sample of a few raw failed rows for failed tests, to help users triage and understand the problem.
To avoid this sampling, set the var `test_sample_row_count: 0` in your `dbt_project.yml` (default is 5 sample rows).
To avoid this sampling, set the var `test_sample_row_count: 0` in your `dbt_project.yml` (default is 5 sample rows). You can also disable samples for specific tests, protect PII-tagged tables, and request environment-level controls. See [Test Result Samples](/data-tests/test-result-samples) for all available options.

| Information | Details | Usage |
| ------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
168 changes: 168 additions & 0 deletions docs/data-tests/test-result-samples.mdx
@@ -0,0 +1,168 @@
---
title: "Test Result Samples"
sidebarTitle: "Test Result Samples"
---

When a test fails, Elementary captures a sample of the failing rows and stores them in the `test_result_rows` table. These samples help you quickly understand and investigate data issues without manually running queries.

By default, Elementary saves **5 sample rows per failed test**.

This page describes all the available controls for managing test result samples -- both self-service configuration in your dbt project and options available through the Elementary team for Cloud users.

## Configuring sample size

### Global setting

Set the number of sample rows saved per failed test across your entire project by adding the `test_sample_row_count` variable to your `dbt_project.yml`:

```yaml
vars:
  test_sample_row_count: 10
```

Or pass it as a flag when running dbt:

```shell
dbt test --vars '{"test_sample_row_count": 10}'
```

Set to `0` to disable sample collection entirely:

```yaml
vars:
  test_sample_row_count: 0
```

<Warning>
The larger the number of rows you save, the more data you will store in your data warehouse. This can affect the performance and cost of your Elementary schema, depending on your database.
</Warning>

### Per-test override

You can override the global sample size for individual tests using the `test_sample_row_count` meta configuration:

```yaml
models:
  - name: orders
    data_tests:
      - unique:
          config:
            meta:
              test_sample_row_count: 20  # Save more samples for this specific test
      - not_null:
          column_name: order_id
          config:
            meta:
              test_sample_row_count: 0  # Disable samples for this test
```

The per-test setting takes precedence over the global variable.

## Disabling samples for specific tests

Use the `disable_test_samples` meta configuration to completely disable sample collection for a specific test:

```yaml
models:
  - name: user_profiles
    data_tests:
      - elementary.volume_anomalies:
          config:
            meta:
              disable_test_samples: true
```

## PII protection

Elementary provides built-in protection for sensitive data by automatically disabling test sample collection for tables tagged as PII.

### Enable PII protection

Add these variables to your `dbt_project.yml`:

```yaml
vars:
  disable_samples_on_pii_tags: true  # Enable PII protection (default: false)
  pii_tags: ['pii', 'sensitive']     # Tags that identify PII tables (default: ['pii'])
```

### Tag tables as PII

Tag individual models:

```yaml
models:
  - name: customer_data
    config:
      tags: ['pii']
```

Or tag entire directories:

```yaml
# dbt_project.yml
models:
  my_project:
    sensitive_data:
      +tags: ['pii']
```

PII tag matching is **case-insensitive** -- `PII`, `pii`, and `Pii` are all equivalent.
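
As a rough illustration of what case-insensitive matching means here, the Python sketch below compares lowercased tags. The helper name `has_pii_tag` is hypothetical and this is not Elementary's implementation; it only mirrors the behavior described above, with the `pii_tags` default taken from this page.

```python
from typing import Iterable


def has_pii_tag(model_tags: Iterable[str], pii_tags: Iterable[str] = ("pii",)) -> bool:
    """Return True if any model tag matches a configured PII tag, ignoring case.

    Hypothetical helper for illustration only; not part of Elementary.
    """
    # Lowercase the configured tags once, then compare lowercased model tags.
    configured = {tag.lower() for tag in pii_tags}
    return any(tag.lower() in configured for tag in model_tags)
```

Under this sketch, a model tagged `PII` or `Pii` matches the default `pii_tags`, while a model tagged only `finance` does not.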

### Override PII protection for specific tests

If a table is tagged as PII but you want to allow samples for a specific test, set `disable_test_samples: false` in that test's meta:

```yaml
models:
  - name: customer_data
    config:
      tags: ['pii']
    data_tests:
      - elementary.volume_anomalies:
          config:
            meta:
              disable_test_samples: false  # Allow samples despite PII tag
```

## Configuration precedence

When multiple settings apply, Elementary follows this order (highest priority first):

1. **`disable_test_samples` in test meta** -- per-test on/off switch
2. **`test_sample_row_count` in test meta** -- per-test sample size
3. **PII tag detection** -- when `disable_samples_on_pii_tags: true` and the table has a matching tag
4. **`test_sample_row_count` global var** -- project-wide sample size
5. **Default** -- 5 rows
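
The order above can be read as a small decision procedure. The Python sketch below is one possible interpretation, not Elementary's actual code: the function name `resolve_sample_row_count` and its parameters are hypothetical. It also assumes that an explicit `disable_test_samples: false` skips the PII check and falls through to the configured sample size.

```python
from typing import Any, Mapping

DEFAULT_SAMPLE_ROWS = 5  # default stated on this page


def resolve_sample_row_count(
    test_meta: Mapping[str, Any],
    project_vars: Mapping[str, Any],
    table_has_pii_tag: bool,
) -> int:
    """Return how many sample rows to save for one failed test (illustrative only)."""
    disable = test_meta.get("disable_test_samples")

    # 1. Per-test on/off switch wins over everything else.
    if disable is True:
        return 0

    # 2. Per-test sample size.
    per_test = test_meta.get("test_sample_row_count")
    if per_test is not None:
        return int(per_test)

    # 3. PII tag detection, unless the test explicitly opted back in
    #    with disable_test_samples: false (assumption, see lead-in).
    pii_protection = bool(project_vars.get("disable_samples_on_pii_tags", False))
    if disable is None and pii_protection and table_has_pii_tag:
        return 0

    # 4. Global var, falling back to 5. the default.
    return int(project_vars.get("test_sample_row_count", DEFAULT_SAMPLE_ROWS))
```

For example, with `disable_samples_on_pii_tags: true` and a PII-tagged table, a test with no meta resolves to 0 rows, while a test with `test_sample_row_count: 20` in its meta resolves to 20.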

## Elementary Cloud: additional controls

For Elementary Cloud users, the Elementary team can enable additional environment-level controls.

<Note>
The controls below are managed by Elementary and apply to how test samples are handled after they are synced from your data warehouse. To request changes, contact the Elementary team via Slack or email.
</Note>

### Disable test samples for an environment

The Elementary team can disable test samples entirely for a specific environment. When this control is turned on:
- Test samples will **not be synced** from your Elementary schema.
- Test samples will **not appear** in the UI or in alerts, even if they exist in your warehouse.

This is useful for environments that contain highly sensitive data where no sample rows should ever leave the warehouse.

### Skip database storage of sample rows

The Elementary team can configure an environment so that the `test_result_rows` data is stored only in the data lake (S3) and **not loaded into the application database**. This reduces database size while keeping the raw data available for debugging if needed.

## Summary of all controls

| Control | Scope | Where to configure | Default |
| --- | --- | --- | --- |
| `test_sample_row_count` | Global | `dbt_project.yml` vars | `5` |
| `test_sample_row_count` | Per-test | Test meta | Inherits global |
| `disable_test_samples` | Per-test | Test meta | `false` |
| `disable_samples_on_pii_tags` | Global | `dbt_project.yml` vars | `false` |
| `pii_tags` | Global | `dbt_project.yml` vars | `['pii']` |
| Disable samples for environment | Per-environment | Contact Elementary team | Off (samples are synced) |
| Skip DB storage of sample rows | Per-environment | Contact Elementary team | Off (rows are loaded into the application database) |
3 changes: 2 additions & 1 deletion docs/docs.json
@@ -349,7 +349,8 @@
"data-tests/dbt/reduce-on-run-end-time"
]
},
"data-tests/dbt/package-models"
"data-tests/dbt/package-models",
"data-tests/test-result-samples"
]
},
{
22 changes: 14 additions & 8 deletions docs/snippets/faq/question-test-results-sample.mdx
@@ -1,22 +1,28 @@
<Accordion title="Can I see more result samples in the report?">
<Accordion title="Can I control the test result samples?">

Yes you can!
Yes! Elementary saves samples of failed test rows and stores them in the table `test_result_rows`, then displays them in the *Results* tab of the report.

Elementary saves samples of failed test rows and stores them in the table `test_result_rows`, then displays them in the *Results* tab of the report.
By default, Elementary saves **5 rows per test**, but you have several options to control this:

By default, Elementary saves 5 rows per test, but you can change this number by setting the variable `test_sample_row_count` to the number of rows you want to save. For example, to save 10 rows per test, add the following to your `dbt_project.yml` file:
- **Change the sample size** globally or per-test using `test_sample_row_count`
- **Disable samples** for specific tests using `disable_test_samples` in the test meta
- **Protect PII** by automatically disabling samples for tables tagged with sensitive data tags
- **Elementary Cloud users** can also request environment-level controls from the Elementary team

For example, to save 10 rows per test, add the following to your `dbt_project.yml` file:

```yaml
vars:
  test_sample_row_count: 10
```

Or use the `--vars` flag when you run `dbt test`:

```shell
dbt test --vars '{"test_sample_row_count": 10}'
```

To disable samples entirely, set it to `0`:

```yaml
vars:
  test_sample_row_count: 0
```

***NOTE***: The larger the number of rows you save, the more data you will store in your database. This can affect the performance and cost, depending on your database.
For the full list of controls including per-test overrides, PII protection, and Cloud options, see [Test Result Samples](/data-tests/test-result-samples).

</Accordion>