A dbt package that captures metadata about your dbt project and stores it in warehouse tables. This enables Recce to perform cross-environment data validation without requiring local artifact files.
Add to your packages.yml:

    packages:
      - git: "https://github.com/DataRecce/recce-dbt-package.git"
        revision: main

Then run:

    dbt deps

Add to your dbt_project.yml:

    vars:
      # Schema where recce metadata tables will be created
      recce_schema: recce_metadata

    # Hook to capture metadata after each run
    on-run-end:
      - "{{ recce.upload_metadata() }}"

Note: If recce_schema is not set, tables will be created in your target schema (e.g., main for DuckDB, or your configured schema for Snowflake/BigQuery).
Run dbt to create the metadata tables and start capturing data:

    # First run creates the tables
    dbt run -s recce

    # Subsequent runs will capture metadata via on-run-end hook
    dbt run

The on-run-end hook captures metadata after each dbt run:
- Invocation metadata - Timestamp, dbt version, adapter type, git info, CI context
- Node metadata - All models, sources, seeds, snapshots, exposures, metrics
- Run results - Execution status, timing, and row counts for each node
Column information is queried directly from information_schema when Recce connects to the warehouse, ensuring you always have current schema details.
| Table | Description |
|---|---|
| recce_invocations | Run context - invocation ID, timestamp, dbt version, adapter, git SHA/branch, CI metadata |
| recce_nodes_dbt | All nodes - unique_id, name, resource_type, depends_on, raw_code, checksum |
| recce_run_results_dbt | Run results - status, execution_time, rows_affected, message |

| Variable | Default | Description |
|---|---|---|
| recce_schema | target.schema | Schema for metadata tables |
| recce_database | target.database | Database for metadata tables (optional) |
| disable_recce_metadata_upload | false | Set to true to disable automatic metadata capture |
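Because these are ordinary dbt variables, they can presumably also be overridden for a single invocation with dbt's --vars flag instead of editing dbt_project.yml. The sketch below assumes the package reads disable_recce_metadata_upload through dbt's standard var() lookup; the wrapper name dbt_run_without_recce is hypothetical and not part of the package.

```shell
# Hypothetical wrapper (not provided by the package): run dbt once with
# Recce's metadata capture switched off via dbt's --vars flag.
# Assumes the on-run-end hook honours the variable when it is set on the CLI.
dbt_run_without_recce() {
  dbt run --vars '{"disable_recce_metadata_upload": true}' "$@"
}
```

For example, `dbt_run_without_recce -s my_model` would run a single model without writing to the metadata tables, leaving the project-level default untouched.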
Use a custom schema:

    vars:
      recce_schema: recce_metadata

Use a custom schema and database:

    vars:
      recce_schema: recce_metadata
      recce_database: ANALYTICS_DB

Disable automatic metadata capture:

    vars:
      disable_recce_metadata_upload: true

The package automatically detects CI environments and captures relevant metadata:
| CI Platform | Detection | Captured Metadata |
|---|---|---|
| dbt Cloud | DBT_CLOUD_RUN_ID | run_id, job_id, project_id |
| GitHub Actions | GITHUB_ACTIONS=true | run_id, run_number, workflow, repository |
| GitLab CI | GITLAB_CI=true | job_id, pipeline_id, project_path |
| CircleCI | CIRCLECI=true | build_num, workflow_id, project_reponame |
| Jenkins | JENKINS_URL | build_number, job_name, build_url |
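To check which platform would be detected in a given environment, the Detection column above can be mirrored in a few lines of shell. This is only an illustrative sketch: the function name detect_ci_platform is hypothetical, and the order in which the checks run is an assumption, since the package may prioritise platforms differently when several variables are set.

```shell
# Hypothetical helper mirroring the Detection column of the table above.
# The check order is an assumption, not the package's documented behaviour.
detect_ci_platform() {
  if [ -n "${DBT_CLOUD_RUN_ID:-}" ]; then
    echo "dbt Cloud"
  elif [ "${GITHUB_ACTIONS:-}" = "true" ]; then
    echo "GitHub Actions"
  elif [ "${GITLAB_CI:-}" = "true" ]; then
    echo "GitLab CI"
  elif [ "${CIRCLECI:-}" = "true" ]; then
    echo "CircleCI"
  elif [ -n "${JENKINS_URL:-}" ]; then
    echo "Jenkins"
  else
    echo "none"
  fi
}

echo "Detected CI platform: $(detect_ci_platform)"
```

Running this in a CI job before dbt run is a quick way to confirm that the expected variables are actually exported to the step that invokes dbt.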
Git information is captured from environment variables:
- GIT_SHA, GITHUB_SHA, or CI_COMMIT_SHA
- GIT_BRANCH, GITHUB_REF_NAME, or CI_COMMIT_BRANCH
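The fallback between these variables can be previewed with shell parameter expansion. The sketch below assumes the variables are tried in the order listed (generic GIT_* first, then GitHub Actions, then GitLab CI) and that an empty value counts as unset; the package's actual precedence may differ, and the function names are hypothetical.

```shell
# Hypothetical helpers: first non-empty value wins.
# The precedence follows the listing order above, which is an assumption.
resolve_git_sha() {
  printf '%s\n' "${GIT_SHA:-${GITHUB_SHA:-${CI_COMMIT_SHA:-}}}"
}

resolve_git_branch() {
  printf '%s\n' "${GIT_BRANCH:-${GITHUB_REF_NAME:-${CI_COMMIT_BRANCH:-}}}"
}

echo "sha=$(resolve_git_sha) branch=$(resolve_git_branch)"
```

On a CI system that sets none of these, exporting GIT_SHA and GIT_BRANCH before calling dbt run (for example `export GIT_SHA="$(git rev-parse HEAD)"`) should make the git details available to the hook.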
| Warehouse | Status |
|---|---|
| DuckDB | Supported |
| Snowflake | Supported |
| PostgreSQL | Supported |
| BigQuery | Experimental |
| Redshift | Experimental |
After installing this package and running dbt run, Recce can read metadata from the warehouse instead of local artifact files:

    # Instead of:
    recce server --base-manifest target-base/manifest.json

    # Use:
    recce server --warehouse-metadata

License: Apache 2.0