Aggregate ingestion errors over reingestion #1379

Closed

Problem

If a DAG has an expected ingestion error, we can set skip_ingestion_errors so that it skips over such errors during ingestion and reports them once, in aggregate, at the end of ingestion.

There is no equivalent configuration for skipping errors during reingestion. Very often when there is an error in a reingestion workflow, it occurs for multiple ingestion days, and a separate Slack notification is received for each one. This can have the effect of flooding the Slack channel and obscuring other alerts.

We should instead aggregate errors that occur over the ingestion days and report them in a single Slack notification at the end of reingestion.

Description

A preliminary idea: we could pass some context into the ProviderDataIngester so that it knows when it is running as part of a reingestion workflow. Then, if some Airflow configuration variable is set, it reports errors via XCom instead of sending a Slack notification for each one. In report_load_completion, we aggregate the errors from all ingestion days and report them in a single notification.
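A rough sketch of that idea in plain Python, to make the data flow concrete. Everything here is hypothetical: `run_ingestion_day`, `summarize_reingestion_errors`, the `aggregate_errors` flag, and the payload shape are illustrative names, not existing catalog code. It assumes each per-day task returns its payload (Airflow pushes a task's return value to XCom by default) and that the final reporting step can collect those payloads, for example with `ti.xcom_pull(task_ids=[...])`.

```python
def run_ingestion_day(date, ingest, aggregate_errors):
    """Run ingestion for one day and return a small, XCom-friendly payload.

    `ingest` is any callable taking the date (hypothetical stand-in for the
    ingester). When `aggregate_errors` is True, an exception is captured in
    the payload instead of being raised, so no per-day alert is triggered.
    """
    try:
        ingest(date)
    except Exception as err:
        if not aggregate_errors:
            # Current behaviour: let the task fail and alert on its own.
            raise
        return {"date": date, "error": f"{type(err).__name__}: {err}"}
    return {"date": date, "error": None}


def summarize_reingestion_errors(payloads):
    """Build one message from all per-day payloads, or None if nothing failed.

    In a DAG, `payloads` would be what the reporting task pulls from XCom;
    entries that are None (e.g. skipped tasks) are ignored.
    """
    payloads = [p for p in payloads if p is not None]
    failed = sorted(
        (p for p in payloads if p["error"]), key=lambda p: p["date"]
    )
    if not failed:
        return None
    lines = [f"{len(failed)} of {len(payloads)} ingestion days failed:"]
    lines += [f"  {p['date']}: {p['error']}" for p in failed]
    return "\n".join(lines)
```

The reporting step would then send the returned string as a single Slack message, and skip the notification entirely when it is None. Keeping the payload to a date and a short error string avoids pushing large tracebacks through XCom.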

Implementation

  • 🙋 I would be interested in implementing this feature.

Metadata

Labels

  • ✨ goal: improvement (Improvement to an existing user-facing feature)
  • 💻 aspect: code (Concerns the software code in the repository)
  • 🟨 priority: medium (Not blocking but should be addressed soon)
  • 🧱 stack: catalog (Related to the catalog and Airflow DAGs)

Projects

  • Status

    ✅ Done
