Skip to content

Dags that are stale during Asset update have inaccurate asset scheduling #59337

@collinmcnulty

Description

@collinmcnulty

Apache Airflow version

2.11.0

If "Other Airflow 2/3 version" selected, which one?

No response

What happened?

A dynamically generated dag that uses Asset scheduling sometimes has a brief issue where it is not parsing correctly and Airflow marks it as stale. During the time that it is stale, one of the Assets that it needs to schedule is updated. When it starts parsing correctly again, it does not run, and instead waits until that Asset is updated again. However, this updated Asset does show in the "Dataset updates that triggered your dag run" section of the UI.

What you think should happen instead?

Instead, the Asset updates should still be counted against the stale dag such that when it is no longer stale, it can know that it can run immediately.

How to reproduce

  • create a dag that uses asset scheduling
  • introduce a crash bug so that the dag goes stale
  • update an asset that the dag uses to schedule
  • remove the crash bug
  • observe that the dag does not start

Operating System

debian bullseye

Versions of Apache Airflow Providers

No response

Deployment

Astronomer

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    area:Schedulerincluding HA (high availability) schedulerarea:corekind:bugThis is a clearly a bug

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions