Description
Apache Airflow version
2.4.1
What happened
When using dataset scheduling, it isn't obvious which datasets a downstream consumer DAG is still waiting on before it can be scheduled.
I would assume this is what the Latest Update column is for, in the modal that opens when selecting x of y datasets updated, but the column isn't being populated.
Although one of the datasets has been produced, there is no data in the Latest Update column of the modal.
In the above example, both datasets have been produced more than once.
What you think should happen instead
The Latest Update column should be populated with the latest update timestamp for each dataset required to schedule a downstream, dataset-consuming DAG.
Ideally there would be some form of highlighting on the "missing" datasets for quick visual feedback when DAGs have a large number of datasets required for scheduling.
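To make the expected behaviour concrete, here is a minimal plain-Python sketch of the rows I would expect the modal to show. It is not Airflow code: the `dataset_status` function, the example URIs, and the shape of the `latest_updates` mapping are all hypothetical and only illustrate the idea of a populated Latest Update value plus a "missing" flag.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple


def dataset_status(
    required_uris: List[str],
    latest_updates: Dict[str, datetime],
) -> List[Tuple[str, Optional[datetime], bool]]:
    """Return one (uri, latest_update, is_missing) row per required dataset.

    Hypothetical helper: `latest_updates` is assumed to hold only the
    updates received since the consumer DAG last ran.
    """
    rows = []
    for uri in required_uris:
        latest = latest_updates.get(uri)  # None when no update has arrived yet
        rows.append((uri, latest, latest is None))
    return rows


# Two required datasets, only one of which has been updated.
rows = dataset_status(
    ["s3://example-bucket/dataset_a.csv", "s3://example-bucket/dataset_b.csv"],
    {"s3://example-bucket/dataset_a.csv": datetime(2022, 10, 1, 12, 0)},
)
missing = [uri for uri, _, is_missing in rows if is_missing]
```

Here `missing` contains only the `dataset_b` URI, which is the row that would be highlighted.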
How to reproduce
- Create a DAG (or 2 individual DAGs) that produces 2 datasets
- Produce both datasets
- Then produce only one dataset
- Check the modal by clicking the x of y datasets updated button on the home screen.
Operating System
Debian GNU/Linux 11 (bullseye)
Versions of Apache Airflow Providers
No response
Deployment
Docker-Compose
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct


