Description
Apache Airflow version
2.7.0
What happened
Relates to investigation of #30327
When a DAG is defined with schedule=EventsTimetable() and catchup=False, all planned events are executed, even those in the past.
Note: If this is a feature rather than a bug, the docs should mention that EventsTimetable ignores the catchup flag. Currently they do not.
What you think should happen instead
With catchup=False, I would have expected that no past events are scheduled, or at most the one matching the last occurrence.
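To make the expectation concrete, here is a minimal sketch in plain Python. It does not use or reproduce any Airflow internals: `due_events` is a hypothetical helper, and the "at most the latest past event" rule is just the more permissive of the two outcomes described above.

```python
from datetime import datetime, timezone
from typing import List


def due_events(event_dates: List[datetime], now: datetime, catchup: bool) -> List[datetime]:
    """Hypothetical helper (not an Airflow API): which past events should run at `now`."""
    past = [d for d in sorted(event_dates) if d <= now]
    if catchup:
        # With catchup enabled, every missed event gets a run.
        return past
    # Without catchup, run at most the most recent past event.
    return past[-1:]


EVENTS = [
    datetime(2023, 8, 1, 6, 0, tzinfo=timezone.utc),
    datetime(2023, 8, 2, 6, 0, tzinfo=timezone.utc),
    datetime(2023, 8, 3, 6, 0, tzinfo=timezone.utc),
]
NOW = datetime(2023, 8, 10, tzinfo=timezone.utc)

print(due_events(EVENTS, NOW, catchup=False))  # expected: only the 2023-08-03 event
print(due_events(EVENTS, NOW, catchup=True))   # all three past events
```

The observed behavior corresponds to the catchup=True branch even though the DAG sets catchup=False.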
How to reproduce
Set up a test DAG with an events timetable like the following example and enable scheduling. After a moment, runs are created for all past event dates.
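import pendulum

from airflow import DAG
from airflow.decorators import task
from airflow.models.taskinstance import TaskInstance
from airflow.timetables.events import EventsTimetable

with DAG(
    dag_id="after_workday_events_regression",
    start_date=pendulum.datetime(2023, 8, 1, tz="UTC"),
    catchup=False,  # past events are still executed despite this flag
    schedule=EventsTimetable(
        event_dates=[
            pendulum.datetime(2023, 8, 1, 1, 0, tz="America/Chicago"),
            pendulum.datetime(2023, 8, 2, 1, 0, tz="America/Chicago"),
            pendulum.datetime(2023, 8, 3, 1, 0, tz="America/Chicago"),
        ],
        description="My Team's Baseball Games",
        restrict_to_events=False,
    ),
    params={"test": 123},
):

    @task
    def test(ti: TaskInstance = None):
        # Print the logical date of each run to see which events were scheduled.
        print(ti.execution_date)

    test()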
Operating System
Ubuntu 20.04 / Breeze Dev setup in Py 3.8 Container
Versions of Apache Airflow Providers
Not relevant.
Deployment
Other
Deployment details
Started and tested with the latest main branch and Breeze.
Anything else
I tried to find the root cause but was not able to locate it. I suspect it is rooted in scheduler_job_runner.py.
I would supply a fix, but since I was not able to find the root cause, expert knowledge is probably needed.
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct