[v3-1-test] Fix: TriggerDagRunOperator stuck in deferred state with reset_dag_run (#57756) (#57968) #58333

github-actions · 2025-11-14T20:51:29Z

When TriggerDagRunOperator is used with deferrable=True, wait_for_completion=True,
reset_dag_run=True, and a fixed trigger_run_id, the operator becomes permanently
stuck in deferred state after clearing and re-running.

Root cause:
When reset_dag_run=True is used with a fixed run_id, the database preserves the
original logical_date from the first run. However, on subsequent runs after clearing,
the operator calculates a NEW logical_date based on the current time. The DagStateTrigger
was being created with this newly calculated logical_date, causing a mismatch when
querying the database - the trigger looked for a DAG run with the new logical_date
but the database contained the original logical_date, causing the query to return
zero results indefinitely.
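
The mismatch can be reproduced outside Airflow with a small simulation. This is a minimal sketch, not Airflow code: `DagRunRow` and `count_matching_runs` are hypothetical stand-ins for the stored DAG run and the trigger's query, and the dates and IDs are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class DagRunRow:
    """Hypothetical stand-in for a stored DAG run row (not Airflow's model)."""

    dag_id: str
    run_id: str
    logical_date: datetime
    state: str


def count_matching_runs(rows, dag_id, run_ids, states, logical_dates=None):
    """Count rows matching the filters; logical_dates=None skips the date filter."""
    return sum(
        1
        for row in rows
        if row.dag_id == dag_id
        and row.run_id in run_ids
        and row.state in states
        and (logical_dates is None or row.logical_date in logical_dates)
    )


# The database keeps the logical_date from the first run.
original = datetime(2025, 11, 1, tzinfo=timezone.utc)
# After clearing, the operator computes a new logical_date from the current time.
recomputed = datetime(2025, 11, 2, tzinfo=timezone.utc)

rows = [DagRunRow("child_dag", "fixed_run", original, "success")]
terminal_states = ["success", "failed"]

# Filtering by the recomputed date never matches the stored row.
stale = count_matching_runs(rows, "child_dag", ["fixed_run"], terminal_states, [recomputed])
# Filtering by run_id and state alone finds it.
by_run_id = count_matching_runs(rows, "child_dag", ["fixed_run"], terminal_states)

print(stale, by_run_id)
```

In this sketch the date-filtered query keeps returning zero rows, which corresponds to the trigger never firing, while the run_id-only query sees the finished run.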

Solution:

- Modified _handle_trigger_dag_run() in task_runner.py to pass execution_dates=None
  to DagStateTrigger when run_ids is provided, since run_id alone is sufficient and
  unique within a DAG
- Added test test_handle_trigger_dag_run_deferred_with_reset_uses_run_id_only to
  verify the fix and prevent regression
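
The shape of the change can be sketched as follows. This is an illustration under stated assumptions, not the code from task_runner.py: `build_trigger_filters` is a hypothetical helper, and the fallback branch for a missing run_id is an assumption. Only the `run_ids` and `execution_dates` names come from the description above.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def build_trigger_filters(
    dag_id: str,
    run_id: Optional[str],
    logical_date: Optional[datetime],
) -> dict[str, Any]:
    """Hypothetical helper: choose the filters handed to the trigger."""
    if run_id:
        # run_id identifies the run on its own, so the date filter is dropped
        # and a stale logical_date cannot cause a mismatch.
        return {"dag_id": dag_id, "run_ids": [run_id], "execution_dates": None}
    # Assumed fallback when no run_id is known: filter by date instead.
    return {
        "dag_id": dag_id,
        "run_ids": None,
        "execution_dates": [logical_date] if logical_date else None,
    }


recomputed = datetime(2025, 11, 14, tzinfo=timezone.utc)
with_run_id = build_trigger_filters("child_dag", "fixed_run", recomputed)
without_run_id = build_trigger_filters("child_dag", None, recomputed)

print(with_run_id["execution_dates"])
print(without_run_id["execution_dates"])
```

A regression test in the spirit of the one named above would assert that, when a run_id is supplied, the date filter passed to the trigger is None.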

The fix ensures that both deferrable and non-deferrable modes use identical logic
for determining DAG run completion - querying by run_id and state only, without
filtering by logical_date which can become stale when resets are involved.

(cherry picked from commit 4f3d0c5)

Co-authored-by: Mykola Shyshov <mykola.shyshov@gmail.com>

github-actions bot mentioned this pull request Nov 14, 2025

Fix: TriggerDagRunOperator stuck in deferred state with reset_dag_run (#57756) #57968

Merged

boring-cyborg bot added the area:task-sdk label Nov 14, 2025

jscheffl marked this pull request as ready for review November 14, 2025 20:52

jscheffl requested review from amoghrajesh, ashb and kaxil as code owners November 14, 2025 20:52

jscheffl merged commit da4a5ea into v3-1-test Nov 14, 2025
58 checks passed

jscheffl deleted the backport-4f3d0c5-v3-1-test branch November 14, 2025 21:51

ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Dec 1, 2025

ephraimbuddy added this to the Airflow 3.1.4 milestone Dec 2, 2025

ephraimbuddy mentioned this pull request Dec 4, 2025

Status of testing of Apache Airflow 3.1.4rc2 and Task SDK 1.1.4rc2 #59033

Closed

78 tasks
