Skip to content

Conversation

@github-actions
Copy link

When TriggerDagRunOperator is used with deferrable=True, wait_for_completion=True,
reset_dag_run=True, and a fixed trigger_run_id, the operator becomes permanently
stuck in deferred state after clearing and re-running.

Root cause:
When reset_dag_run=True is used with a fixed run_id, the database preserves the
original logical_date from the first run. However, on subsequent runs after clearing,
the operator calculates a NEW logical_date based on the current time. The DagStateTrigger
was being created with this newly calculated logical_date, causing a mismatch when
querying the database - the trigger looked for a DAG run with the new logical_date
but the database contained the original logical_date, causing the query to return
zero results indefinitely.

Solution:

  • Modified _handle_trigger_dag_run() in task_runner.py to pass execution_dates=None
    to DagStateTrigger when run_ids is provided, since run_id alone is sufficient and
    globally unique
  • Added test test_handle_trigger_dag_run_deferred_with_reset_uses_run_id_only to
    verify the fix and prevent regression

The fix ensures that both deferrable and non-deferrable modes use identical logic
for determining DAG run completion - querying by run_id and state only, without
filtering by logical_date which can become stale when resets are involved.
(cherry picked from commit 4f3d0c5)

Co-authored-by: Mykola Shyshov mykola.shyshov@gmail.com

…eset_dag_run (#57756) (#57968)

When TriggerDagRunOperator is used with deferrable=True, wait_for_completion=True,
  reset_dag_run=True, and a fixed trigger_run_id, the operator becomes permanently
  stuck in deferred state after clearing and re-running.

  Root cause:
  When reset_dag_run=True is used with a fixed run_id, the database preserves the
  original logical_date from the first run. However, on subsequent runs after clearing,
  the operator calculates a NEW logical_date based on the current time. The DagStateTrigger
  was being created with this newly calculated logical_date, causing a mismatch when
  querying the database - the trigger looked for a DAG run with the new logical_date
  but the database contained the original logical_date, causing the query to return
  zero results indefinitely.

  Solution:
  - Modified _handle_trigger_dag_run() in task_runner.py to pass execution_dates=None
    to DagStateTrigger when run_ids is provided, since run_id alone is sufficient and
    globally unique
  - Added test test_handle_trigger_dag_run_deferred_with_reset_uses_run_id_only to
    verify the fix and prevent regression

  The fix ensures that both deferrable and non-deferrable modes use identical logic
  for determining DAG run completion - querying by run_id and state only, without
  filtering by logical_date which can become stale when resets are involved.
(cherry picked from commit 4f3d0c5)

Co-authored-by: Mykola Shyshov <mykola.shyshov@gmail.com>
@jscheffl jscheffl marked this pull request as ready for review November 14, 2025 20:52
@jscheffl jscheffl merged commit da4a5ea into v3-1-test Nov 14, 2025
58 checks passed
@jscheffl jscheffl deleted the backport-4f3d0c5-v3-1-test branch November 14, 2025 21:51
ephraimbuddy pushed a commit that referenced this pull request Nov 18, 2025
…eset_dag_run (#57756) (#57968) (#58333)

When TriggerDagRunOperator is used with deferrable=True, wait_for_completion=True,
  reset_dag_run=True, and a fixed trigger_run_id, the operator becomes permanently
  stuck in deferred state after clearing and re-running.

  Root cause:
  When reset_dag_run=True is used with a fixed run_id, the database preserves the
  original logical_date from the first run. However, on subsequent runs after clearing,
  the operator calculates a NEW logical_date based on the current time. The DagStateTrigger
  was being created with this newly calculated logical_date, causing a mismatch when
  querying the database - the trigger looked for a DAG run with the new logical_date
  but the database contained the original logical_date, causing the query to return
  zero results indefinitely.

  Solution:
  - Modified _handle_trigger_dag_run() in task_runner.py to pass execution_dates=None
    to DagStateTrigger when run_ids is provided, since run_id alone is sufficient and
    globally unique
  - Added test test_handle_trigger_dag_run_deferred_with_reset_uses_run_id_only to
    verify the fix and prevent regression

  The fix ensures that both deferrable and non-deferrable modes use identical logic
  for determining DAG run completion - querying by run_id and state only, without
  filtering by logical_date which can become stale when resets are involved.
(cherry picked from commit 4f3d0c5)

Co-authored-by: Mykola Shyshov <mykola.shyshov@gmail.com>
ephraimbuddy pushed a commit that referenced this pull request Nov 19, 2025
…eset_dag_run (#57756) (#57968) (#58333)

When TriggerDagRunOperator is used with deferrable=True, wait_for_completion=True,
  reset_dag_run=True, and a fixed trigger_run_id, the operator becomes permanently
  stuck in deferred state after clearing and re-running.

  Root cause:
  When reset_dag_run=True is used with a fixed run_id, the database preserves the
  original logical_date from the first run. However, on subsequent runs after clearing,
  the operator calculates a NEW logical_date based on the current time. The DagStateTrigger
  was being created with this newly calculated logical_date, causing a mismatch when
  querying the database - the trigger looked for a DAG run with the new logical_date
  but the database contained the original logical_date, causing the query to return
  zero results indefinitely.

  Solution:
  - Modified _handle_trigger_dag_run() in task_runner.py to pass execution_dates=None
    to DagStateTrigger when run_ids is provided, since run_id alone is sufficient and
    globally unique
  - Added test test_handle_trigger_dag_run_deferred_with_reset_uses_run_id_only to
    verify the fix and prevent regression

  The fix ensures that both deferrable and non-deferrable modes use identical logic
  for determining DAG run completion - querying by run_id and state only, without
  filtering by logical_date which can become stale when resets are involved.
(cherry picked from commit 4f3d0c5)

Co-authored-by: Mykola Shyshov <mykola.shyshov@gmail.com>
ephraimbuddy pushed a commit that referenced this pull request Nov 19, 2025
…eset_dag_run (#57756) (#57968) (#58333)

When TriggerDagRunOperator is used with deferrable=True, wait_for_completion=True,
  reset_dag_run=True, and a fixed trigger_run_id, the operator becomes permanently
  stuck in deferred state after clearing and re-running.

  Root cause:
  When reset_dag_run=True is used with a fixed run_id, the database preserves the
  original logical_date from the first run. However, on subsequent runs after clearing,
  the operator calculates a NEW logical_date based on the current time. The DagStateTrigger
  was being created with this newly calculated logical_date, causing a mismatch when
  querying the database - the trigger looked for a DAG run with the new logical_date
  but the database contained the original logical_date, causing the query to return
  zero results indefinitely.

  Solution:
  - Modified _handle_trigger_dag_run() in task_runner.py to pass execution_dates=None
    to DagStateTrigger when run_ids is provided, since run_id alone is sufficient and
    globally unique
  - Added test test_handle_trigger_dag_run_deferred_with_reset_uses_run_id_only to
    verify the fix and prevent regression

  The fix ensures that both deferrable and non-deferrable modes use identical logic
  for determining DAG run completion - querying by run_id and state only, without
  filtering by logical_date which can become stale when resets are involved.
(cherry picked from commit 4f3d0c5)

Co-authored-by: Mykola Shyshov <mykola.shyshov@gmail.com>
ephraimbuddy pushed a commit that referenced this pull request Nov 20, 2025
…eset_dag_run (#57756) (#57968) (#58333)

When TriggerDagRunOperator is used with deferrable=True, wait_for_completion=True,
  reset_dag_run=True, and a fixed trigger_run_id, the operator becomes permanently
  stuck in deferred state after clearing and re-running.

  Root cause:
  When reset_dag_run=True is used with a fixed run_id, the database preserves the
  original logical_date from the first run. However, on subsequent runs after clearing,
  the operator calculates a NEW logical_date based on the current time. The DagStateTrigger
  was being created with this newly calculated logical_date, causing a mismatch when
  querying the database - the trigger looked for a DAG run with the new logical_date
  but the database contained the original logical_date, causing the query to return
  zero results indefinitely.

  Solution:
  - Modified _handle_trigger_dag_run() in task_runner.py to pass execution_dates=None
    to DagStateTrigger when run_ids is provided, since run_id alone is sufficient and
    globally unique
  - Added test test_handle_trigger_dag_run_deferred_with_reset_uses_run_id_only to
    verify the fix and prevent regression

  The fix ensures that both deferrable and non-deferrable modes use identical logic
  for determining DAG run completion - querying by run_id and state only, without
  filtering by logical_date which can become stale when resets are involved.
(cherry picked from commit 4f3d0c5)

Co-authored-by: Mykola Shyshov <mykola.shyshov@gmail.com>
@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Dec 1, 2025
@ephraimbuddy ephraimbuddy added this to the Airflow 3.1.4 milestone Dec 2, 2025
ephraimbuddy pushed a commit that referenced this pull request Dec 3, 2025
…eset_dag_run (#57756) (#57968) (#58333)

When TriggerDagRunOperator is used with deferrable=True, wait_for_completion=True,
  reset_dag_run=True, and a fixed trigger_run_id, the operator becomes permanently
  stuck in deferred state after clearing and re-running.

  Root cause:
  When reset_dag_run=True is used with a fixed run_id, the database preserves the
  original logical_date from the first run. However, on subsequent runs after clearing,
  the operator calculates a NEW logical_date based on the current time. The DagStateTrigger
  was being created with this newly calculated logical_date, causing a mismatch when
  querying the database - the trigger looked for a DAG run with the new logical_date
  but the database contained the original logical_date, causing the query to return
  zero results indefinitely.

  Solution:
  - Modified _handle_trigger_dag_run() in task_runner.py to pass execution_dates=None
    to DagStateTrigger when run_ids is provided, since run_id alone is sufficient and
    globally unique
  - Added test test_handle_trigger_dag_run_deferred_with_reset_uses_run_id_only to
    verify the fix and prevent regression

  The fix ensures that both deferrable and non-deferrable modes use identical logic
  for determining DAG run completion - querying by run_id and state only, without
  filtering by logical_date which can become stale when resets are involved.
(cherry picked from commit 4f3d0c5)

Co-authored-by: Mykola Shyshov <mykola.shyshov@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:task-sdk type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants