Fix missing dag_id in get_task_instance (#64957) by holmuk · Pull Request #64968 · apache/airflow

holmuk · 2026-04-09T16:38:28Z

TaskInstance uniqueness is defined by the composite key (dag_id, task_id, run_id, map_index) (see the SQLAlchemy model):

airflow/airflow-core/src/airflow/models/taskinstance.py

Lines 591 to 592 in f4f48b9

    
           UniqueConstraint("dag_id", "task_id", "run_id", "map_index", name="task_instance_composite_key"), 
        
           ForeignKeyConstraint(

The problem is that the filter in get_task_instance does not include dag_id, violating this invariant. This PR fixes the issue and adds a regression test (test_get_task_instance_disambiguates_by_dag_id_and_run_id) that checks that TIs with different dag_id or run_id can be fetched without exceptions.

Was generative AI tooling used to co-author this PR?

Yes (please specify the tool below)

Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
When adding dependency, check compliance with the ASF 3rd Party License Policy.
For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

boring-cyborg · 2026-04-09T16:38:32Z

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide
Here are some useful points:

Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
Be sure to read the Airflow Coding style.
Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

Dev-iL

Correctness: Verified. All 4 existing callers of TaskInstance.get_task_instance() already pass dag_id — the method signature is unchanged. The only other similar query on the model (refresh_from_db) already included dag_id, so this was the sole gap.

Test: The tasks_with_different_dags case creates the exact collision scenario and would fail with MultipleResultsFound on pre-fix code.

One suggestion for making the regression test more self-documenting (non-blocking — see inline comment).

(Review comments generated with the help of Claude)

Vamsi-klu

The fix is correct. I traced through the code to confirm the bug is realistic and not just theoretical. The run_id for scheduled runs is generated as "scheduled__" (dagrun.py:822), and the dag_run table's unique constraint is on (dag_id, run_id), not run_id alone. So two different DAGs on the same schedule will produce identical run_id values. If they also share a task name (which is common with standardized patterns like "extract" or "load"), the old query without dag_id would match both rows and scalar_one_or_none raises MultipleResultsFound.

All four callers of this classmethod (edge executor, edge logs route, and two test utilities) pass dag_id and expect it to filter correctly, so this was a real production path.

On the test, the "tasks_with_different_dags" parametrized case is the one that actually exercises the fix. The "tasks_with_different_runs" case tests rows that already differ by run_id, which was in the filter before the fix, so that case would have passed even without the change. Might be worth adding a comment in the test noting that the second case is just a sanity check rather than a regression test for this specific bug, so future readers understand the intent.

The test name test_instance_dag_and_run_id_uniqueness reads like it is testing a database constraint. Something like test_get_task_instance_filters_by_dag_id would make it clearer that the method under test is get_task_instance and the thing being verified is that dag_id is actually used in the query.

Building on what Dev-iL suggested, adding assert found_1.dag_id == dag_id_1 alongside the id comparison would make the assertion more self-documenting. The id check proves the right row came back, but the dag_id check makes it immediately obvious what property is being verified.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes TaskInstance.get_task_instance() returning ambiguous results by ensuring the lookup filter includes dag_id, and adds a regression test to cover the composite-uniqueness behavior described in #64957.

Changes:

Add dag_id to the SQLAlchemy .filter_by(...) criteria in TaskInstance.get_task_instance().
Add a parametrized unit test that creates two TaskInstances differing by dag_id or run_id and verifies they can be fetched uniquely.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
airflow-core/src/airflow/models/taskinstance.py	Ensures `get_task_instance()` filters by `dag_id` to match the TI composite identity.
airflow-core/tests/unit/models/test_taskinstance.py	Adds a regression test validating `get_task_instance()` correctly disambiguates TaskInstances.

jscheffl

Oh, this is also a performance impact because w/o dag_id it can not use the index efficiently! LGTM!

boring-cyborg · 2026-04-11T18:27:25Z

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

(cherry picked from commit 4ecbd59) Co-authored-by: holmuk <20281580+holmuk@users.noreply.github.com>

github-actions · 2026-04-11T18:28:33Z

Backport successfully created: v3-2-test

Note: As of Merging PRs targeted for Airflow 3.X
the committer who merges the PR is responsible for backporting the PRs that are bug fixes (generally speaking) to the maintenance branches.

In matter of doubt please ask in #release-management Slack channel.

Status	Branch	Result
✅	v3-2-test

…#65067) (cherry picked from commit 4ecbd59) Co-authored-by: holmuk <20281580+holmuk@users.noreply.github.com>

holmuk requested review from XD-DENG and ashb as code owners April 9, 2026 16:38

Dev-iL requested review from eladkal, potiuk and vincbeck April 9, 2026 18:57

Dev-iL reviewed Apr 9, 2026

View reviewed changes

Comment thread airflow-core/tests/unit/models/test_taskinstance.py

Vamsi-klu reviewed Apr 10, 2026

View reviewed changes

Dev-iL mentioned this pull request Apr 10, 2026

Prevent cross-DAG task lookups from crashing the scheduler #64988

Closed

1 task

holmuk force-pushed the bugfix/ti-model-missing-dag-id branch from 2fa8fc0 to 4f8ea47 Compare April 10, 2026 06:57

Dev-iL approved these changes Apr 10, 2026

View reviewed changes

kaxil requested a review from Copilot April 10, 2026 19:55

Copilot AI reviewed Apr 10, 2026

View reviewed changes

Comment thread airflow-core/tests/unit/models/test_taskinstance.py Outdated

Comment thread airflow-core/tests/unit/models/test_taskinstance.py

holmuk mentioned this pull request Apr 11, 2026

TaskInstance.get_task_instance ignores dag_id arg, crashes EdgeExecutor scheduler loop with MultipleResultsFound #65044

Closed

2 tasks

Fix missing dag_id in get_task_instance (apache#64957)

1ed5afe

holmuk force-pushed the bugfix/ti-model-missing-dag-id branch from 4f8ea47 to 1ed5afe Compare April 11, 2026 11:25

jscheffl approved these changes Apr 11, 2026

View reviewed changes

jscheffl added the backport-to-v3-2-test Mark PR with this label to backport to v3-2-test branch label Apr 11, 2026

potiuk added this to the Airflow 3.2.1 milestone Apr 11, 2026

potiuk approved these changes Apr 11, 2026

View reviewed changes

jscheffl merged commit 4ecbd59 into apache:main Apr 11, 2026
80 checks passed

github-actions Bot pushed a commit that referenced this pull request Apr 11, 2026

[v3-2-test] Fix missing dag_id in get_task_instance (#64957) (#64968)

4d50511

(cherry picked from commit 4ecbd59) Co-authored-by: holmuk <20281580+holmuk@users.noreply.github.com>

jscheffl pushed a commit that referenced this pull request Apr 11, 2026

[v3-2-test] Fix missing dag_id in get_task_instance (#64957) (#64968) (…

57ceb42

…#65067) (cherry picked from commit 4ecbd59) Co-authored-by: holmuk <20281580+holmuk@users.noreply.github.com>

vatsrahul1001 pushed a commit that referenced this pull request Apr 15, 2026

[v3-2-test] Fix missing dag_id in get_task_instance (#64957) (#64968) (…

62524de

…#65067) (cherry picked from commit 4ecbd59) Co-authored-by: holmuk <20281580+holmuk@users.noreply.github.com>

vatsrahul1001 pushed a commit that referenced this pull request Apr 15, 2026

[v3-2-test] Fix missing dag_id in get_task_instance (#64957) (#64968) (…

bdb3bae

…#65067) (cherry picked from commit 4ecbd59) Co-authored-by: holmuk <20281580+holmuk@users.noreply.github.com>

vatsrahul1001 pushed a commit that referenced this pull request Apr 15, 2026

[v3-2-test] Fix missing dag_id in get_task_instance (#64957) (#64968) (…

adb01b9

…#65067) (cherry picked from commit 4ecbd59) Co-authored-by: holmuk <20281580+holmuk@users.noreply.github.com>

vatsrahul1001 pushed a commit that referenced this pull request Apr 15, 2026

[v3-2-test] Fix missing dag_id in get_task_instance (#64957) (#64968) (…

f2da6dc

…#65067) (cherry picked from commit 4ecbd59) Co-authored-by: holmuk <20281580+holmuk@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix missing dag_id in get_task_instance (#64957)#64968

Fix missing dag_id in get_task_instance (#64957)#64968
jscheffl merged 1 commit into
apache:mainfrom
holmuk:bugfix/ti-model-missing-dag-id

holmuk commented Apr 9, 2026 •

edited

Loading

Uh oh!

boring-cyborg Bot commented Apr 9, 2026

Uh oh!

Dev-iL left a comment •

edited

Loading

Uh oh!

Uh oh!

Vamsi-klu left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

jscheffl left a comment

Uh oh!

Uh oh!

boring-cyborg Bot commented Apr 11, 2026

Uh oh!

github-actions Bot commented Apr 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

	UniqueConstraint("dag_id", "task_id", "run_id", "map_index", name="task_instance_composite_key"),
	ForeignKeyConstraint(

Conversation

holmuk commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Was generative AI tooling used to co-author this PR?

Uh oh!

boring-cyborg Bot commented Apr 9, 2026

Uh oh!

Dev-iL left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Vamsi-klu left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

jscheffl left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

boring-cyborg Bot commented Apr 11, 2026

Uh oh!

github-actions Bot commented Apr 11, 2026

Backport successfully created: v3-2-test

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

holmuk commented Apr 9, 2026 •

edited

Loading

Dev-iL left a comment •

edited

Loading