Skip to content

Conversation

@uranusjr
Copy link
Member

An XComArg's get_task_map_length() should only return an integer when the entire task has finished. However, before this patch, it may attempt to count a mapped upstream even when some (or all!) of its expanded tis are still unfinished, resulting its downstream to be expanded prematurely. This patch adds an additional check before we count upstream results to ensure all the upstreams are actually finished.

Fix #28973.

One thing I don’t like in this patch is it adds an additional SQL query for each resolution. I have a hunch the two queries can be combined, but I’m not sure how.

An XComArg's get_task_map_length() should only return an integer when
the *entire* task has finished. However, before this patch, it may
attempt to count a mapped upstream even when some (or all!) of its
expanded tis are still unfinished, resulting its downstream to be
expanded prematurely.

This patch adds an additional check before we count upstream results to
ensure all the upstreams are actually finished.
This needs a special workaround for a NULL quirk in SQL.
@uranusjr uranusjr force-pushed the investigate-28973-dynamic-run-too-early branch from 67a39d4 to 62313b4 Compare April 17, 2023 08:30
Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@potiuk potiuk merged commit 5f2628d into apache:main Apr 22, 2023
@potiuk potiuk added this to the Airflow 2.6.0 milestone Apr 22, 2023
@potiuk
Copy link
Member

potiuk commented Apr 22, 2023

Another potential candidate for 2.6.0 @ephraimbuddy @pierrejeambrun @jedcunningham . Marked it so.

@ephraimbuddy ephraimbuddy deleted the investigate-28973-dynamic-run-too-early branch April 23, 2023 15:26
@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Apr 23, 2023
ephraimbuddy pushed a commit that referenced this pull request Apr 23, 2023
* Fix Pydantic TI handling in XComArg.resolve()

* Count mapped upstreams only if all are finished

An XComArg's get_task_map_length() should only return an integer when
the *entire* task has finished. However, before this patch, it may
attempt to count a mapped upstream even when some (or all!) of its
expanded tis are still unfinished, resulting its downstream to be
expanded prematurely.

This patch adds an additional check before we count upstream results to
ensure all the upstreams are actually finished.

* Use SQL IN to find unfinished TI instead

This needs a special workaround for a NULL quirk in SQL.

(cherry picked from commit 5f2628d)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dynamic Task Mapping skips tasks before upstream has started

4 participants