@nothingmin
Contributor

In Airflow 2, the data_interval was set to logical_date (execution_date) when triggering a DAG run.
In Airflow 3, the current data_interval is set to run_after, which causes issues when using data_interval_end.
This commit addresses this problem.

For reference, the code in Airflow 2 looked like this:
[Screenshot, 2025-08-05 4:51:12 PM]

@boring-cyborg

boring-cyborg bot commented Aug 5, 2025

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Aug 5, 2025
@nothingmin nothingmin changed the title fix: set data_interval to logical_date when triggering DAG run Fix set data_interval to logical_date when triggering DAG run Aug 6, 2025
Comment on lines 88 to 87
- data_interval = dag.timetable.infer_manual_data_interval(run_after=run_after)
+ data_interval = dag.timetable.infer_manual_data_interval(run_after=coerced_logical_date)
Member

This does not look right. The method takes a run_after, passing in something else is just wrong.

@ashb
Member

ashb commented Aug 7, 2025

What problems does this cause? Can you be more specific?

This might have been one of the purposefully breaking changes between v2 and v3, or it could be a bug. It's not clear at the moment.

@nothingmin
Contributor Author

nothingmin commented Aug 7, 2025

Thank you for your feedback!
@ashb @uranusjr

To clarify the issue:
We have a DAG that is scheduled to run every 10 minutes, and it uses the TriggerDagRunOperator to trigger another DAG. Previously (in Airflow 2), the triggered DAG run’s data_interval would be set based on the logical_date specified at trigger time (so the downstream DAG run would process the correct interval).

However, in Airflow 3, the data_interval of the triggered DAG run is instead set based on the time when the run is queued, not the intended logical date. As a result, the downstream DAG receives an unexpected data interval that does not match the desired scheduling semantics.
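
The mismatch described above can be illustrated without Airflow. The following is a minimal sketch, assuming a downstream DAG on a 10-minute schedule and a timetable that picks the last complete interval before the datetime it is given. `toy_manual_data_interval`, the dates, and the queue time are hypothetical stand-ins for illustration and are not Airflow's implementation.

```python
from datetime import datetime, timedelta, timezone

INTERVAL = timedelta(minutes=10)  # schedule of the hypothetical downstream DAG
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def toy_manual_data_interval(anchor: datetime) -> tuple[datetime, datetime]:
    """Return the last complete 10-minute interval ending at or before `anchor`.

    Toy stand-in for a timetable's manual-interval inference, not Airflow code.
    """
    # Floor the anchor to a 10-minute boundary; that boundary ends the interval.
    end = EPOCH + ((anchor - EPOCH) // INTERVAL) * INTERVAL
    return end - INTERVAL, end


# The upstream task asks for this logical date (assumed value)...
logical_date = datetime(2025, 8, 7, 12, 0, tzinfo=timezone.utc)
# ...but the triggered run is only queued some minutes later (assumed value).
queued_at = datetime(2025, 8, 7, 12, 14, 30, tzinfo=timezone.utc)

from_logical_date = toy_manual_data_interval(logical_date)
from_run_after = toy_manual_data_interval(queued_at)

print("anchored on logical_date:", from_logical_date)
print("anchored on run_after:   ", from_run_after)
```

In this toy model the two anchors land in different intervals whenever the run is queued at least one schedule period after the requested logical date, which is the kind of drift reported here.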

I believe this is a bug rather than an intentional breaking change, since triggering workflows for specific time intervals is a core use case, especially when chaining DAGs for partitioned data processing. This PR is intended to restore the correct behavior for triggered DAG runs.

I am attaching screenshots of the same DAG run’s details in both Airflow 2 and Airflow 3 for comparison.
[Screenshot, 2025-08-08 12:43:00 AM]
[Screenshot, 2025-08-08 12:44:28 AM]

Please let me know if you’d like more detailed examples or further clarification!

Member

@pierrejeambrun pierrejeambrun left a comment

The use case seems fair.

This was changed in #46512 and seems deliberate: the call was previously based on coerced_logical_date and was changed to run_after. Maybe we are missing something, @sunank200?

@uranusjr
Member

I believe this is intentional. The argument to the timetable method is run_after, and the standard logic to calculate a data interval from that datetime value finds the closest interval before it. Both characteristics indicate that run_after is a more reasonable input than the logical date.

Since you just want to have a date (the data interval starts and ends at the same datetime anyway in either case), why not just use the logical date value directly? It is the same across both major versions.
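
A minimal sketch of that suggestion, assuming the task only needs one datetime to select a partition. `partition_key` and `example_context` are hypothetical names; the dict stands in for the context Airflow passes to a task, of which only the `logical_date` key is used here.

```python
from datetime import datetime, timezone


def partition_key(context: dict) -> str:
    """Build a partition key from the run's logical date (hypothetical helper)."""
    logical_date = context.get("logical_date")
    if logical_date is None:
        # Fail loudly rather than silently falling back to another datetime.
        raise ValueError("this run has no logical_date")
    return logical_date.strftime("%Y-%m-%dT%H:%M")


# Stand-in for a task context; a real context holds many more keys.
example_context = {"logical_date": datetime(2025, 8, 7, 12, 0, tzinfo=timezone.utc)}
print(partition_key(example_context))
```

Reading the logical date directly avoids depending on how the data interval was inferred for a triggered run.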

@nothingmin
Contributor Author

I agree that writing DAGs to use logical_date, as you suggest, works, and I also understand your reasoning that run_after may be a more appropriate value in some cases.

However, the official Airflow documentation recommends using data_interval_start rather than logical_date.

More importantly, this change was introduced as a breaking change without prior notice, so many existing DAGs no longer work as expected. Given these factors, I would appreciate it if you could reconsider this behavior—perhaps by maintaining the previous approach or providing a clear way for users to choose the desired logic.

Thank you!

@potiuk potiuk added the affected_version:3.0 Issues Reported for 3.0 label Aug 11, 2025
@potiuk potiuk added this to the Airflow 3.0.5 milestone Aug 11, 2025
@potiuk
Member

potiuk commented Aug 11, 2025

Not deciding whether it's a valid change. I marked it for 3.0.5 so that we do not forget about it when we prepare that release.

@kaxil kaxil modified the milestones: Airflow 3.0.5, Airflow 3.0.6 Aug 13, 2025
@eladkal
Contributor

eladkal commented Aug 21, 2025

There are conflicts to resolve

@nothingmin
Contributor Author

@eladkal
resolved it

Member

@pierrejeambrun pierrejeambrun left a comment

Maybe a path forward would be to update the documentation instead, and keep the run_after param?

@nothingmin
Contributor Author

@pierrejeambrun @potiuk @uranusjr @ashb
I can also follow up with a documentation PR instead, if we agree this behavior is intended.

@kaxil kaxil modified the milestones: Airflow 3.0.7, Airflow 3.1.0 Sep 13, 2025
@kaxil
Member

kaxil commented Sep 15, 2025

@uranusjr @pierrejeambrun What's the next step on this one?

(Moving the milestone for now)

@kaxil kaxil modified the milestones: Airflow 3.1.0, Airflow 3.1.1 Sep 15, 2025
Member

@pierrejeambrun pierrejeambrun left a comment

As TP mentioned, I think this is expected and run_after is more reasonable.

I would be in favor of updating the documentation to reflect that, and maybe adding a significant release note entry for 3.0.0 to mention that this has changed.

@kaxil kaxil removed this from the Airflow 3.1.1 milestone Oct 21, 2025
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Dec 30, 2025
Member

@pierrejeambrun pierrejeambrun left a comment

@nothingmin do you want to follow up and convert this PR into a documentation one, and explain that this change was intended? I think it would be a great addition and help clear up confusion for others too.

@nothingmin
Contributor Author

@pierrejeambrun Sure, I'll follow up and convert this PR into a documentation one to clarify this intentional behavior change. I'll update it soon.

Labels

affected_version:3.0 Issues Reported for 3.0 area:API Airflow's REST/HTTP API stale Stale PRs per the .github/workflows/stale.yml policy file

7 participants