Skip to content

Conversation

@benitezfede
Copy link

@benitezfede benitezfede commented Jul 22, 2025

closes: #53618
related: #52824

Description

  • This PR fixes a bug for Airflow version >= 3 where DAG callbacks (on_success_callback and on_failure_callback) were missing the dag_run object in their context, making it impossible for users to access important DAG run information in their callback functions.
  • Airflow 2 has the dag_run in the DAG callbacks context which means this is fixing a breaking change.

Problem

Currently, when DAG callbacks are executed, the context only contains:

  • dag: The DAG object
  • run_id: The run ID as a string
  • reason: The reason for the callback

This is inconsistent with task callbacks and with previous versions of Airflow, which have access to the full dag_run object.

Solution

The solution addresses the Airflow 3.0 constraint that disallows direct ORM access in subprocesses by serializing the dag_run data and passing it through the callback request:

  1. Added dag_run field to DagCallbackRequest: The request now accepts serialized DagRun data as a dictionary
  2. Created DagRun.serialize_for_callback() method: Converts DagRun objects to dictionaries with all necessary fields
  3. Updated callback creation points: Modified dagrun.py and scheduler_job_runner.py to use the new serialization method
  4. Enhanced _execute_dag_callbacks: Now includes dag_run in the context when provided
  5. Updated type annotations: Context type now supports both DagRunProtocol and dictionary types

Changes Made

  • DagCallbackRequest: Added optional dag_run field to store serialized DagRun data
  • DagRun.serialize_for_callback(): New method that converts DagRun objects to dictionaries with ISO-formatted datetime strings
  • dagrun.py: Updated handle_dag_callback to use serialize_for_callback() when creating callback requests
  • scheduler_job_runner.py: Updated to include serialized dag_run data in callback requests
  • processor.py: Modified _execute_dag_callbacks to include dag_run in context when provided
  • Type annotations: Updated Context type to support both DagRunProtocol and dictionary types
  • Comprehensive tests: Added tests for serialization, callback execution, and edge cases

Technical Implementation

The solution avoids the Airflow 3.0 ORM restriction by:

  • Serializing dag_run data at the source (when callbacks are created)
  • Passing the serialized data through the DagCallbackRequest
  • Reconstructing the context in the callback execution without database access

Testing

  • Added comprehensive tests in test_callback_requests.py for the new dag_run field
  • Added tests in test_dagrun.py for the serialize_for_callback() method
  • Added integration tests in test_processor.py for callback execution with dag_run data
  • Tests cover backward compatibility, serialization/deserialization, and edge cases
  • All tests pass and maintain backward compatibility

Breaking Changes

  • None. This is a backward-compatible enhancement that adds missing functionality.
  • Existing callbacks without dag_run data continue to work as before
  • The dag_run field is only added to the context when explicitly provided

Related Issues

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@boring-cyborg
Copy link

boring-cyborg bot commented Jul 22, 2025

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@benitezfede benitezfede force-pushed the fix/dag_run_missing_in_dag_callback_context branch from 33ed343 to a2f2c4d Compare July 22, 2025 22:19
@benitezfede benitezfede marked this pull request as ready for review July 22, 2025 22:34
@benitezfede benitezfede requested review from XD-DENG and ashb as code owners July 22, 2025 22:34
@eladkal eladkal added this to the Airflow 3.0.4 milestone Jul 23, 2025
@eladkal eladkal added type:bug-fix Changelog: Bug Fixes backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch labels Jul 23, 2025
"dag": dag,
"run_id": str(self.run_id),
"reason": reason,
"dag_run": self,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to fix it. As the docstring says, this function is only used in dag.test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check #53058 for similar pattern

This comment was marked as outdated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally I managed to get the dag_run information without querying from the database and pass it on until I can set it in the callback context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, although since we need more keys, I was working on it in parallel since it became a blocker for some. Check #53684

@benitezfede benitezfede force-pushed the fix/dag_run_missing_in_dag_callback_context branch from a2f2c4d to 6cf6912 Compare July 23, 2025 14:09
@benitezfede benitezfede requested a review from kaxil July 23, 2025 21:06
@benitezfede benitezfede force-pushed the fix/dag_run_missing_in_dag_callback_context branch from 78519d7 to 5cbf762 Compare July 23, 2025 21:07
kaxil added a commit to astronomer/airflow that referenced this pull request Jul 23, 2025
This ensures DAG callbacks receive the same rich context as task callbacks,
improving consistency and providing access to template variables and macros similar to Airflow 2.

This has been a blocker for few users similar to apache#53058

Fixes apache#52824
Fixes apache#51402
Closes apache#51949
Related to apache#53654
Related to apache#53618
Comment on lines +2020 to +2042
def serialize_for_callback(self) -> dict[str, Any]:
"""
Serialize DagRun object into a dictionary for callback requests.

This method creates a serialized representation of the DagRun that can be
safely passed to subprocesses without requiring database access.

:return: Dictionary containing serialized DagRun information
"""
return {
"dag_id": self.dag_id,
"run_id": self.run_id,
"state": self.state,
"logical_date": self.logical_date.isoformat() if self.logical_date else None,
"start_date": self.start_date.isoformat() if self.start_date else None,
"end_date": self.end_date.isoformat() if self.end_date else None,
"conf": self.conf,
"run_type": self.run_type,
"run_after": self.run_after.isoformat() if self.run_after else None,
"data_interval_start": self.data_interval_start.isoformat() if self.data_interval_start else None,
"data_interval_end": self.data_interval_end.isoformat() if self.data_interval_end else None,
}

Copy link
Member

@kaxil kaxil Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this, check #53684

That should take care of that and restore the Airflow 2 behavior

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean that the entire PR is no longer needed because it is being addressed in #53684 or just that part of the code you highlighted?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entire PR. And I do apologize though about the time spent here. I can certainly say this will be useful context for future contributions. I do really appreciate you taking the time and hope you will continue contributing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing this issue!

kaxil added a commit to astronomer/airflow that referenced this pull request Jul 23, 2025
This ensures DAG callbacks receive the same rich context as task callbacks,
improving consistency and providing access to template variables and macros similar to Airflow 2.

This has been a blocker for few users similar to apache#53058

Fixes apache#52824
Fixes apache#51402
Closes apache#51949
Related to apache#53654
Related to apache#53618
kaxil added a commit to astronomer/airflow that referenced this pull request Jul 23, 2025
This ensures DAG callbacks receive the same rich context as task callbacks,
improving consistency and providing access to template variables and macros similar to Airflow 2.

This has been a blocker for few users similar to apache#53058

Fixes apache#52824
Fixes apache#51402
Closes apache#51949
Related to apache#53654
Related to apache#53618
kaxil added a commit to astronomer/airflow that referenced this pull request Jul 24, 2025
This ensures DAG callbacks receive the same rich context as task callbacks,
improving consistency and providing access to template variables and macros similar to Airflow 2.

This has been a blocker for few users similar to apache#53058

Fixes apache#52824
Fixes apache#51402
Closes apache#51949
Related to apache#53654
Related to apache#53618
kaxil added a commit to astronomer/airflow that referenced this pull request Jul 24, 2025
This ensures DAG callbacks receive the same rich context as task callbacks,
improving consistency and providing access to template variables and macros similar to Airflow 2.

This has been a blocker for few users similar to apache#53058

Fixes apache#52824
Fixes apache#51402
Closes apache#51949
Related to apache#53654
Related to apache#53618
kaxil added a commit that referenced this pull request Jul 24, 2025
This ensures DAG callbacks receive the same rich context as task callbacks,
improving consistency and providing access to template variables and macros similar to Airflow 2.

This has been a blocker for few users similar to #53058

Fixes #52824
Fixes #51402
Closes #51949
Related to #53654
Related to #53618
@kaxil kaxil closed this in #53684 Jul 24, 2025
ferruzzi pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Aug 7, 2025
This ensures DAG callbacks receive the same rich context as task callbacks,
improving consistency and providing access to template variables and macros similar to Airflow 2.

This has been a blocker for few users similar to apache#53058

Fixes apache#52824
Fixes apache#51402
Closes apache#51949
Related to apache#53654
Related to apache#53618
kaxil added a commit that referenced this pull request Aug 9, 2025
This ensures DAG callbacks receive the same rich context as task callbacks,
improving consistency and providing access to template variables and macros similar to Airflow 2.

This has been a blocker for few users similar to #53058

Fixes #52824
Fixes #51402
Closes #51949
Related to #53654
Related to #53618

(cherry picked from commit ef80507)
fweilun pushed a commit to fweilun/airflow that referenced this pull request Aug 11, 2025
This ensures DAG callbacks receive the same rich context as task callbacks,
improving consistency and providing access to template variables and macros similar to Airflow 2.

This has been a blocker for few users similar to apache#53058

Fixes apache#52824
Fixes apache#51402
Closes apache#51949
Related to apache#53654
Related to apache#53618
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants