Skip to content

Prevent Triggerer from crashing when a trigger event isn't serializable#60152

Merged
dabla merged 31 commits intoapache:mainfrom
dabla:fix/trigger-crash-serialization-comms
Jan 22, 2026
Merged

Prevent Triggerer from crashing when a trigger event isn't serializable#60152
dabla merged 31 commits intoapache:mainfrom
dabla:fix/trigger-crash-serialization-comms

Conversation

@dabla
Copy link
Contributor

@dabla dabla commented Jan 6, 2026

This PR is related to the issue #60120 in which the trigger process crashes if a returned trigger event isn't serializable. This PR will catch the NotImpelmentedError and detect which event caused the crash and cancel the trigger associated to that event that caused the error.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

…mplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger
@dabla dabla marked this pull request as draft January 6, 2026 08:12
@dabla dabla marked this pull request as ready for review January 8, 2026 13:09
@potiuk
Copy link
Member

potiuk commented Jan 8, 2026

Looks good and seems it would be great to have it in 3.1.6 - but I would love someone with more triggerer expertise reviews it.

@potiuk potiuk added this to the Airflow 3.1.6 milestone Jan 8, 2026
…etry call are correctly asserted and bad events are being sanitized
@dabla
Copy link
Contributor Author

dabla commented Jan 19, 2026

@ash I refactored the code as discussed, first I will just try to send the events as is, if that fails, it will first sanitize the events and then retry to send it. I also adapted the test with asserts and verifications, now that scenario is fully covered.

@dabla dabla requested a review from ashb January 19, 2026 21:25
@dabla dabla merged commit 0238244 into apache:main Jan 22, 2026
71 checks passed
@github-actions
Copy link

Backport failed to create: v3-1-test. View the failure log Run details

Status Branch Result
v3-1-test Commit Link

You can attempt to backport this manually by running:

cherry_picker 0238244 v3-1-test

This should apply the commit to the v3-1-test branch and leave the commit in conflict state marking
the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

cherry_picker --continue

If you don't have cherry-picker installed, see the installation guide.

dabla added a commit that referenced this pull request Jan 22, 2026
…le (#60152)

* refactor: If asend in TriggerRunner comms decoder crashes due to NotImplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger

* refactor: Applied some reformatting

* refactor: Fixed some mypy issues

* refactor: Fixed return type of send_changes method

* refactor: Changed imports of trigger events

* refactor: Reformatted trigger job runner

* refactor: Fixed mocking of comms decoder

* refactor: Forgot to add the patched supervisor_builder

* refactor: Changed asserts in test_sync_state_to_supervisor

* refactor: Refactored how state changes are validated

* refactor: Validate events while creating the TriggerStateChanges message

* refactor: Refactored try/except in validate_state_changes to keep mypy happy

* Update airflow-core/src/airflow/jobs/triggerer_job_runner.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* refactor: Renamed validate_state_changes method to process_trigger_events

* refactor: Only sanitize invalid trigger events if first attempt fails

* refactor: Should check if msg.events is not None

* refactor: Adapted test_sync_state_to_supervisor so both initial and retry call are correctly asserted and bad events are being sanitized

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

(cherry picked from commit 0238244)
@dabla
Copy link
Contributor Author

dabla commented Jan 22, 2026

Backported to v3-1-test

@pierrejeambrun
Copy link
Member

pierrejeambrun commented Jan 23, 2026

@dabla how did you backport this? I think it's breaking the CI and I can't find the related PR, I'm surprised this could be merged on v3-1-test without static check succeeding.

I have opened a revert PR there #60976 to see if the CI is happy.

pierrejeambrun added a commit to astronomer/airflow that referenced this pull request Jan 23, 2026
pierrejeambrun added a commit that referenced this pull request Jan 23, 2026
@dabla
Copy link
Contributor Author

dabla commented Jan 23, 2026

@dabla how did you backport this? I think it's breaking the CI and I can't find the related PR, I'm surprised this could be merged on v3-1-test without static check succeeding.

I have opened a revert PR there #60976 to see if the CI is happy.

Salut @pierrejeambrun. Well I did as was described above, I checked out the v3-1-test, cherry-picked the commit from this PR, resolved merged conflicts, and pushed it. Will create branch from v3-1-test and create PR from there, thx for reverting it.

dabla added a commit that referenced this pull request Jan 23, 2026
…le (#60152)

* refactor: If asend in TriggerRunner comms decoder crashes due to NotImplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger

* refactor: Applied some reformatting

* refactor: Fixed some mypy issues

* refactor: Fixed return type of send_changes method

* refactor: Changed imports of trigger events

* refactor: Reformatted trigger job runner

* refactor: Fixed mocking of comms decoder

* refactor: Forgot to add the patched supervisor_builder

* refactor: Changed asserts in test_sync_state_to_supervisor

* refactor: Refactored how state changes are validated

* refactor: Validate events while creating the TriggerStateChanges message

* refactor: Refactored try/except in validate_state_changes to keep mypy happy

* Update airflow-core/src/airflow/jobs/triggerer_job_runner.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* refactor: Renamed validate_state_changes method to process_trigger_events

* refactor: Only sanitize invalid trigger events if first attempt fails

* refactor: Should check if msg.events is not None

* refactor: Adapted test_sync_state_to_supervisor so both initial and retry call are correctly asserted and bad events are being sanitized

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

(cherry picked from commit 0238244)
@dabla
Copy link
Contributor Author

dabla commented Jan 23, 2026

@pierrejeambrun I've create PR #60981 for the backport. Thx to @shahar1 for helping me out.

dabla added a commit that referenced this pull request Jan 24, 2026
…le (#60152)

* refactor: If asend in TriggerRunner comms decoder crashes due to NotImplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger

* refactor: Applied some reformatting

* refactor: Fixed some mypy issues

* refactor: Fixed return type of send_changes method

* refactor: Changed imports of trigger events

* refactor: Reformatted trigger job runner

* refactor: Fixed mocking of comms decoder

* refactor: Forgot to add the patched supervisor_builder

* refactor: Changed asserts in test_sync_state_to_supervisor

* refactor: Refactored how state changes are validated

* refactor: Validate events while creating the TriggerStateChanges message

* refactor: Refactored try/except in validate_state_changes to keep mypy happy

* Update airflow-core/src/airflow/jobs/triggerer_job_runner.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* refactor: Renamed validate_state_changes method to process_trigger_events

* refactor: Only sanitize invalid trigger events if first attempt fails

* refactor: Should check if msg.events is not None

* refactor: Adapted test_sync_state_to_supervisor so both initial and retry call are correctly asserted and bad events are being sanitized

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

(cherry picked from commit 0238244)
dabla added a commit that referenced this pull request Jan 24, 2026
…le (#60152)

* refactor: If asend in TriggerRunner comms decoder crashes due to NotImplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger

* refactor: Applied some reformatting

* refactor: Fixed some mypy issues

* refactor: Fixed return type of send_changes method

* refactor: Changed imports of trigger events

* refactor: Reformatted trigger job runner

* refactor: Fixed mocking of comms decoder

* refactor: Forgot to add the patched supervisor_builder

* refactor: Changed asserts in test_sync_state_to_supervisor

* refactor: Refactored how state changes are validated

* refactor: Validate events while creating the TriggerStateChanges message

* refactor: Refactored try/except in validate_state_changes to keep mypy happy

* Update airflow-core/src/airflow/jobs/triggerer_job_runner.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* refactor: Renamed validate_state_changes method to process_trigger_events

* refactor: Only sanitize invalid trigger events if first attempt fails

* refactor: Should check if msg.events is not None

* refactor: Adapted test_sync_state_to_supervisor so both initial and retry call are correctly asserted and bad events are being sanitized

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

(cherry picked from commit 0238244)
dabla added a commit that referenced this pull request Jan 24, 2026
…le (#60152) (#60981)

* refactor: If asend in TriggerRunner comms decoder crashes due to NotImplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger

* refactor: Applied some reformatting

* refactor: Fixed some mypy issues

* refactor: Fixed return type of send_changes method

* refactor: Changed imports of trigger events

* refactor: Reformatted trigger job runner

* refactor: Fixed mocking of comms decoder

* refactor: Forgot to add the patched supervisor_builder

* refactor: Changed asserts in test_sync_state_to_supervisor

* refactor: Refactored how state changes are validated

* refactor: Validate events while creating the TriggerStateChanges message

* refactor: Refactored try/except in validate_state_changes to keep mypy happy

* Update airflow-core/src/airflow/jobs/triggerer_job_runner.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* refactor: Renamed validate_state_changes method to process_trigger_events

* refactor: Only sanitize invalid trigger events if first attempt fails

* refactor: Should check if msg.events is not None

* refactor: Adapted test_sync_state_to_supervisor so both initial and retry call are correctly asserted and bad events are being sanitized

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

(cherry picked from commit 0238244)
@dabla
Copy link
Contributor Author

dabla commented Jan 24, 2026

PR Merged into v3-1-test.

suii2210 pushed a commit to suii2210/airflow that referenced this pull request Jan 26, 2026
…le (apache#60152)

* refactor: If asend in TriggerRunner comms decoder crashes due to NotImplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger

* refactor: Applied some reformatting

* refactor: Fixed some mypy issues

* refactor: Fixed return type of send_changes method

* refactor: Changed imports of trigger events

* refactor: Reformatted trigger job runner

* refactor: Fixed mocking of comms decoder

* refactor: Forgot to add the patched supervisor_builder

* refactor: Changed asserts in test_sync_state_to_supervisor

* refactor: Refactored how state changes are validated

* refactor: Validate events while creating the TriggerStateChanges message

* refactor: Refactored try/except in validate_state_changes to keep mypy happy

* Update airflow-core/src/airflow/jobs/triggerer_job_runner.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* refactor: Renamed validate_state_changes method to process_trigger_events

* refactor: Only sanitize invalid trigger events if first attempt fails

* refactor: Should check if msg.events is not None

* refactor: Adapted test_sync_state_to_supervisor so both initial and retry call are correctly asserted and bad events are being sanitized

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
@ephraimbuddy ephraimbuddy added type:bug-fix Changelog: Bug Fixes changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) and removed type:bug-fix Changelog: Bug Fixes labels Jan 27, 2026
shreyas-dev pushed a commit to shreyas-dev/airflow that referenced this pull request Jan 29, 2026
…le (apache#60152)

* refactor: If asend in TriggerRunner comms decoder crashes due to NotImplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger

* refactor: Applied some reformatting

* refactor: Fixed some mypy issues

* refactor: Fixed return type of send_changes method

* refactor: Changed imports of trigger events

* refactor: Reformatted trigger job runner

* refactor: Fixed mocking of comms decoder

* refactor: Forgot to add the patched supervisor_builder

* refactor: Changed asserts in test_sync_state_to_supervisor

* refactor: Refactored how state changes are validated

* refactor: Validate events while creating the TriggerStateChanges message

* refactor: Refactored try/except in validate_state_changes to keep mypy happy

* Update airflow-core/src/airflow/jobs/triggerer_job_runner.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* refactor: Renamed validate_state_changes method to process_trigger_events

* refactor: Only sanitize invalid trigger events if first attempt fails

* refactor: Should check if msg.events is not None

* refactor: Adapted test_sync_state_to_supervisor so both initial and retry call are correctly asserted and bad events are being sanitized

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
jhgoebbert pushed a commit to jhgoebbert/airflow_Owen-CH-Leung that referenced this pull request Feb 8, 2026
…le (apache#60152)

* refactor: If asend in TriggerRunner comms decoder crashes due to NotImplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger

* refactor: Applied some reformatting

* refactor: Fixed some mypy issues

* refactor: Fixed return type of send_changes method

* refactor: Changed imports of trigger events

* refactor: Reformatted trigger job runner

* refactor: Fixed mocking of comms decoder

* refactor: Forgot to add the patched supervisor_builder

* refactor: Changed asserts in test_sync_state_to_supervisor

* refactor: Refactored how state changes are validated

* refactor: Validate events while creating the TriggerStateChanges message

* refactor: Refactored try/except in validate_state_changes to keep mypy happy

* Update airflow-core/src/airflow/jobs/triggerer_job_runner.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* refactor: Renamed validate_state_changes method to process_trigger_events

* refactor: Only sanitize invalid trigger events if first attempt fails

* refactor: Should check if msg.events is not None

* refactor: Adapted test_sync_state_to_supervisor so both initial and retry call are correctly asserted and bad events are being sanitized

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
choo121600 pushed a commit to choo121600/airflow that referenced this pull request Feb 22, 2026
…le (apache#60152)

* refactor: If asend in TriggerRunner comms decoder crashes due to NotImplementedError as a trigger event is not serializable, then retry without that event and cancel associated trigger

* refactor: Applied some reformatting

* refactor: Fixed some mypy issues

* refactor: Fixed return type of send_changes method

* refactor: Changed imports of trigger events

* refactor: Reformatted trigger job runner

* refactor: Fixed mocking of comms decoder

* refactor: Forgot to add the patched supervisor_builder

* refactor: Changed asserts in test_sync_state_to_supervisor

* refactor: Refactored how state changes are validated

* refactor: Validate events while creating the TriggerStateChanges message

* refactor: Refactored try/except in validate_state_changes to keep mypy happy

* Update airflow-core/src/airflow/jobs/triggerer_job_runner.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* refactor: Renamed validate_state_changes method to process_trigger_events

* refactor: Only sanitize invalid trigger events if first attempt fails

* refactor: Should check if msg.events is not None

* refactor: Adapted test_sync_state_to_supervisor so both initial and retry call are correctly asserted and bad events are being sanitized

---------

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Triggerer backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants