@anshuksi282-ksolves
Contributor

Issue: #56306
FIX: Prevent AttributeError when updating SerializedDagModel for dynamic DAGs

This PR fixes a stability issue in SerializedDagModel.write_dag by adding a null check before updating an existing record for dynamic DAGs.

Context and Problem

In the section of write_dag dedicated to updating dynamic DAGs (i.e., when dag_version exists but has no task instances), the code retrieves the latest SerializedDagModel instance using cls.get(dag.dag_id, session=session).

If cls.get(dag.dag_id, session=session) returns None (for example, due to a race condition where the record was just deleted or not found in the current session), the subsequent code attempts to modify attributes:

latest_ser_dag._data = new_serialized_dag._data

This would raise an AttributeError because latest_ser_dag is None.

Solution

  • Added a null check to ensure that latest_ser_dag exists before updating its attributes.
  • This prevents AttributeError and improves the stability of DAG serialization for dynamic DAGs.
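
The failure mode and the guard can be sketched in isolation. This is a minimal illustration, not the Airflow code: SerializedRecord, get_latest and the dict-backed store are hypothetical stand-ins for SerializedDagModel, cls.get and the database session.

```python
from typing import Dict, Optional


class SerializedRecord:
    """Hypothetical stand-in for a SerializedDagModel row."""

    def __init__(self, data: dict) -> None:
        self._data = data


def get_latest(store: Dict[str, SerializedRecord], dag_id: str) -> Optional[SerializedRecord]:
    # Like cls.get(...), this returns None when no row exists for dag_id.
    return store.get(dag_id)


def update_unguarded(store: Dict[str, SerializedRecord], dag_id: str, new_data: dict) -> None:
    latest = get_latest(store, dag_id)
    # Raises AttributeError when latest is None.
    latest._data = new_data


def update_guarded(store: Dict[str, SerializedRecord], dag_id: str, new_data: dict) -> None:
    latest = get_latest(store, dag_id)
    # Null check: only touch the record if it was actually found.
    if latest is not None:
        latest._data = new_data


store = {"known_dag": SerializedRecord({"version": 1})}

update_guarded(store, "known_dag", {"version": 2})    # updates the record
update_guarded(store, "missing_dag", {"version": 2})  # silently skipped

try:
    update_unguarded(store, "missing_dag", {"version": 2})
    unguarded_error = None
except AttributeError as exc:
    unguarded_error = exc
```

With the guard in place the missing record is skipped, while the unguarded version fails on the attribute assignment.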


@boring-cyborg

boring-cyborg bot commented Oct 6, 2025

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

@anshuksi282-ksolves
Contributor Author

Hi, I’ve fixed a stability issue in SerializedDagModel.write_dag for dynamic DAGs (#56306).
Added a null check before updating an existing record to prevent AttributeError.
Tested locally, and dynamic DAG serialization now works reliably.
Appreciate your review and feedback. Thanks!

@anshuksi282-ksolves anshuksi282-ksolves force-pushed the fix-none-type-serialized-dag branch from 7178d7c to b5f07b5 Compare October 16, 2025 14:19
Contributor

@ephraimbuddy ephraimbuddy left a comment

Looks like a bad rebase. Please update the branch so it has only the query change.

@anshuksi282-ksolves anshuksi282-ksolves force-pushed the fix-none-type-serialized-dag branch 2 times, most recently from e55a5c9 to b5f07b5 Compare October 17, 2025 09:39
@anshuksi282-ksolves anshuksi282-ksolves force-pushed the fix-none-type-serialized-dag branch from b5f07b5 to 8ddadaf Compare October 17, 2025 09:48
@anshuksi282-ksolves
Contributor Author

Hi @ephraimbuddy,

Thanks for reviewing! Sorry about the earlier confusion — the bad rebase included extra changes by mistake.
I’ve now cleaned up the branch so it contains only the query change in SerializedDagModel.

Appreciate your patience and review!

@anshuksi282-ksolves
Contributor Author

anshuksi282-ksolves commented Oct 23, 2025

Hi @ephraimbuddy,

Thanks for the approval!

I see the CI tests are failing with the error: AttributeError: 'ScalarResult' object has no attribute 'limit'.
I also noticed the where clause in the query might be too strict (filtering by bundle_version and bundle_name), which could cause issues like the assert 6 == 4 failure I saw in other test logs.

I believe I can fix both issues at once. The plan is:

  • Fix the AttributeError by moving .limit(1) inside the select() and adding .first().
  • Fix the logic error by updating the where() clause to only filter by dag_id.

dag_version = session.scalars(
    select(DagVersion)
    .where(
        DagVersion.dag_id == dag.dag_id,
    )
    .options(joinedload(DagVersion.task_instances))
    .options(joinedload(DagVersion.serialized_dag))
    .order_by(DagVersion.created_at.desc())
    .limit(1)
).first()

Should I go ahead and push this combined fix?

Thanks!

@ephraimbuddy
Contributor

Maybe:

dag_version = session.scalar(
    select(DagVersion)
    .where(
        DagVersion.dag_id == dag.dag_id,
        DagVersion.bundle_name == bundle_name,
    )
    .options(joinedload(DagVersion.task_instances))
    .options(joinedload(DagVersion.serialized_dag))
    .order_by(DagVersion.created_at.desc())
    .limit(1)
)

In the above, I used scalar instead of scalars and removed first(), which isn't needed since the limit is 1. I also removed bundle_version from the query. I think this will work.

@anshuksi282-ksolves anshuksi282-ksolves force-pushed the fix-none-type-serialized-dag branch from 83ad489 to 98a1300 Compare October 24, 2025 11:52
@anshuksi282-ksolves
Contributor Author

Hi @ephraimbuddy, thanks! I've updated the code to use session.scalar() as you suggested.
Do I still need to add the if latest_ser_dag: null check to fix the original NoneType error, or is this query change sufficient on its own?

@ephraimbuddy
Contributor

Hi @ephraimbuddy, thanks! I've updated the code to use session.scalar() as you suggested. Do I still need to add the if latest_ser_dag: null check to fix the original NoneType error, or is this query change sufficient on its own?

You can do that. Then use a return statement when latest_ser_dag is null: return False in the null case and get rid of the assert check.
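
A rough sketch of that suggestion, using plain dicts rather than the real model. Both function names and the dict-based record are hypothetical and only illustrate the control flow of replacing an assert with an early return.

```python
from typing import Optional


def write_with_assert(latest: Optional[dict], new_data: dict) -> bool:
    # A missing record surfaces as AssertionError. Assert statements are
    # also skipped entirely when Python runs with the -O flag.
    assert latest is not None
    latest["data"] = new_data
    return True


def write_with_early_return(latest: Optional[dict], new_data: dict) -> bool:
    # A missing record is reported to the caller as "nothing written".
    if latest is None:
        return False
    latest["data"] = new_data
    return True


record = {"data": {"version": 1}}
wrote_existing = write_with_early_return(record, {"version": 2})
wrote_missing = write_with_early_return(None, {"version": 2})
```

The early return lets the caller treat a missing record as an ordinary "no write happened" outcome instead of an exception.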

@anshuksi282-ksolves anshuksi282-ksolves force-pushed the fix-none-type-serialized-dag branch from 98a1300 to 1d729ca Compare October 24, 2025 13:34
@anshuksi282-ksolves
Contributor Author

Hi @ephraimbuddy, I've rebased and pushed the changes.

I used session.scalar() as you suggested, thanks! I noticed you included bundle_name in the query, but that was causing the assert 6 == 4 test to fail (it was creating new rows instead of updating).

So, I've kept only dag_id in the where clause, which I think will fix it.

I also added the if not latest_ser_dag: return False check for the original NoneType error, as you confirmed.

This should hopefully fix all the CI failures now. Thanks!

@ephraimbuddy ephraimbuddy merged commit eeb203e into apache:main Oct 27, 2025
60 checks passed
ephraimbuddy added a commit to astronomer/airflow that referenced this pull request Oct 30, 2025
It was wrong to load the serdag and not use it. The initial idea was
to use the serdag at line 437 but was omitted. Thinking about it now,
it will be faster to only load serdag when there's a TI associated
with the dag version
kaxil pushed a commit that referenced this pull request Oct 30, 2025
Copilot AI pushed a commit to jason810496/airflow that referenced this pull request Dec 5, 2025
@ephraimbuddy ephraimbuddy added this to the Airflow 3.1.7 milestone Jan 22, 2026
ephraimbuddy pushed a commit that referenced this pull request Jan 27, 2026
ephraimbuddy added a commit that referenced this pull request Jan 27, 2026

(cherry picked from commit e5a88cc)
@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Jan 28, 2026