Skip to content

Conversation

@kacpermuda
Copy link
Contributor

@kacpermuda kacpermuda commented Mar 21, 2025

When DB access was disabled for OpenLineage running on workers in #47580, the configure_orm call was also removed from the methods running on the scheduler. This has been re-added, along with comments to clarify where it’s used. Lack of ORM configuration caused errors when OpenLineage accessed DB from scheduler and caused the scheduler to crash.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@mobuchowski mobuchowski merged commit 2b082e6 into apache:main Mar 21, 2025
62 checks passed
@kacpermuda kacpermuda deleted the fix-ol-listener-db-access branch March 21, 2025 11:37
settings.configure_orm()
# This initializer is used only on the scheduler
# We can configure_orm regardless of the Airflow version, as DB access is always allowed from scheduler.
settings.configure_orm()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashb Wasn't this causing workers to fail too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the following error in @ashb 's PR:

/usr/local/lib/python3.12/site-packages/airflow/providers/openlineage/plugins/listener.py:244 in on_task_instance_success
/usr/local/lib/python3.12/site-packages/airflow/providers/openlineage/plugins/listener.py:326 in _on_task_instance_success
/usr/local/lib/python3.12/site-packages/airflow/providers/openlineage/plugins/listener.py:452 in _execute
/usr/local/lib/python3.12/site-packages/airflow/providers/openlineage/plugins/listener.py:484 in _fork_execute
/usr/local/lib/python3.12/site-packages/airflow/settings.py:363 in configure_orm
<string>:2 in create_engine
/usr/local/lib/python3.12/site-packages/sqlalchemy/util/deprecations.py:375 in warned
/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/create.py:514 in create_engine
/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/url.py:738 in make_url
/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/url.py:799 in _parse_url

Copy link
Member

@kaxil kaxil Mar 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aah there were two instances of settings.configure_orm(), ok 👍

in _fork_execute & _executor_initializer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly, the DAG level events follow different path - now even more with task sdk

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kacpermuda maybe splitting those logically into two listeners now would be a good idea? Those won't share as much logic now. Maybe after Airflow 2 goes out of provider scope?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kacpermuda maybe splitting those logically into two listeners now would be a good idea? Those won't share as much logic now. Maybe after Airflow 2 goes out of provider scope?

Yeah, I like that idea

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants