-
Notifications
You must be signed in to change notification settings - Fork 16.4k
fix: Re-add configuring orm for OpenLineage's listener on scheduler #48049
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| settings.configure_orm() | ||
| # This initializer is used only on the scheduler | ||
| # We can configure_orm regardless of the Airflow version, as DB access is always allowed from scheduler. | ||
| settings.configure_orm() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashb Wasn't this causing workers to fail too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the following error in @ashb 's PR:
/usr/local/lib/python3.12/site-packages/airflow/providers/openlineage/plugins/listener.py:244 in on_task_instance_success
/usr/local/lib/python3.12/site-packages/airflow/providers/openlineage/plugins/listener.py:326 in _on_task_instance_success
/usr/local/lib/python3.12/site-packages/airflow/providers/openlineage/plugins/listener.py:452 in _execute
/usr/local/lib/python3.12/site-packages/airflow/providers/openlineage/plugins/listener.py:484 in _fork_execute
/usr/local/lib/python3.12/site-packages/airflow/settings.py:363 in configure_orm
<string>:2 in create_engine
/usr/local/lib/python3.12/site-packages/sqlalchemy/util/deprecations.py:375 in warned
/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/create.py:514 in create_engine
/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/url.py:738 in make_url
/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/url.py:799 in _parse_url
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aah there were two instances of settings.configure_orm(), ok 👍
in _fork_execute & _executor_initializer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly, the DAG level events follow different path - now even more with task sdk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kacpermuda maybe splitting those logically into two listeners now would be a good idea? Those won't share as much logic now. Maybe after Airflow 2 goes out of provider scope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kacpermuda maybe splitting those logically into two listeners now would be a good idea? Those won't share as much logic now. Maybe after Airflow 2 goes out of provider scope?
Yeah, I like that idea
When DB access was disabled for OpenLineage running on workers in #47580, the configure_orm call was also removed from the methods running on the scheduler. This has been re-added, along with comments to clarify where it’s used. Lack of ORM configuration caused errors when OpenLineage accessed DB from scheduler and caused the scheduler to crash.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in newsfragments.