Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix all deprecations for SQLAlchemy 2.0 #28723

Open
12 of 27 tasks
potiuk opened this issue Jan 4, 2023 · 43 comments
Open
12 of 27 tasks

Fix all deprecations for SQLAlchemy 2.0 #28723

potiuk opened this issue Jan 4, 2023 · 43 comments
Assignees
Labels
good first issue kind:meta High-level information important to the community

Comments

@potiuk
Copy link
Member

potiuk commented Jan 4, 2023

Body

Airflow is currently not compatible with SQLAlchemy 2.0 which is about to be released. We need to make a deliberate effort
to support it.

Here are some info to aid in this effort:

  • Description of all removed featuers in SQLAlchemy 2.0: https://sqlalche.me/e/b8d9

  • The way to see all the deprecations in latest version of SQLAlchemy 1.4 is to set environment variable SQLALCHEMY_WARN_20=1

  • How to start it:

    • add SQLALCHEMY_WARN_20=1 to ci.yml at top level
    • look through warnings.txt in tests to investigate the warnings (there will also be errors in the provider imports)
    • when all of them are fixed - remove <2.0 limitation in sqlalchemy in setup.cfg
    • make sure CI is green

Local setup in PyCharm

If you use PyCharm for run tests, you might want to setup SQLALCHEMY_WARN_20=1 for all pytest runs by default

  1. Go to Run -> Edit Configurations... -> click on Edit configuration templates...
    image
  2. In open window select Python Tests -> pytest and put SQLALCHEMY_WARN_20=1 into the environment variable section
    image
  3. Click Apply, Click OK
  4. Thats it, every new pytest run configuration will have this option by default

Known non-compatible with SA 20

Reported in core

  • airflow/models/taskinstance.py:1869
    sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/models/taskinstance.py:1871
    sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/www/views.py:845
    sqlalchemy.exc.RemovedIn20Warning - The "columns" argument to Select.with_only_columns(), when referring to a sequence of items, is now passed as a series of positional elements, rather than as a list. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/www/views.py:3455
    sqlalchemy.exc.RemovedIn20Warning - The Row.keys() method is considered legacy as of the 1.x series of SQLAlchemy and will be removed in 2.0. Use the namedtuple standard accessor Row._fields, or for full mapping behavior use row._mapping.keys() (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/www/views.py:3455
    sqlalchemy.exc.RemovedIn20Warning - Retrieving row members using strings or other non-integers is deprecated; use row._mapping for a dictionary interface to the row (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/www/views.py:3293
    sqlalchemy.exc.RemovedIn20Warning - The Row.keys() method is considered legacy as of the 1.x series of SQLAlchemy and will be removed in 2.0. Use the namedtuple standard accessor Row._fields, or for full mapping behavior use row._mapping.keys() (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/www/views.py:3293
    sqlalchemy.exc.RemovedIn20Warning - Retrieving row members using strings or other non-integers is deprecated; use row._mapping for a dictionary interface to the row (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/jobs/scheduler_job_runner.py:1642
    sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/models/trigger.py:141
    sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/utils/db_cleanup.py:200
    sqlalchemy.exc.RemovedIn20Warning - The bind argument for schema methods that invoke SQL against an engine or connection will be required in SQLAlchemy 2.0. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/cli/commands/task_command.py:202
    sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • airflow/models/taskinstance.py:524
    sqlalchemy.exc.RemovedIn20Warning - Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0. Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

Reported in providers

Reported in tests

  • tests/providers/fab/auth_manager/api_endpoints/test_user_schema.py:63
    sqlalchemy.exc.RemovedIn20Warning - "User" object is being merged into a Session along the backref cascade path for relationship "Role.user"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/jobs/test_backfill_job.py:1912
    sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/models/test_dagrun.py:2090
    sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/models/test_taskinstance.py:1487
    sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/ti_deps/deps/test_trigger_rule_dep.py:140
    sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/utils/test_log_handlers.py:475
    sqlalchemy.exc.RemovedIn20Warning - "Trigger" object is being merged into a Session along the backref cascade path for relationship "TaskInstance.trigger"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/api_experimental/common/test_mark_tasks.py:137
    sqlalchemy.exc.RemovedIn20Warning - The eagerload construct is considered legacy as of the 1.x series of SQLAlchemy and will be removed in 2.0. Please use joinedload(). (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/api_connexion/endpoints/test_task_instance_endpoint.py:190
    sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/api_connexion/schemas/test_task_instance_schema.py:69
    sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/api_connexion/schemas/test_task_instance_schema.py:113
    sqlalchemy.exc.RemovedIn20Warning - "TaskInstance" object is being merged into a Session along the backref cascade path for relationship "DagRun.task_instances"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/serialization/test_pydantic_models.py:162
    sqlalchemy.exc.RemovedIn20Warning - "DagScheduleDatasetReference" object is being merged into a Session along the backref cascade path for relationship "DatasetModel.consuming_dags"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

  • tests/serialization/test_pydantic_models.py:163
    sqlalchemy.exc.RemovedIn20Warning - "TaskOutletDatasetReference" object is being merged into a Session along the backref cascade path for relationship "DatasetModel.producing_tasks"; in SQLAlchemy 2.0, this reverse cascade will not take place. Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
@potiuk potiuk added the kind:meta High-level information important to the community label Jan 4, 2023
@Taragolis
Copy link
Contributor

Just clarification. Our target support both 1.4.x and 2.0 or only 2.0?

@potiuk
Copy link
Member Author

potiuk commented Jan 4, 2023

Just clarification. Our target support both 1.4.x and 2.0 or only 2.0?

This is a good point, and I do not know the answer. I think it will entirely depend on:

a) when we will do it
b) how difficult it will be to keep compatibility
c) some indication of the adoption of 2.0 once it is out there

I would refrain from trying to figure the answer before the final 2.0 release is out there - and possibly we should even wait for at least 2.0.1 or so, generally we should scout for stability indications. And then we should attempt to migrate and see how difficult it is to keep it compatible with both.

If we will decide to support both, due to many differences and generally backwards incompatible release of sqlalchemy I would be only heppy with supporting both if we extend our test suite in CI and add another test matrix dimention - sqlalchemy_version (on top of backend + backend version, Python version) that we have now.

But we will also have to see if that is really needed, depending on how complex the fixing will be and how different the two versions are. If the differences are small, we can likely get-by without having the extra matrix dimension.

If we see that differences are big, I think going 2.0-only is a better approach (and it basically means we will have to wait at least 6-10 months after release depending on adoption IMHO).

There is absolutely no hurry with migration yet, 1.4 will be out there for a long while and it will be supported with bugfixes - so we do not have (and will not have for a while) particular need to migrate to 2.0.

@Taragolis
Copy link
Contributor

Migrate 2.0 to 1.4 potentially could have more effect rather than we have with other packages.
Some of dependency of providers also use SQLAlchemy and we need to wait until them migrate to 2.0

Current packages which explicit have SQLAlchemy as dependency from CI image.

root@6e6ad04bdb9b:/opt/airflow# pipdeptree --reverse --packages SQLAlchemy --python `which python`

sqlalchemy==1.4.46
  - alembic==1.9.1 [requires: SQLAlchemy>=1.3.0]
    - apache-airflow==2.6.0.dev0 [requires: alembic>=1.6.3,<2.0]
  - apache-airflow==2.6.0.dev0 [requires: sqlalchemy>=1.4,<2.0]
  - elasticsearch-dbapi==0.2.9 [requires: sqlalchemy]
  - eralchemy2==1.3.6 [requires: SQLAlchemy>=1.3.19]
  - Flask-AppBuilder==4.1.4 [requires: SQLAlchemy<1.5]
    - apache-airflow==2.6.0.dev0 [requires: flask-appbuilder==4.1.4]
  - Flask-SQLAlchemy==2.5.1 [requires: SQLAlchemy>=0.8.0]
    - Flask-AppBuilder==4.1.4 [requires: Flask-SQLAlchemy>=2.4,<3]
      - apache-airflow==2.6.0.dev0 [requires: flask-appbuilder==4.1.4]
  - marshmallow-sqlalchemy==0.26.1 [requires: SQLAlchemy>=1.2.0]
    - Flask-AppBuilder==4.1.4 [requires: marshmallow-sqlalchemy>=0.22.0,<0.27.0]
      - apache-airflow==2.6.0.dev0 [requires: flask-appbuilder==4.1.4]
  - snowflake-sqlalchemy==1.4.4 [requires: sqlalchemy>=1.4.0,<2.0.0]
  - sqlalchemy-bigquery==1.5.0 [requires: sqlalchemy>=1.2.0,<2.0.0dev]
  - sqlalchemy-drill==1.1.2 [requires: sqlalchemy]
  - SQLAlchemy-JSONField==1.0.1.post0 [requires: sqlalchemy]
    - apache-airflow==2.6.0.dev0 [requires: sqlalchemy-jsonfield>=1.0]
  - sqlalchemy-redshift==0.8.12 [requires: SQLAlchemy>=0.9.2,<2.0.0]
  - SQLAlchemy-Utils==0.39.0 [requires: SQLAlchemy>=1.3]
    - Flask-AppBuilder==4.1.4 [requires: sqlalchemy-utils>=0.32.21,<1]
      - apache-airflow==2.6.0.dev0 [requires: flask-appbuilder==4.1.4]

This is not complete list because some of packages could use sqlalchemy but do not have it in dependency, for example pandas use it in read_sql.

@uranusjr
Copy link
Member

uranusjr commented Jan 4, 2023

I think we should support both 1.4 and 2.0 for at least one minor release, preferrably much longer. Dependencies don’t tend to catch up very fast for this kind of migrations, and an Airflow installation generally has a lot of those.

@potiuk
Copy link
Member Author

potiuk commented Jan 17, 2023

Agree.

@kaxil
Copy link
Member

kaxil commented Jan 28, 2023

I think we should support both 1.4 and 2.0 for at least one minor release, preferrably much longer. Dependencies don’t tend to catch up very fast for this kind of migrations, and an Airflow installation generally has a lot of those.

Fully agreed, we need to support both

@infohash
Copy link
Contributor

infohash commented Mar 6, 2023

What can be the expected version that will bring SQLAlchemy 2.0 support. I am making a service that will use SQLAlchemy 2.0 APIs. I will also integrate Airflow with that service. Based on ETA, I can decide whether to use SQLAlchemy 2.0 or go ahead with SQLAlchemy 1.4 and not wait for the supported version of Airflow.

@uranusjr
Copy link
Member

uranusjr commented Mar 7, 2023

It can be the next version if you help out on it 🙂

@auvipy
Copy link
Contributor

auvipy commented Mar 11, 2023

I want to contribute here

@auvipy
Copy link
Contributor

auvipy commented Mar 11, 2023

SQLA 1.4 & 2.0 both can be supported at the same time.

@moiseenkov
Copy link
Contributor

moiseenkov commented Jul 25, 2023

Hi everyone,
Do we have some kind of board or issues list that are currently in progress? I see the work is going on, so I would like to help and fix some part that nobody else is working on at the moment?
@potiuk

@VladaZakharova
Copy link
Contributor

@kaxil @phanikumv @uranusjr @hussein-awala
Maybe you could help with coordination here? Do you have any plan for the process here?
Thanks!

@phanikumv
Copy link
Contributor

Hi @VladaZakharova - we are currently suspending this effort till 2.7 is released. We have already completed the changes, but will keep the PR in draft mode until then. The other pending change is to fix the tests so that they use the sqlalchemy 2.0 style (but again they need to wait till 2.7 is released)

cc @jedcunningham

@VladaZakharova
Copy link
Contributor

Hi Team! How is the progress going in this one? :)

@Taragolis
Copy link
Contributor

Potentially if you able to run Airflow without FAB, the airflow potentially could starts with sqlalchemy 2. I’ve analyze Airflow warnings recently and unable to find any sqlalchemy.exc.RemovedIn20Warnings from the Airflow. Only FAB

@Taragolis
Copy link
Contributor

Query API at that moment marked as legacy, and do not raise any warnings about future removal and there is become record about that this API will not removed from sqlalchemy

@moiseenkov
Copy link
Contributor

moiseenkov commented Apr 16, 2024

Excluding providers (FAB and openlineage) I managed to catch two warnings in the core:

airflow/www/views.py:845: [W0513(warning), ] The "columns" argument to Select.with_only_columns(), when referring to a sequence of items, is now passed as a series of positional elements, rather than as a list.  (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
airflow/models/taskinstance.py:1870: [W0513(warning), ] Using strings to indicate column or relationship paths in loader options is deprecated and will be removed in SQLAlchemy 2.0.  Please use the class-bound attribute directly. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

So, there are only these two places left, and we will have this problem resolved on the Airflow core side.
Is anybody working on this at the moment?

@Taragolis
Copy link
Contributor

I've add collect this warning during the CI step, however our warning collection system collect warnings only during run tests, and errors might happen during initial configurations, so this might be other incompatibilities, i would plan to extend our plugin and collect in the other steps but it required some time.

I have add all currently found warnings in Issue detail, so fill free to fix it

@dondaum
Copy link
Contributor

dondaum commented Apr 20, 2024

Hi. I would like to support. To clarify is it correct that all warnings can be found in this issue detail section from potiuk?

@veinkr
Copy link

veinkr commented May 5, 2024

Hi, @dondaum , I saw you said that all RemovedIn20Warning should be fixed in PR #39299 , does that mean we can upgrade the sqlalchemy to 2.0 version safely?

@Taragolis
Copy link
Contributor

Not really, see this comment #28723 (comment)

@veinkr
Copy link

veinkr commented May 5, 2024

@Taragolis , sorry to bother, I just checked the Flask-AppBuilder repo, it haven't updated since two months ago, and it seems that it blocked by dpgaspar/Flask-AppBuilder#2038 (comment) so that it can't update to flask-sqlalchemy >=3.
Seems that we can't upgrade to sqlalchemy to 2.0 in a short time, do we have other plan for it?
I can see that in previous PR, the pandas have been limited to pandas<2.2, which is a important library for data engineer, if we keep sqlalchemy<2.0, then we will behind too much things.

@Taragolis
Copy link
Contributor

The strategic, maybe even since Airflow 2.0, is get rid of FAB in Airflow at all.
It moved into the separate provider since 2.9, however all Authorisation still happen go through the FAB.

There is no ETA when it happen, more possible that in Airflow 3.0 but maybe sooner if someone could propose how we could get rid off FAB "Here and Now"

@hamidelmaazouz

This comment was marked as duplicate.

@NathanAP

This comment was marked as duplicate.

@uranusjr
Copy link
Member

Nothing can happen before we remove FAB from core.

If you want to help remove deprecations other from FAB, please simply create a pull request to do so.

If you have no intention to help out, please do not ask questions here. Kindly visit #39593 to follow progress on FAB removal and related tasks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue kind:meta High-level information important to the community
Projects
None yet
Development

Successfully merging a pull request may close this issue.