Support Python 3.11 #30

Open
potiuk opened this issue Oct 26, 2022 · 24 comments

Comments

@potiuk

potiuk commented Oct 26, 2022

Currently when you try to install sasl on Python 3.11, the compilation fails with:

#56 513.0   × Running setup.py install for sasl did not run successfully.
#56 513.0   │ exit code: 1
#56 513.0   ╰─> [31 lines of output]
#56 513.0       running install
#56 513.0       /usr/local/lib/python3.11/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
#56 513.0         warnings.warn(
#56 513.0       running build
#56 513.0       running build_py
#56 513.0       creating build
#56 513.0       creating build/lib.linux-x86_64-cpython-311
#56 513.0       creating build/lib.linux-x86_64-cpython-311/sasl
#56 513.0       copying sasl/__init__.py -> build/lib.linux-x86_64-cpython-311/sasl
#56 513.0       running egg_info
#56 513.0       writing sasl.egg-info/PKG-INFO
#56 513.0       writing dependency_links to sasl.egg-info/dependency_links.txt
#56 513.0       writing requirements to sasl.egg-info/requires.txt
#56 513.0       writing top-level names to sasl.egg-info/top_level.txt
#56 513.0       reading manifest file 'sasl.egg-info/SOURCES.txt'
#56 513.0       reading manifest template 'MANIFEST.in'
#56 513.0       adding license file 'LICENSE.txt'
#56 513.0       writing manifest file 'sasl.egg-info/SOURCES.txt'
#56 513.0       copying sasl/saslwrapper.cpp -> build/lib.linux-x86_64-cpython-311/sasl
#56 513.0       copying sasl/saslwrapper.h -> build/lib.linux-x86_64-cpython-311/sasl
#56 513.0       copying sasl/saslwrapper.pyx -> build/lib.linux-x86_64-cpython-311/sasl
#56 513.0       running build_ext
#56 513.0       building 'sasl.saslwrapper' extension
#56 513.0       creating build/temp.linux-x86_64-cpython-311
#56 513.0       creating build/temp.linux-x86_64-cpython-311/sasl
#56 513.0       gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Isasl -I/usr/local/include/python3.11 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-cpython-311/sasl/saslwrapper.o
#56 513.0       sasl/saslwrapper.cpp:196:12: fatal error: longintrepr.h: No such file or directory
#56 513.0         196 |   #include "longintrepr.h"
#56 513.0             |            ^~~~~~~~~~~~~~~
#56 513.0       compilation terminated.
#56 513.0       error: command '/usr/bin/gcc' failed with exit code 1
#56 513.0       [end of output]
#56 513.0   
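
A note on the failure above: CPython 3.11 moved `longintrepr.h` into `Include/cpython/`, where it is meant to be reached through `Python.h` rather than included directly, so a source file that still includes it by name fails exactly like this. The following is a minimal sketch, not part of python-sasl, for checking a generated source for such direct includes; the function name `find_direct_longintrepr_includes` is hypothetical.

```python
import re

# Matches a direct include such as:  #include "longintrepr.h"
# It deliberately does not match "cpython/longintrepr.h".
DIRECT_INCLUDE = re.compile(r'^\s*#\s*include\s+[<"]longintrepr\.h[>"]')


def find_direct_longintrepr_includes(source_text):
    """Return the 1-based line numbers that include longintrepr.h directly.

    Hypothetical helper for illustration only.
    """
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if DIRECT_INCLUDE.match(line):
            hits.append(number)
    return hits


if __name__ == "__main__":
    # Path taken from the build log above; adjust as needed.
    with open("sasl/saslwrapper.cpp", encoding="utf-8") as handle:
        print(find_direct_longintrepr_includes(handle.read()))
```

If the check reports line numbers, the usual remedy (an assumption here, since `saslwrapper.cpp` appears to be generated from `saslwrapper.pyx`) is to regenerate the file with a Cython release that supports Python 3.11 instead of patching the generated code by hand.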

I know it is early (Python 3.11 was released just yesterday), but in Apache Airflow we are hoping for a much faster cycle of adding new Python releases, especially since Python 3.11 introduces huge performance improvements (25% is the average number claimed) thanks to a very focused effort to increase single-threaded Python performance (the specializing interpreter being the core of it, but also many other improvements) without actually changing any Python code.

I'd appreciate it if someone on the cloudera team attempted to fix it. Otherwise we might simply have to leave the hive provider out of the Python 3.11 compatible version of Airflow.

I just opened a PR in Apache Airflow yesterday and plan to keep it open until it gets green :). So far I have to exclude hive provider.

apache/airflow#27264

I think it would be fantastic if we, as the open source community, could migrate to new Python versions much faster.

Looking forward to cooperation on that one :)

potiuk added a commit to apache/airflow that referenced this issue Oct 26, 2022
Python 3.11 has been released as scheduled on October 25, 2022 and
this is the first attempt to see how far Airflow (mostly dependencies)
are from being ready to officially support 3.11.

So far we had to exclude the following dependencies:

- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: #27292
  and googleapis/python-bigquery#1386
- [ ] Databricks Provider:
  databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334
  and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider:
  snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30

We might decide to release Airflow in 3.11 with those providers
disabled in case they are lagging behind eventually, but for the
moment we want to work with all the projects in concert to be
able to release all providers (Google Provider requires quite
a lot of work and likely Google Team stepping up and community helping
with migration to latest Google cloud libraries)
potiuk added a commit to apache/airflow that referenced this issue Oct 27, 2022
potiuk added a commit to apache/airflow that referenced this issue Oct 27, 2022
potiuk added a commit to apache/airflow that referenced this issue Oct 31, 2022
@gughy8

gughy8 commented Nov 11, 2022

@potiuk: I opened PR #31 to add Python 3.11 support.

potiuk added a commit to apache/airflow that referenced this issue Nov 24, 2022
potiuk added a commit to potiuk/airflow that referenced this issue Jan 19, 2023
@MarechJ

MarechJ commented Jan 25, 2023

Bumping this up. I would review the MR myself, but unfortunately it's a bit beyond my depth.

@potiuk
Author

potiuk commented Mar 5, 2023

I hope someone from cloudera (@attilajeges?) might actually review and merge PR #31 and release python-sasl. Because of this problem, I am forced to remove the hive provider from our Python 3.11 releases.

@Hansz

Hansz commented Apr 6, 2023

Bump ... running into the same problem with "longintrepr.h" ... Would appreciate this fix.

@mdeshmu

mdeshmu commented May 12, 2023

Ping. Please look into PR #31.

@guo-steve

guo-steve commented May 14, 2023

I am installing localstack[runtime] and it fails because of this error. Is there a workaround?

@potiuk
Author

potiuk commented May 16, 2023

Just a bit of a warning here for the python-sasl maintainers. We are very close to having Python 3.11 support in Apache Airflow: we are just about to merge the Google Provider change upgrading ~20 client libraries, which was the biggest blocker, and Apache Beam released version 2.47.0 with Python 3.11 support. As already happened with the yandex provider, we are going to suspend the hive provider from our releases if python-sasl keeps holding us back.

We have a process for that, described at https://github.com/apache/airflow/blob/main/PROVIDERS.rst#suspending-releases-for-providers - you can also learn there what the consequences of being suspended are (in short: no new releases of the provider until the problem is resolved, and hive will be removed from the "airflow" extras in the next minor release of Airflow).

The first step of the process is to let the maintainers of the library that holds us back know, which is happening via this comment.

Apparently there is already a PR in your repo that adds 3.11 support and would prevent this. There is about a week before we attempt to merge the 3.11 change with the Apache Hive provider suspended, so if you would like to avoid the suspension, there is about a week to get out a python-sasl release that supports Python 3.11.

@potiuk
Author

potiuk commented May 16, 2023

Also announced in airflow devlist https://lists.apache.org/thread/0dcvjj0f6bnjg3mk4zn32stjbxtprb5j so in case you have something to add, comment etc - feel free.

@potiuk
Author

potiuk commented May 22, 2023

FYI: we just merged 3.11 support. The Apache Hive provider has been excluded from Python 3.11 until this issue is fixed.

@mdeshmu

mdeshmu commented May 23, 2023

@potiuk
Does Airflow Hive provider use PyHive?
FYI, I have raised a PR for PyHive to support another working library, pure-sasl.

@eladkal

eladkal commented May 23, 2023

Does Airflow Hive provider use PyHive?

Yes but Pyhive is discontinued
https://github.com/dropbox/PyHive#project-is-currently-unsupported

We will likely drop the hive integration eventually, as there is no other Python SDK for it.

pankajastro added a commit to astronomer/astronomer-providers that referenced this issue Jul 14, 2023
Run unit tests for Python3.11
exclude hive extra from it because of issue cloudera/python-sasl#30
@mdeshmu

mdeshmu commented Jul 14, 2023

@potiuk @eladkal

I made a couple of contributions to PyHive which were accepted and released in 0.7.1.dev0. You are requested to test with the dev version and report any bugs in PyHive github repository before 0.7.1 is released in a month or so.

  1. PyHive also supports pure-sasl via the additional extra 'pyhive[hive_pure_sasl]', which supports Python 3.11 in addition to previous Python versions. See "Use pure-sasl in python 3.11" (dropbox/PyHive#454).
  2. PyHive is now compatible with SQLAlchemy 2.0. See "Adding compatibility with SQLAlchemy 2.0" (dropbox/PyHive#457).
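
When testing the dev version, it can help to confirm which SASL backend is actually importable in the environment. This is a minimal sketch rather than anything PyHive ships: `available_sasl_backends` and the `SASL_BACKENDS` mapping are hypothetical names, and the only facts assumed are the import names `sasl` (python-sasl) and `puresasl` (pure-sasl).

```python
import importlib.util

# Distribution name -> top-level import name.
SASL_BACKENDS = {"sasl": "sasl", "pure-sasl": "puresasl"}


def available_sasl_backends(find_spec=importlib.util.find_spec):
    """Return the distribution names whose module can be located.

    `find_spec` is injectable so the check can be exercised without
    installing either library. Hypothetical helper for illustration.
    """
    return [
        dist
        for dist, module in SASL_BACKENDS.items()
        if find_spec(module) is not None
    ]


if __name__ == "__main__":
    print(available_sasl_backends())
```

On a Python 3.11 environment installed with 'pyhive[hive_pure_sasl]', one would expect only pure-sasl to be reported, since the compiled sasl package is the one that fails to build there.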

@potiuk
Author

potiuk commented Jul 14, 2023

Fantastic! Thanks! It will be great to have it released and to add Hive back as supported on 3.11.
I just created a draft PR to test it: apache/airflow#32607

If all goes right, this will test installability, the cross-dependencies of PyHive and pip resolvability on 3.11, and also run the unit tests for it, which should give us good confidence that it works.

Fingers crossed it will be green.

Re:

Does pure-sasl work with the whole range Py 3.8 - 3.11? If so, I would gladly switch to pure-sasl instead, since Airflow already uses pure-sasl. I will run another PR to check it afterwards. We had our dose of problems with sasl (see the comment below):

in case of Python 3.9 sasl library needs to be installed with version higher or equal than
0.3.1 because only that version supports Python 3.9. For other Python version pyhive[hive] pulls
the sasl library anyway (and there sasl library version is not relevant)

  • sasl>=0.3.1; python_version>="3.9"
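
The marker in the requirement above only adds `sasl>=0.3.1` on Python 3.9 and later. Below is a simplified sketch of that logic, not Airflow's actual setup code; `parse_version`, `marker_applies` and `hive_requirements` are hypothetical names, and real environment markers follow PEP 508, which this does not fully implement. It also shows why versions must be compared numerically: as strings, "3.10" would sort before "3.9".

```python
import sys


def parse_version(text):
    """Turn '3.11' into (3, 11) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def marker_applies(minimum, python_version=None):
    """Simplified stand-in for the marker python_version >= minimum."""
    if python_version is None:
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return parse_version(python_version) >= parse_version(minimum)


def hive_requirements(python_version=None):
    """Return the requirement strings that would apply for a given Python."""
    requirements = ["pyhive[hive]"]
    if marker_applies("3.9", python_version):
        # Only 0.3.1 and later support Python 3.9, per the comment above.
        requirements.append("sasl>=0.3.1")
    return requirements


if __name__ == "__main__":
    for version in ("3.8", "3.9", "3.10"):
        print(version, hive_requirements(version))
```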

That's good timing. We are completely rewriting all our queries for SQLAlchemy 2.0 too, so having Hive support it as well would remove a blocker for our migration.

@mdeshmu

mdeshmu commented Jul 14, 2023

@potiuk
Yes, pure-sasl officially supports Py 3.8 - 3.10.
I have personally tested with Py 3.9 - 3.11 and it works.

@potiuk
Author

potiuk commented Jul 14, 2023

Cool. Updated the PR with switching to pure-sasl as well. Let's see :)

@eladkal

eladkal commented Jul 14, 2023

@potiuk @eladkal

I made a couple of contributions to PyHive which were accepted and released in 0.7.1.dev0.

This is awesome!
And surprising, given that PyHive mentions in their repo that the project is discontinued
https://github.com/dropbox/PyHive#project-is-currently-unsupported

@potiuk
Author

potiuk commented Jul 14, 2023

Looks like we are close enough. We had some intermittent issues on apache/airflow#32607, but images for 3.11 were built and tests passed, so things look good from the Airflow side (I re-ran the failing jobs to verify).

@mdeshmu

mdeshmu commented Jul 15, 2023

Awesome! Thanks for testing @potiuk.

@potiuk
Author

potiuk commented Jul 15, 2023

Yep. it looks good.

The constraint job is failing because we have a dev dependency pinned with == there. Once the package is released and we switch it to a regular >=, it will start working.

@potiuk
Author

potiuk commented Aug 2, 2023

Awesome! Thanks for testing @potiuk.

Are there any expectations on 0.7.1 to be released :) ? (A month or so was 3 weeks ago so, just checking what to expect :) )

pankajastro added a commit to astronomer/astronomer-providers that referenced this issue Aug 2, 2023
* Run test for Python3.11

Run unit tests for Python3.11
exclude hive extra from it because of issue cloudera/python-sasl#30
@mdeshmu

mdeshmu commented Aug 6, 2023

Are there any expectations on 0.7.1 to be released :) ? (A month or so was 3 weeks ago so, just checking what to expect :) )

The PyHive repo maintainer with whom I am coordinating is on vacation. He will release 0.7.1 once he is back. It will take a couple more weeks.

@potiuk
Author

potiuk commented Aug 6, 2023

OK. Cool. Thanks. We are releasing a new version of the Hive provider in two days, so I was just checking :)

@mdeshmu

mdeshmu commented Aug 19, 2023

PyHive 0.7.0 is released which includes Python 3.11 and SQLAlchemy 2.0 support.

@potiuk
Author

potiuk commented Aug 19, 2023

Oh... Fantastic. Thank you for letting me know. I kept on checking from time to time but being notified is even cooler :)


7 participants