Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build wheels with oldest supported numpy #3467

Merged
merged 14 commits into from
May 21, 2023

Conversation

PrimozGodec
Copy link
Contributor

@PrimozGodec PrimozGodec commented Apr 25, 2023

It seems that after #3448, Gensim's wheels are built on the newest Numpy, causing problems with backward compatibility. In my case, for example, Gensim 4.3.1 doesn't work with Numpy 1.21.6. I think it would also fix issues reported here #3097

Use the odest-supported-numpy meta-package, which will cause the wheels will be built with the oldest Numpy supported for every Python/system combination.

Side effects and other changes:

  • pip's verbosity while the building is increased to see what versions of build requirements get installed (for example, Numpy). This change is unnecessary; it can be rolled back if you prefer).
  • I temporarily remove the building of Windows ARM64 wheels since it is not supported by the oldest-supported-numpy yet. I would wait with ARM64 until oldest-supporte-numpy adds support for it. I do not see it as really important to support ARM64 since also Numpy itself does not have wheels for ARM64 for Windows. If anyone with Win ARM64 will use it, Gensim will need to build Numpy anyway. If we would like to support ARM64, we would need to write ugly conditions for Python versions for this case. Comment what you think.
  • This PR also introduces pyproject.toml. This file contains build system requirements and information, which are used by pip to build the package. There are currently placed requirements to build a Gensim package, which Cibuildwheel uses primarily to use the right version of Numpy. It is a standard procedure, and most Python packages use this file (including Numpy, SciPy, ...).

@piskvorky
Copy link
Owner

I'm pretty sure we were using oldest-supported-numpy, I remember there was a lot of pain associated with getting numpy to "work". So this must be some regression.

A fix will be most welcome!

@PrimozGodec PrimozGodec force-pushed the use-oldest-numpy branch 3 times, most recently from c8ad54f to a9d828e Compare April 26, 2023 07:07
@@ -40,12 +40,12 @@ jobs:
env:
CIBW_ARCHS_LINUX: x86_64 aarch64
CIBW_ARCHS_MACOS: x86_64 arm64
CIBW_ARCHS_WINDOWS: AMD64 x86 ARM64
CIBW_BEFORE_BUILD: pip install numpy scipy
CIBW_ARCHS_WINDOWS: AMD64 x86
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@piskvorky Sorry for some regression here. I would wait with ARM64 wheels since there are not many such systems, and also, Numpy, Scipy and other main libraries do not provide Windows arm64 wheels.

oldest-supported-numpy does not support that either. I would wait for them to start supporting arm64; otherwise, we must write a few ugly conditions for Numpy ourselves.

@PrimozGodec PrimozGodec marked this pull request as ready for review April 26, 2023 10:13
CIBW_SKIP: pp* cp36-* cp37-* *-win32 *_i686 *-musllinux_*
CIBW_TEST_COMMAND: pytest -rfxEXs --durations=20 --disable-warnings --showlocals --pyargs gensim
CIBW_TEST_REQUIRES: pytest testfixtures mock
CIBW_TEST_SKIP: cp38* cp39* cp310* *_aarch64 *_arm64 *_universal2
CIBW_BUILD_VERBOSITY: 3
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It provides some more verbosity (e.g. to see the NumPy package version towards the package was built).

@PrimozGodec
Copy link
Contributor Author

It is ready to merge if you agree with the changes. After the merge, I would ask for a minor release to fix packages on PyPI.

@mpenkov
Copy link
Collaborator

mpenkov commented Apr 26, 2023

To me it seems the title doesn't match the changes in the PR.

The title is "Use oldest-supported-numpy for build".
Looking at the changes, I see:

  • drop arm64 for Windows
  • increase build verbosity
  • new pyproject.toml file (what is this for?)

@PrimozGodec PrimozGodec changed the title [FIX] Use oldest-supported-numpy for build [FIX] Build wheels with oldest supported python Apr 26, 2023
@PrimozGodec
Copy link
Contributor Author

@mpenkov I changed the tile and the description above. The main purpose of the PR is to fix wheel building to use the oldest supported numpy for building. Currently, it uses the newest and it causes issues like this #3097. Maybe the title was not the most accurate but was not far.

Those three lines that you mention are more like side effects. Please take a look in the description on the top I added more information there.

@piskvorky
Copy link
Owner

piskvorky commented Apr 27, 2023

@mpenkov looks like the latest Gensim release is broken for many users (and the displayed build badge on github is red = failed too). WDYT – a quick call?

@mpenkov
Copy link
Collaborator

mpenkov commented Apr 27, 2023

Sure, let's have a chat during the usual evening JST timeslot.

@mpenkov
Copy link
Collaborator

mpenkov commented Apr 27, 2023

Those three lines that you mention are more like side effects. Please take a look in the description on the top I added more information there.

@PrimozGodec Not sure what's going on here. These three items are the only thing I'm seeing in this PR. I don't see any other changes. Is github playing tricks on me, or did you perhaps forget to push something?

@PrimozGodec
Copy link
Contributor Author

@PrimozGodec Not sure what's going on here. These three items are the only thing I'm seeing in this PR. I don't see any other changes. Is github playing tricks on me, or did you perhaps forget to push something?

@mpenkov No, everything is in here. So the important change is adding oldest-supported-numpy in build requirements (it was not here before). This package ensures that Gesnim wheels are now built with the oldest supported Numpy for each system/python combination.

So if you take a look at logs in wheel building GH action prior to this PR, you would notice that Numpy 1.24.3 is used, which is the newest, and it is causing problems (like those here #3097) for users with the older versions of Nympy in their environments.

And if you take a look at the logs of the wheel build GH actions in this PR, you will see that older (oldest supported by Numpy at each system/python version) versions of Numpy are used for wheel build, which solves problems for users with older Numpys in their environments.

You already had something like this implemented manually before and was regressed in #3448.

@piskvorky piskvorky changed the title [FIX] Build wheels with oldest supported python Build wheels with oldest supported numpy Apr 29, 2023
@piskvorky
Copy link
Owner

piskvorky commented Apr 29, 2023

It looks like the new pyproject.toml file duplicates some of the parts from setup.py – setting the numpy version, scipy version (missing in toml but present in setup.py), etc.

Any way to avoid this duplication?

This is a recipe for trouble when the two files get out of sync. @PrimozGodec do we need both?

Unless pyproject.toml is really needed, I'm -1 on complicating the build. What's wrong with keeping the necessary information in just one place = setup.py?

@PrimozGodec
Copy link
Contributor Author

@piskvorky I kind of missed that you have setup_requires in setup.py since some requirements were also specified in the workflow file (numpy and scipy). Here is the reason why Python decided to switch to a separate file for build requirements: https://peps.python.org/pep-0518/#rationale

I agree it is not good to have those requirements in two places, so we will need to decide where to store them. Please let me know what you prefer. I can implement both, except that having them in setup.py will require installing Numpy beforehand, as it was now in the build-wheel.yaml file (line 44, which I removed) since it is imported in setup.py. It also has some kind of duplication. Having pyproject.toml avoids it.

@piskvorky
Copy link
Owner

I have no preference between setup.py and pyproject.toml, as long as things are consistent and localized.

In fact, I'm slightly in favour of the "official recommended method" (toml), because it should be easier to maintain going forward.

My only worry is that a major refactoring will miss some subtle edge case (our setup.py is battle-scarred over the years) and lead to more trouble, more work, more frustrated users.

@PrimozGodec
Copy link
Contributor Author

PrimozGodec commented May 3, 2023

Now I moved all build requirements from setup.py to pyproject.toml. You can check what the changes are. I see two problems here:

  • After the change, Cython is installed every time for building (even if it is not used - need_cython() is False). I do not see it as a problem. It is not a big package and doesn't have dependencies.
  • You currently enable setting different versions of Cython through environment variables which is impossible through pyproject.toml. Suppose we want to have requirements in pyproject.toml I see an option that version of Cython is not fixed to a specific version, and then environment variable is not required since the building process could then choose different versions if the latest one does not support a specific platform (in case like this Allow overriding the Cython version requirement #3323). What is actually the reason for strictly limiting the Cython version? It would be better to omit versions that are not ok (with != or >. You have automated tests which would point out if there are any potential problems with a specific version. What do you think?

The positive thing about having requirements in pyproject.toml is that you simplify or remove CustomBuildExt since, as far as I understand, it is there since Numpy and Cython couldn't be imported before running setup.py with pyproject.tompl, they could be.

If you think the proposed changes are too much, I would rather not use pyproject.toml then.

@piskvorky
Copy link
Owner

piskvorky commented May 3, 2023

Thanks for investigating. IIRC requiring Cython was a pain because of Windows. The build system there is weird and difficult.

Unless something changed in the Python ecosystem since I last looked, requiring Cython (which we don't need at all) is a no-go. What do you mean "for building" – just internal CI builds, or when users build/install Gensim from source with pip?

@PrimozGodec
Copy link
Contributor Author

PrimozGodec commented May 4, 2023

Unless something has changed in the Python ecosystem since I last looked, requiring Cython (which we don't need at all) is a no-go.

I do not propose removing the Cython; just considering the option of not having it fixed to a specific version. Is the version fixed because of problems you had with Windows? If we want to keep the fixed Cython version and the option to set it via environment variable, I suggest keeping everything in setup.py and waiting with pyproject.toml for some later time. Please let me know your thoughts, and I can change this PR accordingly.

What do you mean "for building" – just internal CI builds, or when users build/install Gensim from source with pip?

For both pip and CI. Actually, pip install uses the same building process that is used in CI for building wheels.

@piskvorky
Copy link
Owner

piskvorky commented May 4, 2023

I checked setup.py's git history, and the option to override the Cython version was recently added by @pabs3 in #3323.

You can read about his motivation there; I don't personally use this override so it's best to talk directly to Paul if needed.

@pabs3 what's your experience with Debian vs pyproject.toml?

I do not propose removing the Cython

You propose adding it, not removing it. If we don't need Cython at all why should we force users to install it? Looks like a regression to me: Gensim required Cython for a while, but then we managed to get rid of the dependency (IIRC – @mpenkov please correct me if I'm wrong).

@pabs3
Copy link
Contributor

pabs3 commented May 4, 2023 via email

@PrimozGodec
Copy link
Contributor Author

PrimozGodec commented May 11, 2023

@mpenkov, it is a good idea to add tests for wheels. Anyway, I just checked the failing test, and I am curious to know how it is possible that it failed. On Windows/Python 3.10 oldest-supported-numpy use Numpy 1.21.6. Which is installed here; also, when I checked the building GH Actions same version was used for building (1.21.6) of Windows python3.10 wheel.

The error RuntimeError: module compiled against API version 0xf but this version of numpy is 0xe is saying that it is installed on Numpy=1.21.* (e), but that wheel was built with numpy=1.22.* (f).

@mpenkov
Copy link
Collaborator

mpenkov commented May 11, 2023

Yeah, I was wondering about the same thing. Still haven't come up with an explanation. Let me know if you think of something.

@mpenkov mpenkov added this to the 4.3.2 milestone May 14, 2023
@mpenkov
Copy link
Collaborator

mpenkov commented May 14, 2023

@nikaro Are you able to look at this? It looks like this problem got introduced alongside cibuildwheel in #3448.

I'm still struggling to understand how this could be happening. For Python 3.10 on Windows:

  1. We build with oldest-supported-numpy, which is 1.21.6 (can confirm from the log)
  2. We test with oldest-supported-numpy, which again is 1.21.6
  3. The test fails (WTF?)

I'm starting to suspect the problem may be outside of gensim, either in oldest-supported-numpy or even numpy itself.

@mpenkov
Copy link
Collaborator

mpenkov commented May 19, 2023

@piskvorky I don't see us making any quick progress on this last remaining wheel (Python 3.10 on Windows). I think we should discuss contingencies:

  1. Release without the wheel
  2. Release a broken wheel and add a note in the release notes
  3. Anything else?

The main problem is that the broken wheel is for a platform that none of the developers involved appear to use (I personally apply effort to avoid Windows whenever possible).

@piskvorky
Copy link
Owner

piskvorky commented May 19, 2023

I'd go with "Release without win-py3.10 wheel". Or perhaps "Release without that wheel and add a note in the release notes".

It is a curious mystery for sure.

This reverts commit 69c33bf.
This reverts commit a6cdaa0.
This reverts commit eef4177.
This reverts commit 8f63e7b.
@mpenkov mpenkov merged commit eb98bf3 into piskvorky:develop May 21, 2023
@mpenkov
Copy link
Collaborator

mpenkov commented May 21, 2023

Merged. Thank you @PrimozGodec !

@PrimozGodec PrimozGodec deleted the use-oldest-numpy branch May 22, 2023 06:21
@PrimozGodec
Copy link
Contributor Author

@mpenkov and @piskvorky, can I request a new release at PyPI? Wheels in the current release are still broken (they do not work with older versions of Numpy).

@piskvorky
Copy link
Owner

@mpenkov a new release sounds reasonable… let me know when you have a ~1h window. We can release over a call, also catch up.

@PrimozGodec
Copy link
Contributor Author

@piskvorky thank you for a fast response. :)

@PrimozGodec
Copy link
Contributor Author

@piskvorky Sorry for asking again, but are there any plans for release? Is anything preventing the new release (do you need any help)?
The release would be welcome, mainly because of the fixes from this PR.

@piskvorky
Copy link
Owner

@PrimozGodec yes, we're on it. We went over the necessary steps with @mpenkov , I believe the release is imminent.

@mpenkov
Copy link
Collaborator

mpenkov commented Aug 12, 2023

@PrimozGodec Previously, the tests for the Windows Py3.10 wheels were failing. Now, the Ubuntu wheels are also failing to build. If you can have a look, that would be helpful.

https://github.com/RaRe-Technologies/gensim/actions/runs/5803283551

@PrimozGodec
Copy link
Contributor Author

PrimozGodec commented Aug 14, 2023

Hmm, it looks strange. There are two builds

I also tried to run the test locally and cannot reproduce it.

@PrimozGodec
Copy link
Contributor Author

@mpenkov It looks that those tests are failing a bit randomly (sometimes fail, sometimes not). It looks like a problem on Ubuntu. Do you have any Linux machines to test it out (I don't have one)?

Another thing that I noticed is that tests (regular CI tests) run much slower now. Before, they ran less than 10 minutes on Ubuntu, and now it takes more than 30min (they stop at 30% after 30 minutes).

@mpenkov
Copy link
Collaborator

mpenkov commented Aug 21, 2023

Another thing that I noticed is that tests (regular CI tests) run much slower now. Before, they ran less than 10 minutes on Ubuntu, and now it takes more than 30min (they stop at 30% after 30 minutes).

This is concerning. It sounds like a regression.

@mpenkov
Copy link
Collaborator

mpenkov commented Aug 22, 2023

I cannot reproduce the slowdown locally, when building from source. The tests all pass within 5-6 minutes.

However, if I download one of the built wheels (x86_64 linux), install it, and run the tests, then I do get a significant slowdown running the tests. So:

  1. The slowdown affects the built wheels
  2. We need to identify the cause and resolve it before performing a release

If anyone is able to help with 2), you are most welcome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants