-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
removed scipy from build-only dependencies #3538
removed scipy from build-only dependencies #3538
Conversation
README.md
Outdated
This software depends on [NumPy], a Python package for | ||
scientific computing. You must have it installed prior to installing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it really have to be installed prior to installing gensim
having in mind that a proper numpy version is already specified in build-system.requires table? Meaning it will be downloaded automatically during gensim build phase.
If it does not --- what should we do with regard to the next paragraph about BLAS lib? I think it will not be customisable for any architecture that a numpy .whl build exists for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think the sentence starting "You must" is incorrect. Numpy will get installed automatically if it already isn't there.
With BLAS, I think we just leave that note as-is. You can't customize it as part of the gensim install. If you want to mess with BLAS, then you have to do it yourself, before installing numpy and gensim.
Right @piskvorky ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot how exactly it worked back then. IIRC numpy was needed for the installation of gensim: as in numpy actively used during install, down to its C API. Maybe scipy too. I believe that is the intent behind "You must have [numpy] installed prior to installing gensim".
If pip
installs numpy automatically before the gensim install itself starts, that's great. Some people don't use pip
, I don't know all the deployment methods people use (from source, conda, etc) and whether they all install dependencies first.
In any case that was 15 years ago, the current ecosystem might work differently, maybe none of this is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I see it, the problem of linking Numpy to any BLAS lib is a matter to worry about only during building Numpy from source.
This can occur only if:
-
one wants to customise his Numpy build in a non typical way. In this case I believe such person knows what he's doing so this note is not really valuable for him,
-
one installs gensim which in turn downloads appropriate NumPy build (but src distribution instead of a .whl). Then the pip (or I believe any package manager) would try building the NumPy from source and implicitly linking an existing BLAS lib.
What's more, a NumPy installed as a gensim's dependency will not be (in some cases) the same NumPy as used for building gensim. Pip by default builds a library in a separate temporary virtual env where it installs build-only dependencies.
I've redacted the paragraph. Please review whether it sounds sensible .
b170395
to
1dc75de
Compare
pyproject.toml
Outdated
# oldest supported Numpy for non-'arm64|aarch64' architectures is 1.17.* | ||
# as well as for 'Windows' on arm64. The oldest supported Numpy by Gensim | ||
# is 1.18.5. Remove below line when they increase the oldest supported Numpy for those platforms | ||
"numpy==1.18.5; python_version=='3.8' and platform_machine != 'aarch64' and (platform_machine != 'arm64' or platform_system == 'Windows')", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the motivation behind changing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the meta package oldest-supported-numpy I see that numpy<1.18.5
will be downloaded for python 3.8 and Windows on arm64 (apart from all non-arm64 and non-aarch64 platforms).
So I believe this check was incomplete resulting in installing unsupported version of Numpy for python3.8 arm64 Windows users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@piskvorky Do we want to deal with this now? It hasn't been reported yet, so it may not actually be a problem. It's been a while since I've messed with oldest-supported-numpy, so I'm not 100% sure this fix is needed and safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the reasoning.
Should I revert this change then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let's revert it. Feel free to make a separate PR with that change so we can review it later, after the current release is out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do as advised, thanks!
… Windows machines" This reverts commit 68197fb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work, thank you!
I see the builds failing on tests step (due to different scipy versions being installed for a given python version). My PR should not affect anything more than library building step (which passes successfully each time). |
Very strange. I see your previous commits built fine. I will investigate. |
@mpenkov and @filip-komarzyniec, I tried restricting scipy versions to <1.14 in setup.py, and this makes the CICD tests pass, see this run.
Edit: Well, ofcourse the difference for Python versions comes because scipy 1.14.0 supports only Python 3.10+, as stated in the release notes. Anyway, I wonder could this scipy version constraint <1.14 be applied or is there something more to it? (I took a look because our CICD pipeline is not working due to Gensim installation issues.) |
I think I see what's happening here:
|
…ove_scipy_from_build_only_deps' into filip-komarzyniec_3536_remove_scipy_from_build_only_deps
Previously I got the tests to pass by pinning to numpy<2.0 for Python 3.12 build-system, see this run: https://github.com/juhoinkinen/gensim/actions/runs/9953367669 (Ref #3541 (comment)). |
The problem with scipy is fixed, now we have to solve the problem with macos. https://github.blog/changelog/2024-05-20-actions-upcoming-changes-to-github-hosted-macos-runners/ |
Let's see if |
Alright, looks like we're green. Thanks everyone! |
Resolves: #3536
In addition to removing scipy from build-system.requires table I've also updated the README file.
Please refer to the comments for some additional doubts I have regarding more changes that might need to be introduced.