removed scipy from build-only dependencies #3538

Merged
10 changes: 6 additions & 4 deletions README.md
@@ -49,8 +49,8 @@ on Wikipedia.
Installation
------------

-This software depends on [NumPy and Scipy], two Python packages for
-scientific computing. You must have them installed prior to installing
+This software depends on [NumPy], a Python package for
+scientific computing. You must have it installed prior to installing
Contributor Author

Does NumPy really have to be installed before installing gensim, given that a suitable NumPy version is already specified in the build-system.requires table? That means it will be downloaded automatically during the gensim build phase.

If it does not, what should we do about the next paragraph on the BLAS library? I think BLAS will not be customisable on any architecture for which a NumPy .whl build exists.

Collaborator

Yes, I think the sentence starting "You must" is incorrect. NumPy will get installed automatically if it isn't already there.

With BLAS, I think we just leave that note as-is. You can't customize it as part of the gensim install. If you want to mess with BLAS, then you have to do it yourself, before installing numpy and gensim.

Right, @piskvorky?
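
For readers who do want to see which BLAS their NumPy is linked against before touching anything, here is a minimal sketch. It assumes NumPy may or may not be present in the current environment. `numpy.show_config()` prints NumPy's build configuration, which includes BLAS/LAPACK details; the helper names `numpy_available` and `report_blas_config` are made up for illustration.

```python
import importlib.util


def numpy_available() -> bool:
    """Return True if NumPy can be imported in this environment."""
    return importlib.util.find_spec("numpy") is not None


def report_blas_config() -> bool:
    """Print NumPy's build configuration if NumPy is installed.

    Returns True when the configuration was printed, False otherwise.
    """
    if not numpy_available():
        print("NumPy is not installed in this environment.")
        return False

    import numpy  # imported lazily so the sketch runs without NumPy too

    # Prints build details, including the BLAS/LAPACK libraries in use.
    numpy.show_config()
    return True


if __name__ == "__main__":
    report_blas_config()
```

The output format differs between NumPy versions, so this is meant for reading by eye rather than for parsing.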

Owner

@piskvorky, Jun 15, 2024

I forgot how exactly it worked back then. IIRC numpy was needed for the installation of gensim: numpy was actively used during install, down to its C API. Maybe scipy too. I believe that is the intent behind "You must have [numpy] installed prior to installing gensim".

If pip installs numpy automatically before the gensim install itself starts, that's great. But some people don't use pip, and I don't know all the deployment methods people use (from source, conda, etc.) or whether they all install dependencies first.

In any case that was 15 years ago, the current ecosystem might work differently, maybe none of this is needed.

Contributor Author

As I see it, linking NumPy against a BLAS library only matters when NumPy is built from source.
That can happen only if:

  • someone wants to customise their NumPy build in a non-standard way. In that case I believe they know what they are doing, so this note is of little value to them,

  • someone installs gensim, which in turn downloads a suitable NumPy release as a source distribution instead of a .whl. Pip (or, I believe, any package manager) would then try to build NumPy from source and implicitly link it against an existing BLAS library.

What's more, the NumPy installed as a gensim dependency will not always be the same NumPy that was used to build gensim. By default, pip builds a package in a separate, temporary isolated environment into which it installs the build-only dependencies.

I've reworded the paragraph. Please check whether it reads sensibly.
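
A small sketch of the distinction drawn above, offered as an illustration rather than as part of the PR: build-only dependencies come from the `requires` list in pyproject.toml and are installed into pip's temporary build environment, while runtime dependencies are recorded in the installed package's metadata. The helper name `runtime_requirements` is hypothetical; it uses only the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib import metadata
from typing import List, Optional


def runtime_requirements(dist_name: str) -> Optional[List[str]]:
    """Return the runtime requirement strings of an installed distribution.

    Returns None when the distribution is not installed, and an empty
    list when it is installed but declares no requirements.
    """
    try:
        reqs = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return None
    return list(reqs or [])


if __name__ == "__main__":
    # These are what ends up in the user's environment. The build-only
    # list in pyproject.toml is separate and is not stored here.
    print(runtime_requirements("gensim"))
```

Comparing this list with the `requires` table in pyproject.toml shows which packages are needed only while building.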

gensim.

It is also recommended you install a fast BLAS library before installing
@@ -69,7 +69,9 @@ Or, if you have instead downloaded and unzipped the [source tar.gz]
package:

```bash
-python setup.py install
+tar -xvzf gensim-X.X.X.tar.gz
+cd gensim-X.X.X/
+pip install .
```

For alternative modes of installation, see the [documentation].
@@ -172,7 +174,7 @@ BibTeX entry:
[documentation and Jupyter Notebook tutorials]: https://github.com/RaRe-Technologies/gensim/#documentation
[Vector Space Model]: https://en.wikipedia.org/wiki/Vector_space_model
[unsupervised document analysis]: https://en.wikipedia.org/wiki/Latent_semantic_indexing
-[NumPy and Scipy]: https://scipy.org/install/
+[NumPy]: https://numpy.org/install/
[ATLAS]: https://math-atlas.sourceforge.net/
[OpenBLAS]: https://xianyi.github.io/OpenBLAS/
[source tar.gz]: https://pypi.org/project/gensim/
1 change: 0 additions & 1 deletion pyproject.toml
@@ -9,7 +9,6 @@ requires = [
# is 1.18.5, remove the line when they increase oldest supported Numpy for this platform
"numpy==1.18.5; python_version=='3.8' and platform_machine not in 'arm64|aarch64'",
"oldest-supported-numpy; python_version>'3.8' or platform_machine in 'arm64|aarch64'",
-"scipy",
"setuptools",
"wheel",
]
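
To make the two NumPy lines in the `requires` list easier to read, here is a hand-written sketch of how their environment markers select a build requirement. It is an approximation, not pip's actual marker evaluator (the third-party `packaging` library does this properly). The function names `parse_version` and `numpy_build_requirement` are hypothetical. Note that in environment markers, `in` and `not in` between two strings are substring tests, so `'arm64|aarch64'` is matched by containment, not as a regular expression.

```python
import platform
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.10' into (3, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def numpy_build_requirement(python_version: str,
                            platform_machine: str) -> Optional[str]:
    """Mimic the two NumPy markers from the requires list above."""
    version = parse_version(python_version)
    # Substring test, mirroring: platform_machine in 'arm64|aarch64'
    is_arm = platform_machine in "arm64|aarch64"

    if version == (3, 8) and not is_arm:
        return "numpy==1.18.5"
    if version > (3, 8) or is_arm:
        return "oldest-supported-numpy"
    return None  # neither marker matches


if __name__ == "__main__":
    major, minor = platform.python_version_tuple()[:2]
    print(numpy_build_requirement(f"{major}.{minor}", platform.machine()))
```

The numeric tuple comparison matters: compared as plain strings, `'3.10'` would sort before `'3.8'`.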