Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail installation on conflicting package metadata #7179

Closed
chrahunt opened this issue Oct 12, 2019 · 12 comments
Closed

Fail installation on conflicting package metadata #7179

chrahunt opened this issue Oct 12, 2019 · 12 comments
Assignees
Labels
C: dependency resolution About choosing which dependencies to install kind: backwards incompatible Would be backward incompatible state: awaiting PR Feature discussed, PR is needed type: deprecation Related to deprecation / removal.

Comments

@chrahunt
Copy link
Member

What's the problem this feature will solve?

Currently pip outputs a warning if package metadata does not align with the package name indicated in a direct URL or #egg= fragment.

Examples:

$ echo "from setuptools import setup; setup(name='hello')" > setup.py
$ pip install "file://$PWD#egg=world"
Processing /tmp/user/1000/tmp.lNdCZnnvx5
  WARNING: Generating metadata for package world produced metadata for project name hello. Fix your #egg=world fragments.
Installing collected packages: hello
  Running setup.py install for hello ... done
Successfully installed hello-0.0.0
$ pip install "world @ file://$PWD"
Processing /tmp/user/1000/tmp.lNdCZnnvx5
  WARNING: Generating metadata for package world produced metadata for project name hello. Fix your #egg=world fragments.
Installing collected packages: hello
  Found existing installation: hello 0.0.0
    Uninstalling hello-0.0.0:
      Successfully uninstalled hello-0.0.0
  Running setup.py install for hello ... done
Successfully installed hello-0.0.0

The warning is traced from here.

Describe the solution you'd like

I would like this to be a hard failure, and I think we should apply the same policy across-the-board with respect to metadata. If information acquired at one stage in package processing (whether provided by the user or derived some other way) doesn't match the metadata determined at a later stage for the same package, then the whole installation process should fail.

The kinds of metadata I had in mind specifically were:

  • Name (from direct URL, #egg= fragment, or simple API)
  • Version (from #egg= fragment or simple API)
  • Python/platform compatibility (from wheel tags in filename or data-requires-python in simple API)

Taking this approach would ensure that the dependency resolution in #988 isn't made more complicated or delayed due to backwards-compatibility concerns.

Alternative Solutions

If we can't fully trust the metadata other than from the package itself, then either:

  1. We must support back-tracking to before the assumption about the package data was used - so wasting time downloading such packages and potentially complicating dependency resolution logic, or
  2. We must not trust any metadata except what the package itself outputs (via setup.py egg_info, prepare_metadata_for_build_wheel, or *.dist-info/METADATA in a wheel) - very time-consuming as we have to download all packages, or
  3. We say it's undefined behavior and continue tracing a warning - bad UX since the warning can be easily missed compared to a failure

Additional context

@chrahunt chrahunt added kind: backwards incompatible Would be backward incompatible C: dependency resolution About choosing which dependencies to install state: needs discussion This needs some more discussion labels Oct 12, 2019
@xavfernandez
Copy link
Member

Concerning the name, it was initially the case in #3153 but we feared it would break too many things.
The warning being present since pip 8, I'm on board to switching it to a formal DeprecationWarning in next version (20.0 ?) and failing in the following version.

Concerning the version, it also makes sense (and it could be included in the same DeprecationWarning in pip 20.0.

Concerning the Python/platform it's less clear what you have in mind:

  • what could contradict the wheel tags ?
  • concerning the inaccurate data-python-requires, I see 2 cases:
    • the advertised python-requires made the resolver think the package would be compatible and the package is indeed compatible (for example, Python 3.8, data-python-requires=">=3.5" while the true python_requires from the wheel is ">=3.6") => should we abort ?
    • the advertised python-requires made the resolver think the package would be compatible and wasn't: this is somewhat equivalent with the case where the simple API did not provide information and the resolver discovers the incompatibility after getting the wheel. The resolver would try another version with or without the faulty data-python-requires

@chrahunt
Copy link
Member Author

what could contradict the wheel tags ?

PEP 427 mentions that {distribution}-{version}.dist-info/WHEEL contains Tag - I don't know if we're actually doing anything with this right now.

  • the advertised python-requires made the resolver think the package would be compatible and the package is indeed compatible (for example, Python 3.8, data-python-requires=">=3.5" while the true python_requires from the wheel is ">=3.6") => should we abort ?
  • the advertised python-requires made the resolver think the package would be compatible and wasn't: this is somewhat equivalent with the case where the simple API did not provide information and the resolver discovers the incompatibility after getting the wheel. The resolver would try another version with or without the faulty data-python-requires

IMO, barring some very widespread issue, any lack of consistency is enough to justify aborting installation. In my mind the error message would clearly indicate the culprit package and working around it as a user would require explicitly excluding the offending version (via constraints/requirements) and ideally reporting the issue to the package maintainer who would then yank that specific version. This is already the case right now - if a user is not pinning to a specific version/hash and a package maintainer releases a new version that has any issue, then that breaks everyone until the package is yanked or an updated version released.

Noting that doing this responsibly would require implementing build system requirement overrides (for build dependencies of dependencies) like mentioned here.

@xavfernandez
Copy link
Member

what could contradict the wheel tags ?

PEP 427 mentions that {distribution}-{version}.dist-info/WHEEL contains Tag - I don't know if we're actually doing anything with this right now.

👍 I forgot that detail and we currently do not check the content of this file as far as I know.
Not sure we would want to add this check just for the sake of catching a botched attempt to masquerade a wheel as an other but why not 🤷‍♂️ .

IMO, barring some very widespread issue, any lack of consistency is enough to justify aborting installation. In my mind the error message would clearly indicate the culprit package and working around it as a user would require explicitly excluding the offending version (via constraints/requirements) and ideally reporting the issue to the package maintainer who would then yank that specific version. This is already the case right now - if a user is not pinning to a specific version/hash and a package maintainer releases a new version that has any issue, then that breaks everyone until the package is yanked or an updated version released.

Noting that doing this responsibly would require implementing build system requirement overrides (for build dependencies of dependencies) like mentioned here.

IMO, the point of crashing for inconsistent name & version is clear:
when pip is told to pip install git+https://github.com/pypa/pip@19.3#egg=setuptools==41.4.0, there is a clear contradiction and pip has no way what should be done; In the face of ambiguity, refuse the temptation to guess.

For the wheel tag & python_requires cases, I find the gain less clear.
I'm not opposed to it but why check the tags in the WHEEL file and not the file hashes in RECORD :)
Simplifying the resolver work would be a good reason but I'm not convinced it will.

In summary, I'm totally on board for name & version checks and currently somewhat lukewarm for additional checks :)

@chrahunt chrahunt added state: awaiting PR Feature discussed, PR is needed type: deprecation Related to deprecation / removal. labels Nov 4, 2019
@pradyunsg
Copy link
Member

pradyunsg commented Feb 26, 2020

@pfmoore @uranusjr related to #7744 #7758 (comment)

@pradyunsg pradyunsg added C: dependency resolution About choosing which dependencies to install and removed C: dependency resolution About choosing which dependencies to install labels Feb 26, 2020
@brainwane
Copy link
Contributor

@pradyunsg @pfmoore @uranusjr My current reading is that this is not something we must resolve before releasing 20.2b2, so I'm removing it from the "to do before beta" column in the Resolver Rollout board; please correct if I'm wrong.

@uranusjr
Copy link
Member

uranusjr commented Jun 24, 2020

Just to note, the new resolver already emits a nice-ish error message for name and version mismatches, implemented in #8476. The Requires-Python vs data-requires-python case needs more thought, but I agree this does not need to block the beta release, since the current implement does block out incompatible packages (it just does not strictly check whether the values match).

@pradyunsg
Copy link
Member

pradyunsg commented Aug 18, 2020

If anyone is opposed to making mismatching python-requires into a straight up hard-error, now would be a great time to let us know. :)

If no one raises concerns by the end of this week, I'll move forward to make this happen.


The kinds of metadata I had in mind specifically were:

  • Name (from direct URL, #egg= fragment, or simple API)
  • Version (from #egg= fragment or simple API)
  • Python/platform compatibility (from wheel tags in filename or data-requires-python in simple API)

Notes for future me: The new resolver does handle the first two cases currently, and makes them a hard error.

For the last one, I'm not 100% sure what mismatching wheel tags could look like. For the data-python-requires situation, it should be sufficient to raise a MetadataInconsistent exception from _get_requires_python_specifier if we decide to go down the route of making this a hard error as well.

@pradyunsg pradyunsg self-assigned this Aug 18, 2020
@brainwane brainwane removed the state: needs discussion This needs some more discussion label Sep 2, 2020
@khommel
Copy link

khommel commented Dec 11, 2020

FWIW this broke the build on a legacy project that I maintain. It uses a package that has not changed in many years, but happens to have a different name in the metadata from the package name.

I have very shallow knowledge when it comes to packaging and installing. I have not found any way to disable this error directly, but I found I could prevent it by changing the requirements line from something like -e git+https://github.com/<user>/foo#egg=foo to git+https://github.com/<user>/foo. But I don't know the implications of dropping #egg from the requirements line. Is this a footgun move?

@uranusjr
Copy link
Member

uranusjr commented Dec 11, 2020

The #egg= part (or its modern equivalent, PEP 508 foo @ url syntax) is a “hint” for the resolver so it knows the URL is used to satisfy foo (there are other historical meanings that are not relevant anymore). It is not needed for top-level requirements in the new resolver because the new resolver is smart enough to download the package and inspect the name automatically. In stead of removing it altogether, you can also just change the #egg= value to match the actual package name.

@uranusjr
Copy link
Member

With #9264, the only thing undone (and undecided) here is whether we should reject a package if its data-requires-python and Requires-Python: do not match. I’m leaning toward let’s not bother, since the Python requirement is only an installation check, and the mismatch is not consequential.

@pradyunsg
Copy link
Member

pradyunsg commented Dec 12, 2020

I’m in agreement. Let’s wrap it up.

@pradyunsg
Copy link
Member

I misclicked but feel free to reopen lol

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
C: dependency resolution About choosing which dependencies to install kind: backwards incompatible Would be backward incompatible state: awaiting PR Feature discussed, PR is needed type: deprecation Related to deprecation / removal.
Projects
None yet
Development

No branches or pull requests

6 participants