Skip to content

Refactor specifiers to handle legacy specifiers #15

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 19, 2014

Conversation

dstufft
Copy link
Member

@dstufft dstufft commented Oct 6, 2014

The original specifier code assumed that the only kind of specifiers that would ever be in play are PEP 440 compliant specifiers. However it quickly came to light that there are a lot of legacy cases where a specifier cannot be cleanly parsed using PEP 440.

This changes the specifier handling code to have:

  • An abstract base class, BaseSpecifier which functions as the definition for what a specifier class must contain.
  • A LegacySpecifier which interprets all versions as a LegacyVersion. This does not implement any of the new operator types such as ~= or === and is intended to act like setuptools previously did.
  • A Specifier which interprets specifiers as defined in PEP 440.
  • A SpecifierSet which can hold one or more specifiers and attempts to sanely allow the mixed usage of both Specifier and LegacySpecifier.

A major change in behavior is that a Specifier (and now a LegacySpecifier) will only represent one single specifier and will not accept a comma separated list of them. The new SpecifierSet will handle cases where you have one or more specifiers and will attempt to sanely handle the case where you have a mixture of
Specifier and LegacySpecifier instances.

In particular, a SpecifierSet has the following behaviors:

If given no specifiers:

  • It will not match any LegacyVersion items.
  • It will not match any pre-release versions, unless there are no final releases available that match.

If given any specifiers:

  • It will match only if the given version(s) match each and every specifier contained within the Specifier set.

The side effects of this are:

  • If two specifiers are given, one that matches pre-releases and one that doesn't, then the SpecifierSet as a whole will not accept any pre-releases.
  • Even versions that can be parsed as PEP 440 will not use any of the new semantics when being interpreted by a LegacySpecifier, so something like SpecifierSet(">=1.0-1-1") will end up allowing pre-releases since LegacyVersion has no concept of them.

This fixes #12.

@dstufft
Copy link
Member Author

dstufft commented Oct 6, 2014

/cc @jaraco @ncoghlan

I think the technique here is a winner. It gives the PEP 440 interpretations wherever it can that makes sense, however it will fall back to the older style of interpretation if it must. This does make it a bit more complicated to explain the rules of how specifiers work but I don't believe that is an avoidable situation without simply disallowing legacy specifiers completely.

@dstufft
Copy link
Member Author

dstufft commented Oct 6, 2014

The tests are going to be broken on this, because I haven't updated them to cope with the backwards incompatible changes which this PR makes.

@dstufft
Copy link
Member Author

dstufft commented Oct 6, 2014

Also combining SpecifierSet instances isn't yet written, though it'll work as you'd think.

@ncoghlan
Copy link
Member

ncoghlan commented Oct 6, 2014

This also paves the way to using a SAT solver to quickly choose the "best possible version" in the future.

One minor tweak: a name like "BaseSpecifier" or "SpecifierBase" is often a good choice when factoring out a common base class like this.

@ncoghlan
Copy link
Member

ncoghlan commented Oct 6, 2014

And to be completely explicit: yes, this sounds like a good approach to me.

@dstufft dstufft force-pushed the legacy-specifiers branch from 4987316 to 476992f Compare October 6, 2014 01:20
@dstufft
Copy link
Member Author

dstufft commented Oct 6, 2014

I renamed the ASpecifier (was supposed to be like A(bstract)Specifier) to BaseSpecifier.

@dstufft
Copy link
Member Author

dstufft commented Oct 6, 2014

Hmm, so a niggly thing. You can force override a SpecifierSet to allow prereleases always, but how that interacts with combining two SpecifierSet objects is kind of sticky... Right now it's:

>>> from packaging.specifiers import SpecifierSet
>>> SpecifierSet(">=1.0") & SpecifierSet(">=2.0")
<SpecifierSet('>=1.0,>=2.0')>
>>> SpecifierSet(">=1.0") & SpecifierSet(">=2.0", prereleases=True)
<SpecifierSet('>=1.0,>=2.0', prereleases=True)>
>>> SpecifierSet(">=1.0") & SpecifierSet(">=2.0", prereleases=False)
<SpecifierSet('>=1.0,>=2.0', prereleases=False)>
>>> SpecifierSet(">=1.0", prereleases=True) & SpecifierSet(">=2.0")
<SpecifierSet('>=1.0,>=2.0', prereleases=True)>
>>> SpecifierSet(">=1.0", prereleases=False) & SpecifierSet(">=2.0")
<SpecifierSet('>=1.0,>=2.0', prereleases=False)>
>>> SpecifierSet(">=1.0", prereleases=False) & SpecifierSet(">=2.0", prereleases=False)
<SpecifierSet('>=1.0,>=2.0', prereleases=False)>
>>> SpecifierSet(">=1.0", prereleases=True) & SpecifierSet(">=2.0", prereleases=True)
<SpecifierSet('>=1.0,>=2.0', prereleases=True)>
>>> SpecifierSet(">=1.0", prereleases=False) & SpecifierSet(">=2.0", prereleases=True)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/dstufft/projects/packaging/packaging/specifiers.py", line 603, in __and__
    "Cannot combine SpecifierSets with True and False prerelease "
ValueError: Cannot combine SpecifierSets with True and False prerelease overrides.

Which essentially means that:

  • If the overrides are not set at all, then we don't set it.
  • If the overrides are set on only the left or the right side we'll use the value on the side that's set.
  • If the overrides are set on the left and right side and they are equal then we'll set it to the same.
  • If the overrides are set on the left and right side and they are not equal then we'll raise a ValueError.

Does this behavior seem reasonable?

@dstufft dstufft force-pushed the legacy-specifiers branch from 476992f to 81e00ef Compare October 6, 2014 01:47
@ncoghlan
Copy link
Member

ncoghlan commented Oct 6, 2014

The combination rules sound sensible to me. It's also likely worth spending some time ensuring that the version selection algorithm can be mapped to a SAT solver, even if you don't implement it that way just yet.

@dstufft
Copy link
Member Author

dstufft commented Oct 6, 2014

This doesn't select a version, it only filters a given list of versions down to the ones that match the specifier. A SAT solver needs boolean expressions so it'll have to expand out the logic into a big long boolean expression anyways.

@ncoghlan
Copy link
Member

ncoghlan commented Oct 6, 2014

Ah, right - I had missed that the specifier set only filtered, it didn't handle the ordering task as well (even though it's right there in the "set" name). My train of thought was based on the fact that once we have full metadata publication on PyPI, walking the dependency tree collecting specifiers and grouping them into SpecifierSet instances by distribution name before deciding which version to download would give you global constraint satisfaction, even without a SAT solver.

Anyway, +1 on the general concept from me.

@dstufft
Copy link
Member Author

dstufft commented Oct 6, 2014

Yea, the flter() method will retain whatever ordering you give it and you can combine SpecifierSet instances so you can get a reasonable approximation if your constraints aren't very hard. This is sort of similar to what pip has now.

The problem then comes when you have a full blown dependency tree and you have to do branching logic, like if I install foobar 1.0 that depends on wat~=3.0, but if I install foobar 0.1 that just depends on wat>=1.0. So it gets a lot more complex and you start needing to make decisions about what your policy is for selecting versions. Do you want to maximize the latest versions of everything? Do you want to have a higher "average" latest version or is a single old version OK if it means you get latest on the rest of them.

I think probably it'll never make sense for a dependency solver to be part of packaging. Even though it's a core piece of doing proper dependency resolution in any sane system, it's complex enough to be it's own topic. Ideally though it'd be able to reuse pieces from packaging to represent versions and the like. Since you have to construct a boolean constraint, you could use SpecifierSet().filter() to take a list of versions, narrow them down to the ones that satisify the specifier, and then do something like foobar_1_0 and (wat_3_0 or wat_3_1) to to represent someone doing pip install foobar==1.0 where foobar 1.0 depends on wat~=3.0 and wat 3.0 and 3.1 exist in the repository.

@ncoghlan
Copy link
Member

ncoghlan commented Oct 6, 2014

Ah, nice - thanks for the explanation :)

@dstufft
Copy link
Member Author

dstufft commented Oct 6, 2014

Course re-reading this in the morning I realize my boolean equation is wrong since it allows two different versions of wat to be installed. It'd really look more like: foobar_1_0 and ((wat_3_0 and not wat_3_1) or (wat_3_1 and not wat_3_0)). Then the SAT solver would just more or less try different values for the variables until it finds a solution that evaluates to True or it runs out of possible outcomes.

@dstufft dstufft force-pushed the legacy-specifiers branch 2 times, most recently from 164bf6e to 66e3329 Compare November 18, 2014 21:52
@dstufft dstufft changed the title [WIP] Refactor specifiers to handle legacy specifiers Refactor specifiers to handle legacy specifiers Nov 18, 2014
The original specifier code assumed that the only kind of specifiers
that would ever be in play are PEP 440 compliant specifiers. However
it quickly came to light that there are a lot of legacy cases where
a specifier cannot be cleanly parsed using PEP 440.

This changes the specifier handling code to have:

* An abstract base class, ``BaseSpecifier`` which functions as the
  definition for what a specifier class must contain.
* A ``LegacySpecifier`` which interprets all versions as a
  ``LegacyVersion``. This does not implement any of the new operator
  types such as ``~=`` or ``===`` and is intended to act like
  setuptools previously did.
* A ``Specifier`` which interprets specifiers as defined in PEP 440.
* A ``SpecifierSet`` which can hold one or more specifiers and
  attempts to sanely allow the mixed usage of both ``Specifier`` and
  ``LegacySpecifier``.

A major change in behavior is that a ``Specifier`` (and now a
``LegacySpecifier``) will only represent one single specifier and will
not accept a comma separated list of them. The new ``SpecifierSet``
will handle cases where you have one or more specifiers and will
attempt to sanely handle the case where you have a mixture of
``Specifier`` and ``LegacySpecifier`` instances.

In particular, a ``SpecifierSet`` has the following behaviors:

If given no specifiers:

* It will *not* match any ``LegacyVersion`` items.
* It will *not* match any pre-release versions, unless there are no
  final releases available that match.

If given any specifiers:

* It will match only if the given version(s) match each and every
  specifier contained within the ``Specifier`` set.

The side effects of this are:

* If two specifiers are given, one that matches pre-releases and one
  that doesn't, then the ``SpecifierSet`` as a whole will not accept
  any pre-releases.
* Even versions that can be parsed as PEP 440 will not use any of the
  new semantics when being interpreted by a ``LegacySpecifier``, so
  something like ``SpecifierSet(">=1.0-1-1")`` will end up allowing
  pre-releases since ``LegacyVersion`` has no concept of them.
dstufft added a commit that referenced this pull request Nov 19, 2014
Refactor specifiers to handle legacy specifiers
@dstufft dstufft merged commit c3fc111 into pypa:master Nov 19, 2014
@dstufft dstufft deleted the legacy-specifiers branch November 19, 2014 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How to handle legacy specifiers?
2 participants