-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
full support for duplicate dependencies with overlapping markers #7257
Conversation
7813682
to
c31c559
Compare
c31c559
to
15c234d
Compare
Times for
Number of overrides:
Considering the examples, the performance of our marker arithmetic seems to be fast enough to handle this PR: There is no measurable performance regression even in cases where the number of overrides does not change. |
15c234d
to
84b52eb
Compare
PTAL @dimbleby |
84b52eb
to
e706f3f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good I think, though my head now hurts a little...
Feels as though there's some danger of this being horribly inefficient if there are lots of overlapping markers to resolve. But probably folk mostly don't do that and certainly the examples showing that this is typically a win by reducing overrides is encouraging. So I guess we hope that this continues to hold when exposed to more examples from the outside world...
Made a couple of more or less pedantic code-improving suggestions.
e706f3f
to
a268daf
Compare
Here's an example in which [tool.poetry.dependencies]
python = "^3.5"
oldest-supported-numpy = "^2022.11.19" see https://github.com/scipy/oldest-supported-numpy/blob/main/setup.cfg Edit: to be fair, the existing code also doesn't do well with this one! |
I can't get a solution (within a reasonable amount of time) with or without this PR. Thus, probably not a blocker? |
I'm not sure how bad I think this is. I agree that with that example I am not getting a solution either before or after, or anyway not in an amount of time that I am willing to wait. You can use the python requirement as a knob to adjust "how many combinations of markers are there to work through?", and therefore how slow this is. eg However it is all made a bit muddier by finding that the overlapping markers produce some incompatible requirements, which the new code discovers and the old code does not. (I've opened an issue at So that's an example where this change is "breaking" (defensible as "fixing" really, but expect bug reports anyway!) - and because it's breaking I'm not sure what the performance impact really is for that case. Not sure this is helping much: I think this is an improvement, but I sure would be happier if it could be made faster! |
After fixing the requirements of
Larger ranges still took too long. Maybe, I was just lucky but the results are pretty clear... |
Interesting, do the changes you have made to oldest-supported-numpy have a negative effect on poetry performance? without the PR and also without those changes I see:
where the first of those fails to find a solution and some of the others ought to (because the code without the PR is not correct); but nevertheless these times are much faster than you are reporting. I'm all in favour of concluding that the new code is both more correct and also faster than the old code, that would be lovely. I'm just a bit confused about why we are seeing such different results. |
The changes to oldest-supported-numpy and python-poetry/poetry-core#608 both have a negative impact on performance. |
Well I still wish it was faster; but if we're not finding any reason to believe that this change makes things worse then I guess it should be merged |
I haven't profiled, so this might be a bad or irrelevant idea... I wonder whether and self._python_constraint.allows_any(
get_python_constraint_from_marker(marker)
) thought: the project python constraint could be transformed into an equivalent marker (once, up-front), and then intersected with As I say, I haven't tested whether that's definitely a slow bit that is worth optimising: and anyway it isn't completely obvious that this would even be an improvement. So don't block the merge on this idea! |
oh, but a better idea: the calculation of the intersection is currently rather slow. Much faster is to arrange it as something like this:
presumably because most of the time the intersection collapses to the empty marker pretty quickly; and once that has happened we can avoid the work of DNF-ing and suchlike on the remaining markers. |
and in fact that observation can be encouraged by putting the positive markers early on in the intersection, like this (could do with prettying up!): markers = (
dep.marker if use else dep.marker.invert()
for use, dep in sorted(
zip(uses, dependencies), key=lambda ud: ud[0], reverse=True
)
) That's relying on the heuristic that if there are a lot of markers present, then surely they most don't intersect one another. But that seems quite plausible, it would be quite hard to write them any other way |
The other suggestion looks very promising. (I have to run some additional tests later.) |
An easy win in for deps in by_constraint.values():
- new_markers = [dep.marker for dep in deps]
dep = deps[0]
- dep.marker = marker_union(*new_markers)
+ if len(deps) > 1:
+ new_markers = (dep.marker for dep in deps)
+ dep.marker = marker_union(*new_markers) no need to perform all the dnf-and-cnf complexity when trying to merge a list that only contained a single marker in the first place |
a268daf
to
596157c
Compare
I applied both suggestions and checked that there is no negative impact on the examples in #7257 (comment). Measuring #7257 (comment) again:
The only drawback of the second optimization is that the locked markers become more verbose. See |
Seems like what we're seeing here is that the performance of marker operations is still expensive, when they get sufficiently complicated. Nearly all of the time in the solving is now taken up with performing the single intersection of the final combination of markers (the case where all
However, if you have any good ideas then this still would be worth addressing.... I suspect that we are doing a lot of repeated work to normalize the same markers again and again. I don't think I'm going to get to trying anything out soon, but I wonder if there are wins to be had either by caching answers, or having markers remember that they are already in DNF (therefore another dnf() can be made a no-op). |
On their own, both appear to have a significant effect on performance. However, when caching answers, remembering that a marker is already DNF seems to only have a very small effect not worth the more complex code. There might be some cases where it's still relevant but I think I'll wait until we find one in the wild. Updated measurements:
Broader Python constraints are still too slow... Then, I switched from scipy/oldest-supported-numpy#71 to scipy/oldest-supported-numpy#74 (with scipy/oldest-supported-numpy#74 (comment)) because that's more likely to be merged:
|
596157c
to
e9c40e6
Compare
Co-authored-by: David Hotham <david.hotham@blueyonder.co.uk>
e9c40e6
to
116a232
Compare
This PR contains the following updates: | Package | Update | Change | |---|---|---| | [poetry](https://python-poetry.org/) ([source](https://github.com/python-poetry/poetry), [changelog](https://python-poetry.org/history/)) | minor | `1.5.1` -> `1.6.1` | --- ### Release Notes <details> <summary>python-poetry/poetry (poetry)</summary> ### [`v1.6.1`](https://github.com/python-poetry/poetry/blob/HEAD/CHANGELOG.md#161---2023-08-21) [Compare Source](python-poetry/poetry@1.6.0...1.6.1) ##### Fixed - Update the minimum required version of `requests` ([#​8336](python-poetry/poetry#8336)). ### [`v1.6.0`](https://github.com/python-poetry/poetry/blob/HEAD/CHANGELOG.md#160---2023-08-20) [Compare Source](python-poetry/poetry@1.5.1...1.6.0) ##### Added - **Add support for repositories that do not provide a supported hash algorithm** ([#​8118](python-poetry/poetry#8118)). - **Add full support for duplicate dependencies with overlapping markers** ([#​7257](python-poetry/poetry#7257)). - **Improve performance of `poetry lock` for certain edge cases** ([#​8256](python-poetry/poetry#8256)). - Improve performance of `poetry install` ([#​8031](python-poetry/poetry#8031)). - `poetry check` validates that specified `readme` files do exist ([#​7444](python-poetry/poetry#7444)). - Add a downgrading note when updating to an older version ([#​8176](python-poetry/poetry#8176)). - Add support for `vox` in the `xonsh` shell ([#​8203](python-poetry/poetry#8203)). - Add support for `pre-commit` hooks for projects where the pyproject.toml file is located in a subfolder ([#​8204](python-poetry/poetry#8204)). - Add support for the `git+http://` scheme ([#​6619](python-poetry/poetry#6619)). ##### Changed - **Drop support for Python 3.7** ([#​7674](python-poetry/poetry#7674)). - Move `poetry lock --check` to `poetry check --lock` and deprecate the former ([#​8015](python-poetry/poetry#8015)). - Change future warning that PyPI will only be disabled automatically if there are no primary sources ([#​8151](python-poetry/poetry#8151)). ##### Fixed - Fix an issue where `build-system.requires` were not respected for projects with build scripts ([#​7975](python-poetry/poetry#7975)). - Fix an issue where the encoding was not handled correctly when calling a subprocess ([#​8060](python-poetry/poetry#8060)). - Fix an issue where `poetry show --top-level` did not show top level dependencies with extras ([#​8076](python-poetry/poetry#8076)). - Fix an issue where `poetry init` handled projects with `src` layout incorrectly ([#​8218](python-poetry/poetry#8218)). - Fix an issue where Poetry wrote `.pth` files with the wrong encoding ([#​8041](python-poetry/poetry#8041)). - Fix an issue where `poetry install` did not respect the source if the same version of a package has been locked from different sources ([#​8304](python-poetry/poetry#8304)). ##### Docs - Document **official Poetry badge** ([#​8066](python-poetry/poetry#8066)). - Update configuration folder path for macOS ([#​8062](python-poetry/poetry#8062)). - Add a warning about pip ignoring lock files ([#​8117](python-poetry/poetry#8117)). - Clarify the use of the `virtualenvs.in-project` setting. ([#​8126](python-poetry/poetry#8126)). - Change `pre-commit` YAML style to be consistent with pre-commit's own examples ([#​8146](python-poetry/poetry#8146)). - Fix command for listing installed plugins ([#​8200](python-poetry/poetry#8200)). - Mention the `nox-poetry` package ([#​8173](python-poetry/poetry#8173)). - Add an example with a PyPI source in the pyproject.toml file ([#​8171](python-poetry/poetry#8171)). - Use `reference` instead of deprecated `callable` in the scripts example ([#​8211](python-poetry/poetry#8211)). ##### poetry-core ([`1.7.0`](https://github.com/python-poetry/poetry-core/releases/tag/1.7.0)) - Improve performance of marker handling ([#​609](python-poetry/poetry-core#609)). - Allow `|` as a value separator in markers with the operators `in` and `not in` ([#​608](python-poetry/poetry-core#608)). - Put pretty name (instead of normalized name) in metadata ([#​620](python-poetry/poetry-core#620)). - Update list of supported licenses ([#​623](python-poetry/poetry-core#623)). - Fix an issue where PEP 508 dependency specifications with names starting with a digit could not be parsed ([#​607](python-poetry/poetry-core#607)). - Fix an issue where Poetry considered an unrelated `.gitignore` file resulting in an empty wheel ([#​611](python-poetry/poetry-core#611)). ##### poetry-plugin-export ([`^1.5.0`](https://github.com/python-poetry/poetry-plugin-export/releases/tag/1.5.0)) - Fix an issue where markers for dependencies required by an extra were not generated correctly ([#​209](python-poetry/poetry-plugin-export#209)). </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi40Mi40IiwidXBkYXRlZEluVmVyIjoiMzYuNTIuMiIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9--> Reviewed-on: https://git.walbeck.it/walbeck-it/docker-python-poetry/pulls/846 Co-authored-by: renovate-bot <bot@walbeck.it> Co-committed-by: renovate-bot <bot@walbeck.it>
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Resolves: #5066
Requires: python-poetry/poetry-core#530, python-poetry/poetry-core#546, python-poetry/poetry-core#547
_merge_dependencies_by_marker
and_handle_any_marker_dependencies
by the more general_resolve_overlapping_markers
IncompatibleConstraintsError
if there are conflicts (due to incompatible version constraints or source types) in the constraints of a packageTested with opencv, which has the most duplicate dependencies with overlapping markers I know about:
Especially the markers with
platform_system
andplatform_machine
, which overlap with most other markers make this example quite complex.TODO: Investigation of the influence on the performance for (more) real-world examples