New resolver: Build automated testing to check for acceptable performance #8664
Comments
For SkyPortal, we use pip to verify that all required Python packages are present. This takes about 2 seconds on the old pip, and 20 with the resolver enabled. Will it be possible to revert to the old behavior in the future (i.e., switch off the resolver)?
Perhaps it would be possible to do a quick "first check" to see if all packages just happen to satisfy requirements, and if they don't to only then enable the resolver? Our
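The "first check" idea above could be sketched roughly as follows. This is only an illustration, not how pip works internally: the helper names `parse_pins` and `all_pins_satisfied` are hypothetical, it only understands plain `name==version` lines, and it compares versions as strings, so anything it cannot prove is treated as "run the full resolver".

```python
import re
from importlib import metadata

# Assumption: only plain "name==version" lines are handled; anything else
# (ranges, extras, markers, options) makes the check give up and return None.
_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")
_VERSION = re.compile(r"[0-9A-Za-z.+!]+")


def parse_pins(lines):
    """Return {normalized_name: version}, or None if any line is not a plain pin."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, sep, version = (part.strip() for part in line.partition("=="))
        if not sep or not _NAME.fullmatch(name) or not _VERSION.fullmatch(version):
            return None
        pins[re.sub(r"[-_.]+", "-", name).lower()] = version
    return pins


def _installed_version(name):
    """Look up the installed version, or None when the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def all_pins_satisfied(pins, installed_version=None):
    """True only if every pin matches the installed version string exactly."""
    lookup = installed_version or _installed_version
    return all(lookup(name) == version for name, version in pins.items())


if __name__ == "__main__":
    # Hypothetical pins and a fake "installed" table, purely for illustration.
    pins = parse_pins(["alpha==1.0", "beta==2.3  # pinned"])
    fake_env = {"alpha": "1.0", "beta": "2.3"}.get
    print("skip resolver" if all_pins_satisfied(pins, fake_env) else "run resolver")
```

The string comparison is deliberately conservative: `1.0` and `1.0.0` would count as a mismatch and fall through to a normal install, which is slower but never wrong.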
|
I believe both resolvers already do a scan to check whether packages are already satisfied. The problem is the new resolver is slower to determine what is needed to satisfy the dependencies, since the checks are much more involved than the naive legacy logic. In your particular use case, if you always list all requirements (instead of relying on pip to discover transitive dependencies), you can use the |
I reported a similar performance issue in #8675. |
I have simplified the test case a bit and measure only the pip running time. Download times are not included, as prefer binary is used. requirements:
Please find the 2 log files in this gist: https://gist.github.com/minusf/bd0edfeaf5975980917f2d0792677b52
|
@uranusjr I don't see the |
@stefanv It's the 4th option in:
$ pip install --help
Usage:
[snipped for brevity]
Description:
[snipped for brevity]
Install Options:
--no-clean Don't clean up build directories.
-r, --requirement <file> Install from the given requirements file. This option can be used multiple times.
-c, --constraint <file> Constrain versions using the given constraints file. This option can be used multiple times.
--no-deps Don't install package dependencies.
[snipped for brevity] |
Thanks @pradyunsg! But I see now that this would cause problems too, since it would require us to list all dependencies in our requirements.txt file. |
We also tried to enable the new resolver (and actually fixed a number of dependency conflicts by using it, so that's good!) But the performance is abysmal in the usual developer case where after switching git branches I'll just run Compare the runtime of the new resolver:
with the runtime of the old resolver:
Granted, this is for 211 installed packages. Many of those are internal and rely on other internal and PyPI packages, so the dependency graph is far from trivial. But a slowdown by a factor of ~100 appears a bit too much. Interestingly, the new resolver takes 1-2 seconds to check each already installed package and even checks many packages multiple times:
|
I benchmarked on a medium-sized Django project and found the slowdown was from 1.6 seconds to 41 seconds (again when all packages are already installed locally at the correct versions):
I profiled the project with py-spy:
This resulted in a speedscope file - see attached. It can be used at https://www.speedscope.app/ to investigate the profile. pip-install-redacted.speedscope.zip Most of the time - 93,547 out of 99,225 frames - was unsurprisingly under Tracing it down I noticed there are a lot of invocations of Indeed when I turned off my internet connection and tried again, using
These requests don't seem necessary as the resolution continues just fine (although I didn't wait until the end)... I hope this can help. |
I was just told about
|
Pip version: 20.2.3 With a fully frozen requirements.txt (all packages specified, all with Classic resolver:
2020-resolver:
2020-resolver with
That is 2 seconds for the classic resolver, 246 seconds for 2020-resolver (123x slower), 94 seconds for 2020-resolver with I really like what the 2020-resolver does. I'd be happy to take the performance penalty in CI, in a new virtual env, to ensure correctness. But for local development, where users may be expected to run I don't want |
@antoncohen Can I just check I understand your example here? You have a Assuming I haven't misunderstood, there's something odd going on here. If we have a requirement If my reasoning above is correct, then we should never even see candidates that don't get installed. As everything is pre-installed, we should pick the installed version over a new install, and we can get metadata by a simple filesystem lookup. So there's no need to go to PyPI (or any other index) at all. It's possible there's a genuine bug here, and the resolver is not constraining the candidates based on the root set early enough. To prove that, we'd likely need to instrument a run of the problem case and see exactly what order the code is doing things. That would mean getting a reproducible example, though. If your test case is genuinely made up of fully pinned requirements ( If I've misunderstood how your example is set up, my analysis above is wrong. In that case don't bother with the detail data. But I would be glad to know what I didn't understand about your test case 🙂 |
I know you weren't asking, but this is the case I tested. I don't think it's 'completely artificial' - I quite often run the command when switching branches or pulling latest changes on a project just in case dependencies changed. |
But the example has everything, including dependencies, pinned. So "just in case dependencies changed" doesn't apply. It's a situation where we know absolutely, up front, that pip won't install anything. Or are you expecting to get Unless I'm misunderstanding 🙂 Anyway, the main point here is that we really need a reproducible test case. At the moment, we don't have one, so I'm mostly just trying to get enough information to construct one (that can be run with all local files, so we can avoid network/cache effects). |
He is talking about the dependencies listed in the I do that about 20 times a day. And sometimes I forget to run it, with the result that our application does not come up. For this reason some colleagues like to add a hook to git to run |
But again, the example I was responding to said "in a virtual environment where all the requirements are already satisfied". So again this is a different situation. I don't want to dismiss your use cases. They are just harder to analyze, because if pip might find it needs to install things, that introduces extra work the resolver has to do. The significant advantage of @antoncohen's case is that it doesn't have those complexities, making it easier to analyze. The disadvantage is that it is (or at least seems to be) more unrealistic than the sorts of case you're talking about. |
This isn't exactly true, right? I haven't dug into the new code at all yet, but presumably you can have an sdist and multiple wheels that all have the same version and different dependencies? Also in PEP 440, |
@dstufft Yes, it's not entirely accurate. I'm assuming no local versions are involved. And the finder will give back a list of compatible files for the given version, but we should pick just one to hand to the resolver, based on things like Apologies, I'm doing this from memory at the moment, it's a few weeks since I've gone into the code in depth. My main interest at the moment is pinning down the reported behaviour well enough to replicate it locally. Once I've got that, I'd intend to fire a test case at an instrumented version of the code, and really dig into precisely what's happening. To get the sort of slowdowns being reported suggests that the resolver is backtracking badly, or otherwise doing a lot of unnecessary work. If the situation is as described, that may be a bug - because the described situation is so constrained that there's nothing to backtrack to. So either we have a bug or the description is failing to make clear where the source of additional options is coming from. Hopefully someone can come up with enough detail that we can establish which is the case here. |
@pfmoore, thanks for the response!
In my initial test 6 of the 200+ requirements were directory tarballs, I consider them frozen because they are referenced by hash and don't change. I removed them so truly 100% of requirements.txt is fully pinned like
The output is all "Requirement already satisfied", and every line of "Requirement already satisfied" takes about a second.
Everyone has different use cases. But in my experience, dealing with applications that get deployed to production, this is the 99% use case. Every production application that uses requirements.txt will usually have a fully pinned and resolved requirements.txt. Usually that requirements.txt is generated from looser constraints with something like In local development there is almost always an existing virtual environment with dependencies installed. In CI often times there will be fresh installs. But when people try to optimize CI build times they might end up caching virtual envs or layering images.
I can't provide the exact requirements.txt because it includes private packages. But I can construct one. I searched Google for One important note, my testing that takes 4 minutes has some packages that come from a private PyPI repo. I noticed that even if no packages come from the private PyPI, having This gist contains the requirements.txt and the https://gist.github.com/antoncohen/ace9499dc881fc472873c4c0da97663c Here are the timings I got: No extra-index-url:
With extra-index-url:
Classic resolver:
Our actual requirements.txt is twice as large, and our private PyPI is probably slower than Alibaba Cloud. But hopefully this example where it takes over a minute will be helpful. |
Same here for our internal project.
That's good to know because we use an internal devpi install to take some load from pypi.org and to provide our internal packages. pip of course checks both indexes as it seems - not sure if it is possible to disable pypi lookups?!
Same here. Maybe I could provide it but not the packages... But I constructed a sufficiently large requirements file by taking an older project, dropping private packages and adding some from PyPI. For reproducibility I created a Dockerfile to run this independent of the local setup. You can find Dockerfile and requirements.txt here: https://gist.github.com/tlandschoff-scale/83a95661e40bf4b51c32c0f990e15a37 Run time here:
compared with the new resolver:
Out of curiosity I added the extra index from Alibaba and did an extra run:
|
Thanks @antoncohen for taking the time to provide a reproducer and for the explanation of your use case. Please understand, I'm not dismissing your situation at all, my only thought was that it may be sufficiently specialised that if we get into a trade-off where we have to make something else slower to speed this up, we will need to consider the question of what is the common case we should optimise (and that's always very hard to determine, as we get very conflicting reports of what counts as the "common case" from people with radically different workflows). I'll do some investigation of your reproducer over the next few days and see what I can find. |
requirements.txt Setup, on Python 3.8.5:
Again testing, with old resolver:
With new resolver:
|
Spending some more time to debug this... pip's new resolver is hitting the network even when the currently installed version does satisfy the version requested. Further, it's also hitting the same index page (i.e. That's 100% a genuine bug, and I'll file a new issue for tracking that. |
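To illustrate why repeated hits on the same index page are avoidable, here is a minimal memoization sketch. It is not pip's code and not necessarily how the fix was implemented: `make_cached_fetcher`, the fake fetcher, and the example URL are all hypothetical, and the only real API used is `functools.lru_cache`.

```python
import functools


def make_cached_fetcher(fetch):
    """Wrap fetch(url) so each distinct URL is fetched at most once per run."""

    @functools.lru_cache(maxsize=None)
    def cached(url):
        return fetch(url)

    return cached


if __name__ == "__main__":
    calls = []

    def fake_fetch(url):
        # Stand-in for a network request; records every real "hit".
        calls.append(url)
        return "<index page for {}>".format(url)

    fetch = make_cached_fetcher(fake_fetch)
    for _ in range(3):
        fetch("https://index.example/simple/some-package/")  # hypothetical URL
    print(len(calls), "network hit(s) for 3 lookups")
```

With a cache like this the cost of looking at the same project page many times during one resolution collapses to a single request, which matters most when an extra index is slow.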
@uranusjr I see, thanks for the explanation. Do you think the same reasoning could explain a catastrophic degradation of Is it the plan to make the new resolver the default for Python 2 too? |
I’m not sure, TBH I don’t really personally use |
@uranusjr besides the python 2 issue, I see the new resolver has a visible performance impact on |
That would be awesome. Are you on Zulip? It should be easiest to DM there since all of the people working on the resolver can be reached. |
I definitely want us to know more and look into the Per our Python 2 support policy, pip 20.3 users who are using Python 2 and who have trouble with the new resolver can choose to switch to the old resolver behavior using the flag |
@sbidoul, may I take a look at the reproducer for In addition, I'll be profiling pip's basic functionalities (comparing when legacy and new resolver used) in the next few days. It's for a course at university (scientific communication) so there'll be quite some time and human resource to take a deeper look—is there anything anyone here wants us to focus on, otherwise we'll just go for {install,download,wheel} of the combination of the most popular packages? |
@McSinyx I sent you the reproducer too. |
Update:
|
Based on the benchmarking and progress from the past several weeks I believe pip's performance with the new resolver is now fine to ship as default. Moving to "needs triage" so we can decide whether to close, or to refactor this issue into something more useful for the next phase. |
My friends and I have just run a benchmark and the result agrees with this 100%: as of 20.3.0b1, there's virtually no difference in performance between the two resolvers. Here is our poster. It's far from perfect and we would love to have feedback on our work, since it's the first time we've made a scientific poster. Please feel more than free to use it to promote the new resolver roll-out process! |
I have posted an example where the new resolver in |
I don't really agree with your summary. First, your requirement sets are tiny (9 at most), so it's hard to draw conclusions on larger ones as effort may increase superlinearly. Then, Figure 1. If you disable the download cache, you are including download times in your measurements, and this will dominate execution times. Finally, Figure 2. I have found no way to use the old resolver in 20.3.0b1, so I think you are comparing apples with basically the same apples. It's no surprise to me you don't see a difference. |
Agreed, the use case I examined is different from your use case: one is what people do on their workstations (incremental installations) and the other is recreating an environment. I don't think the poster is in any way complete, but it might give an end user an idea of what to expect. I suppose for really long requirement sets GH-9082 might be one of the reasons for the poorer performance.
Yes, but apparently 20.2.4 did even more downloads, which made it a lot slower in many cases: while in 20.3.0 it's almost the graph of the identity function, 20.2.4 is obviously above it: (I'm sorry that the graph is not very straightforwardly annotated; it should be interpreted as new resolver performance in 20.2.4 and 20.3.0b1 compared to old resolver performance, which hasn't really changed in recent months.)
IIRC you can use |
Just a note as it took me a while to find, you need to install pip version |
Note: I was urged to comment here about our experience from twitter. We (prefect) are a bit late on testing the new resolver (only getting around to it with the 20.3 release). We're finding that install times are now in the 20+ min range (I've actually never had one finish), previously this was at most a minute or two. The issue here seems to be in the large search space (prefect has loads of optional dependencies, for CI and some docker images we install all of them) coupled with backtracking. I enabled verbose logs to try to figure out what the offending package(s) were but wasn't able to make much sense of them. I'm seeing a lot of retries for some dependencies with different versions of I've uploaded verbose logs from one run here (killed after several minutes of backtracking). If people want to try this themselves, you can run:
Any advice here would be helpful - for now we're pinning pip to 20.2.4, but we'd like to upgrade once we've figured out a solution to the above. Happy to provide more logs or try out suggestions as needed. Thanks for all y'all do on pip and pypa! |
To keep things a bit easier to manage: we're going to have this issue (#8664) be about building automated testing to check for acceptable performance, and we've made #9187 the issue to "centralize incoming reports of situations that seemingly run for a long time" - including the question in #9187 (comment) :
Donald moved a relevant comment from here to there #9187 (comment) . Sorry for accidentally misdirecting you @jcrist! |
I just wanted to close the loop here from the SkyPortal side. When the new beta resolver was made available, it was unworkable for us. @brainwane reached out, filed this issue, and within a few months our problems were addressed. A huge shoutout to the team for soliciting community feedback, taking it seriously, and doing such dedicated work to making pip better. I know that many (most?) of you are volunteers, and your efforts are so appreciated. 🙏 |
Our new dependency resolver may make pip a bit slower than it used to be.
Therefore I believe we need to pull together some extremely rough speed tests and decide what level of speed is acceptable, then build some automated testing to check whether we are meeting those marks.
I just ran a few local tests (on a not-particularly-souped-up laptop) to do a side-by-side comparison:
Or, in 2 different virtualenvs:
These numbers will add up with more complicated processes, dealing with lots of packages at a time.
Related to #6536 and #988.
Edit by @brainwane: As of November 2020 we have defined some speed goals and the new resolver has acceptable performance, so I've switched this issue to be about building automated testing to ensure that we continue to meet our goals in the future.
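As a starting point for discussion, an automated speed check could look something like the sketch below. Everything here is an assumption rather than an agreed design: the helper names `time_command` and `within_budget` are hypothetical, the acceptable ratio is a made-up number, and the stand-in command just starts the interpreter. In a real check the two commands would be the pip invocations being compared.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Return the median wall-clock seconds over `repeats` runs of `cmd`."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def within_budget(candidate_s, baseline_s, max_ratio):
    """True if the candidate is at most `max_ratio` times slower than the baseline."""
    return candidate_s <= baseline_s * max_ratio


if __name__ == "__main__":
    # Stand-in command; replace with the install commands under comparison.
    noop = [sys.executable, "-c", "pass"]
    baseline = time_command(noop)
    candidate = time_command(noop)
    # max_ratio=5.0 is an arbitrary placeholder, not a goal anyone has set.
    print("within budget:", within_budget(candidate, baseline, max_ratio=5.0))
```

Using the median of a few runs keeps a single slow network round trip from failing the check, and expressing the goal as a ratio against a baseline avoids hard-coding absolute timings that vary between machines.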
Edit by @uranusjr: Some explanation for people landing here. The new resolver is generally slower because it checks the dependency between packages more rigorously, and tries to find alternative solutions when dependency specifications do not meet. The legacy resolver, on the other hand, just picks the one specification it likes best without verifying, which of course is faster but also irresponsible.
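The difference described above can be illustrated with a toy model. This is not pip's or its resolver library's algorithm: the index, package names, and integer versions are invented, the "legacy" function simply lets the first specifier seen for a package win, and the "checked" function tries every combination newest-first, which is a crude stand-in for a real backtracking search.

```python
from itertools import product

# Hypothetical toy index: package -> {version: {dependency: allowed versions}}.
INDEX = {
    "app": {1: {"lib": {1, 2}, "plugin": {1, 2}}},
    "lib": {1: {}, 2: {}},
    "plugin": {1: {"lib": {1}}, 2: {"lib": {1}}},
}


def first_wins(root_name, root_allowed):
    """Legacy-style sketch: the first specifier seen for a package wins."""
    chosen = {}
    todo = [(root_name, root_allowed)]
    while todo:
        name, allowed = todo.pop(0)
        if name in chosen:
            continue  # later, possibly conflicting, specifiers are ignored
        version = max(v for v in INDEX[name] if v in allowed)
        chosen[name] = version
        todo.extend(INDEX[name][version].items())
    return chosen


def is_consistent(chosen):
    """Check every chosen package's dependencies against the chosen versions."""
    return all(
        chosen[dep] in allowed
        for name, version in chosen.items()
        for dep, allowed in INDEX[name][version].items()
    )


def checked_search(names):
    """Try combinations newest-first; return (solution or None, attempts made)."""
    options = [sorted(INDEX[name], reverse=True) for name in names]
    attempts = 0
    for combo in product(*options):
        attempts += 1
        chosen = dict(zip(names, combo))
        if is_consistent(chosen):
            return chosen, attempts
    return None, attempts


if __name__ == "__main__":
    quick = first_wins("app", {1})
    print("first-wins:", quick, "consistent:", is_consistent(quick))
    print("checked:   ", checked_search(["app", "lib", "plugin"]))
```

In this toy the first-wins approach answers immediately but installs a `lib` version that `plugin` does not accept, while the checked search needs several attempts before it finds a consistent set. The extra attempts are where the additional time goes.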
Feel free to post examples here if the new resolver runs slowly for your project. We are very interested in reviewing all of them to identify possible improvements. When doing so, however, please make sure to also include the pip install output, not just your requirements.txt. The output is important for us to identify what pip is spending time on, and to suggest workarounds if possible.