Improve precision of mypy performance tracking #14358

Open
JukkaL opened this issue Dec 28, 2022 · 0 comments
Labels
topic-developer Issues relevant to mypy developers

JukkaL commented Dec 28, 2022

We automatically track changes in mypy performance over time (#14187). Currently we can detect changes of at least 1.5% pretty reliably, but smaller changes are hard to detect. #14187 has some relevant discussion, such as this comment: #14187 (comment)

I'd estimate that a cumulative performance regression of around 15% in 2022 was due to changes that were below the 1.5% noise floor. Getting the detection threshold down to 0.5% or below could be quite helpful in finding and fixing regressions.

I looked at individual measurements, and it seems possible that measurements slowly fluctuate over time. I'm not entirely sure what might be causing this. Just increasing the number of iterations we measure probably won't help much, since different batches of runs will cluster around different averages.
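
To illustrate why more iterations alone may not help, here is a small simulation sketch. It assumes each batch of runs shares one slowly varying offset (1% standard deviation) on top of independent per-run noise (0.5%). Both figures, and the function names, are made up for illustration and are not measured from the real runner.

```python
import random
import statistics


def simulate_batch(true_time, drift, n, rng, noise=0.005):
    """Return n simulated timings that all share one batch-level drift offset."""
    return [true_time * (1 + drift + rng.gauss(0, noise)) for _ in range(n)]


def batch_mean_spread(n_per_batch, n_batches=200, drift_sd=0.01, seed=0):
    """Relative standard deviation of batch means across many simulated batches."""
    rng = random.Random(seed)
    true_time = 10.0  # hypothetical run time in seconds
    means = []
    for _ in range(n_batches):
        drift = rng.gauss(0, drift_sd)  # assumed: constant within a batch
        batch = simulate_batch(true_time, drift, n_per_batch, rng)
        means.append(statistics.fmean(batch))
    return statistics.stdev(means) / true_time


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, batch_mean_spread(n))
```

Under these assumptions the spread of batch means stays close to the assumed drift no matter how many runs each batch contains, because averaging only removes the per-run noise and not the offset shared by the whole batch.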

Here are some things that could help:

  1. Interleave executions of current/previous builds and measure the delta. Instead of only collecting absolute performance values, alternate runs of the previous commit and the target commit and calculate the average delta. If performance gradually fluctuates over time, this should help, since both builds would be affected by roughly the same fluctuation and it would mostly cancel out in the delta.
  2. Further tweak the configuration of the runner machine for stability. See scala/scala-dev#338 ("Configure benchmark machine for maximal stability"), as suggested by @A5rocks.
  3. Collect samples over a long period of time (say, 1 sample every hour over 12 hours).
  4. Collect detailed profiling data for each commit and also highlight differences in the time spent in different parts of the mypy implementation. If a single function gets 2x slower, it could be easy to detect this way, even if the change in overall performance is well below the noise floor. This could be quite noisy due to renaming/splitting functions, etc.
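
Idea 1 could be sketched roughly as follows. This is a minimal sketch and not the actual tracking code: `time_command` and `interleaved_delta` are hypothetical helpers, and how the two builds are invoked is left to the caller.

```python
import statistics
import subprocess
import time
from typing import Callable, Sequence, Tuple


def time_command(cmd: Sequence[str]) -> float:
    """Wall-clock seconds for a single run of cmd (hypothetical helper)."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def interleaved_delta(
    measure_old: Callable[[], float],
    measure_new: Callable[[], float],
    rounds: int = 20,
) -> Tuple[float, float]:
    """Return (mean relative delta, standard error) over paired, interleaved runs."""
    if rounds < 2:
        raise ValueError("need at least 2 rounds to estimate the standard error")
    deltas = []
    for i in range(rounds):
        # Alternate which build runs first so that neither build always
        # gets the same position within a pair.
        if i % 2 == 0:
            old = measure_old()
            new = measure_new()
        else:
            new = measure_new()
            old = measure_old()
        deltas.append((new - old) / old)
    mean = statistics.fmean(deltas)
    stderr = statistics.stdev(deltas) / len(deltas) ** 0.5
    return mean, stderr
```

A caller might pass something like `lambda: time_command([old_python, "-m", "mypy", target])` for each build, where `old_python` and `target` are placeholders for the interpreter of the previous build and the code being checked. Because each delta comes from two runs made close together in time, a slow drift should affect both runs of a pair similarly.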

I'm going to start by investigating whether idea 1 is feasible.

JukkaL added the topic-developer label (Issues relevant to mypy developers) on Dec 28, 2022