[ty] Print salsa memory usage totals in mypy primer CI runs #18973
Conversation
Force-pushed 6c11216 to b99c7b9
The code looks good to me but I think the numbers aren't very meaningful anymore. Could we use some form of bucketing (based on magnitude, otherwise based on their size) and then output the result in MB or GB (2 digits?). Maybe something like this:

```python
def scalable_memory_round(mb_value):
    """
    Memory rounding that scales properly into GB range
    """
    if mb_value == 0:
        return 0
    elif mb_value < 1:
        return 1  # 1MB minimum
    elif mb_value < 5:
        return round(mb_value)  # 1MB precision
    elif mb_value < 50:
        return round(mb_value / 5) * 5  # 5MB precision
    elif mb_value < 200:
        return round(mb_value / 10) * 10  # 10MB precision
    elif mb_value < 1000:
        return round(mb_value / 25) * 25  # 25MB precision
    else:  # 1-5GB range
        return round(mb_value / 100) * 100  # 100MB precision


# Test with GB-range values:
test_values = [
    # Small values
    0.5, 2.3, 8.7, 15.2, 47.8, 156.3,
    # Medium values
    300, 750, 1500,
    # GB range values
    1200, 1850, 2100, 3500, 5200, 8750, 12000, 25000,
]

print("Original → Rounded")
for val in test_values:
    rounded = scalable_memory_round(val)
    gb_orig = val / 1000
    gb_rounded = rounded / 1000
    print(f"{val:6.1f}MB ({gb_orig:4.1f}GB) → {rounded:6.0f}MB ({gb_rounded:4.1f}GB)")
```
I guess the generalization of this idea would be logarithmic rounding. In general, though, all of these approaches can still lead to noisy diffs: even if the memory usage increases just a tiny bit, a value can always fall into the next rounding bucket. So instead of diffing the rounded results, what we would really want to do is compute the actual difference, compare it to the total, and warn if |new - old| / new exceeds some threshold, in both directions. It's not possible to do this using mypy_primer, but it's plausible that we could build it into ecosystem-analyzer. We talked about runtime diffs in the past as well. I think it's perfectly fine to start with this, though.
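A minimal sketch of that relative-difference check, assuming plain before/after totals in megabytes (the function name and the 10% threshold are illustrative, not something the PR implements):

```python
def memory_changed_significantly(old_mb: float, new_mb: float, threshold: float = 0.10) -> bool:
    """Flag a change in either direction once |new - old| / new exceeds the threshold."""
    if new_mb == 0:
        return old_mb != 0
    return abs(new_mb - old_mb) / new_mb > threshold


# A 4% increase stays quiet; a 15% change in either direction is flagged.
assert not memory_changed_significantly(1000, 1040)
assert memory_changed_significantly(1000, 1150)
assert memory_changed_significantly(1150, 1000)
```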
Yes, that's true. Our goal isn't to avoid noise entirely; we primarily aim to find an approach where projects keep falling into the same bucket unless there's a real regression. I also reached out to the codspeed team to ask whether they have a way to surface custom benchmark metrics. Unfortunately, they don't, but I think that would be the ideal solution (because it would also let us track the value over time).
We could use a fractional base and round-to-nearest to get a percentage threshold in either direction, but I'm not sure what a good percentage is. Maybe 5%? It has the same bucketing problem, but the numbers should be more useful.
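For illustration, a rough sketch of such fractional-base (logarithmic) rounding, assuming megabyte inputs and a made-up base of 1.05 so that adjacent buckets differ by roughly 5% (none of this is from the PR):

```python
import math


def fractional_log_round(mb_value: float, base: float = 1.05) -> float:
    """Snap to the nearest power of a fractional base, so adjacent buckets
    differ by roughly (base - 1), i.e. ~5% for base=1.05."""
    if mb_value <= 0:
        return 0.0
    return round(base ** round(math.log(mb_value, base)), 1)


# 1000 MB and 1020 MB land in the same ~5% bucket; 1100 MB moves to the next one.
print(fractional_log_round(1000), fractional_log_round(1020), fractional_log_round(1100))
```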
I suggest picking something and then seeing how noisy it turns out to be in practice. This is something we can easily iterate on.
Force-pushed 6282643 to c9b85eb
Force-pushed c9b85eb to 4613ed8
Summary
Print the new salsa memory usage dumps in mypy primer CI runs to help us catch memory regressions. The numbers are rounded to the nearest 100 to avoid overly sensitive diffs. I don't really have a good sense for whether this is still too sensitive or maybe not sensitive enough. It also likely depends on the scale of the numbers, but rounding to the next power-of-2 felt like losing too much precision.
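To make the tradeoff concrete, here is a rough sketch of the two rounding strategies being compared, assuming the totals are megabyte values (the function names, the unit, and the "nearest power of two" reading of power-of-2 rounding are assumptions for illustration, not the PR's actual code):

```python
import math


def round_to_nearest_100(mb_value: float) -> int:
    """The approach described above: round to the nearest multiple of 100."""
    return round(mb_value / 100) * 100


def round_to_power_of_two(mb_value: float) -> int:
    """The rejected alternative: snap to the nearest power of two,
    which gets very coarse for large totals."""
    if mb_value <= 0:
        return 0
    return 2 ** round(math.log2(mb_value))


# 1200 MB: nearest-100 keeps 1200, power-of-two collapses it to 1024.
print(round_to_nearest_100(1200), round_to_power_of_two(1200))
```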
I'm also not sure about the exact numbers we want to expose, or if the current breakdown is too detailed. @MichaReiser mentioned cycle counts?