Enable lto=fat #20863
CodSpeed Performance Report
Merging #20863 will degrade performances by 5.28%
I'm not sure why that one instrumented benchmark regresses. But a 10% improvement on our walltime benchmarks is very convincing.
ntBre
left a comment
Makes sense to me!
Co-authored-by: Brent Westbrook <36778786+ntBre@users.noreply.github.com>
Yeah I think I've slowly become okay with this sort of change. I even recently did the same for ripgrep. I still don't like that we have different compilation settings for profiling versus release, but I haven't been bitten too hard by it yet.
Summary
This PR switches release builds from thin to fat LTO.
The main motivation is that using lto=fat fixes the binary size increase that we've seen after switching to inventory for ingredient registration. The regression is mainly due to symbols not being removed when using lto=thin, but they're successfully removed when using lto=fat (from 19MB to 17MB).
Using lto=fat also results in a significant performance improvement, which I think is worth it on its own.
Unfortunately, using lto=fat does have the downside that release builds take significantly longer. One of the main motivations for using lto=thin when we switched from fat to thin in #9031 was to improve performance. To mitigate this, I changed the --profiling profile to only use lto=thin, similar to what we do in uv (where we disable lto entirely). This PR is also likely to make the mypy primer and benchmark jobs slower (the same applies when running mypy primer locally), because of a significant increase in compile time.
This setup now mirrors uv's (with the exception that profiling uses lto=fat).
I do feel a bit bad about making this change while @BurntSushi is out, because I know he is the most vocal about this.
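For readers unfamiliar with Cargo profiles, the setup described in the summary could look roughly like the sketch below. It is not the actual diff from this PR. It assumes the profiling profile inherits from release, and it leaves out every other key the real profiles may set.

```toml
# Illustrative sketch only, not the actual Cargo.toml from this PR.

[profile.release]
# Fat LTO: slower release builds, but unused symbols are removed
# and runtime performance improves.
lto = "fat"

[profile.profiling]
# Assumption: profiling inherits from release and only overrides LTO.
inherits = "release"
# Thin LTO keeps compile times lower for profiling builds.
lto = "thin"
```

With a layout like this, `cargo build --release` pays the longer link step, while `cargo build --profile profiling` keeps the faster thin-LTO build.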
Clean build timing:
Fixes #20845
Relevant discussions: