perf: use faster allocators #2874
Conversation
I've tested this on an M4 Pro, with the
It was inconclusive: running it twice gave each build as fastest once. We should test it on a Linux musl build.
@ruben-arts and I stumbled over #2878 while testing this PR
Huge improvement after building correctly:
Lucky find I guess!
A few things that would be good to figure out:
I took the approach from uv, where they explicitly gate this behind a feature to reduce the compile time, since it introduces a number of heavy dependencies. With my simple benchmark it speeds things up on Windows, but only marginally. Unless the binary size increases a lot, I say we leave it as is.
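For reference, this kind of gating is usually done by making the allocator crate an optional dependency tied to a Cargo feature. A minimal sketch (the `mimalloc` crate and `performance` feature name here are illustrative assumptions, not copied from this repository's manifest):

```toml
[dependencies]
# Heavy allocator crate, pulled in only when explicitly requested.
mimalloc = { version = "0.1", optional = true }

[features]
# Opt-in feature so everyday local `cargo build` stays fast.
performance = ["dep:mimalloc"]
```

With this layout, `cargo build` skips the allocator entirely, while `cargo build --features performance` compiles it in.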
I compared artifacts produced by different CI pipelines and they are pretty similar in size. But I am not sure if the artifact produced is using the feature.
Only the artifact produced by
Just ran the same benchmark on my Windows machine and compared it also to a few older versions. The results are actually quite insane.
I ran this on the holoviews repository.
Would be interesting if macOS also improves.
Independently of that, I think we can bring this in.
Let's remember to compare binary sizes after the next release.
With the caveat that I got a crazy outlier on the run before this, it doesn't seem to do much on macOS.
I stole this approach from uv. Apparently using specialized allocators can have a significant performance benefit. However, it also adds compilation overhead that we don't really need when compiling locally, therefore we use this clever trick with features to enable the allocators conditionally.
You can build a "performance" build with `cargo build --release --features performance`. This should allow you to benchmark this locally.