[experiment, do not merge!] rewrite the DenseBitSet structure to only use 1 word on the stack #141325
This commit modifies DenseBitSet so that it uses only one word on the stack instead of the previous four, allowing for faster clones. The downside is that it can store at most 63 elements on the stack, as opposed to 128 for the previous implementation.
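To make the one-word trade-off concrete, here is a minimal sketch of a bit set that fits in a single `u64`. It assumes the 63-element limit comes from reserving one bit of the word as a tag; the thread does not spell out the actual layout, so `InlineBits`, `TAG`, and `CAPACITY` are hypothetical names and this is not the code from the PR.

```rust
/// Hypothetical one-word bit set, for illustration only.
/// Assumption: the top bit is a tag marking "inline" storage,
/// which leaves 63 bits for elements.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InlineBits(u64);

impl InlineBits {
    /// Tag bit (assumed layout, not taken from the PR).
    const TAG: u64 = 1 << 63;
    /// Number of elements that fit next to the tag bit.
    pub const CAPACITY: usize = 63;

    /// Creates an empty set: only the tag bit is set.
    pub fn new_empty() -> Self {
        InlineBits(Self::TAG)
    }

    /// Inserts `elem`; returns true if the set changed.
    pub fn insert(&mut self, elem: usize) -> bool {
        assert!(elem < Self::CAPACITY, "element does not fit inline");
        let mask = 1u64 << elem;
        let changed = (self.0 & mask) == 0;
        self.0 |= mask;
        changed
    }

    /// Returns true if `elem` is in the set.
    pub fn contains(&self, elem: usize) -> bool {
        elem < Self::CAPACITY && (self.0 & (1u64 << elem)) != 0
    }

    /// Counts the elements, ignoring the tag bit.
    pub fn count(&self) -> u32 {
        (self.0 & !Self::TAG).count_ones()
    }
}
```

Because the whole set is a single `Copy` word, cloning it is a plain register copy, which is the kind of saving the description refers to. A larger set would need a heap fallback, which this sketch leaves out.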
Some changes occurred in coverage instrumentation.
cc @Zalathar
Some changes occurred to MIR optimizations
cc @rust-lang/wg-mir-opt
These commits modify the If this was unintentional then you should revert the changes before this PR is merged.
(sorry for the unintended pings, everyone)
cc @nnethercote this is the bitset work I was telling you about. It did look interesting in my tests, with wins of roughly 0.5% on a lot of primary benchmarks.
Let's see if a try build works:
@bors try @rust-timer queue
⌛ Trying commit 3ff2c88 with merge 693cccaceb81d959dbf823da027464c8655b8b57...
Looks interesting. Unfortunately being a single commit makes it incredibly hard to review, especially with so much code being moved into |
Yeah, this is just for the perf run, I think. The branch I was looking at (now unfortunately gone) had many smaller commits that were easier to review, e.g. the last 20 of master...tage64:rust:replace_predecessors
I squashed all commits into one to make rebasing on upstream easier. Maybe that was not necessary, but as you said, this is just for experiments. The original commit history is still there in tage64/rust@thin_bit_set_old.
☀️ Try build successful - checks-actions
Finished benchmarking commit (693ccca): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)
Results (primary -1.1%, secondary -1.3%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
Results (secondary -0.2%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size
Results (primary 0.0%, secondary 0.1%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 775.275s -> 774.785s (-0.06%)
sizes are different The new implementation of DenseBitSet doesn't store the exact domain size, so of course the hash values for identical sets with different domain sizes may be equal.
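The hashing point can be illustrated with a small sketch. It compares a hypothetical set that hashes its exact domain size with one that hashes only its words; `WithDomain`, `BitsOnly`, `words_for`, and `hash_of` are made-up names for this example, and the real `DenseBitSet` may hash differently.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical set that records its exact domain size.
#[derive(Hash)]
pub struct WithDomain {
    pub domain_size: usize,
    pub words: Vec<u64>,
}

/// Hypothetical set that keeps only the bits, not the domain size.
#[derive(Hash)]
pub struct BitsOnly {
    pub words: Vec<u64>,
}

/// Packs `elems` into 64-bit words sized for `domain_size`.
pub fn words_for(domain_size: usize, elems: &[usize]) -> Vec<u64> {
    let mut words = vec![0u64; (domain_size + 63) / 64];
    for &e in elems {
        assert!(e < domain_size, "element outside the domain");
        words[e / 64] |= 1u64 << (e % 64);
    }
    words
}

/// Hashes a value with the standard library's default hasher.
pub fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

With elements `{0, 2}`, a `BitsOnly` built for a domain of 8 and one built for a domain of 16 both hold the single word `0b101`, so `hash_of` returns the same value for them. The `WithDomain` versions feed different `domain_size` values into the hasher, so their hashes would be expected to differ. In this sketch the collision only shows up for domain sizes that round to the same number of words.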
Modify DenseBitSet in the rustc_index crate so that it uses only one word on the stack instead of the previous four, allowing for faster clones. The downside is that it can store at most 63 elements on the stack, as opposed to 128 for the previous implementation.
r? lqd
This is experimental so far and I mostly want a perf run to measure
the performance.