perf(codegen): reduce memory allocations in generate_line_offset_tables
#13054
CodSpeed Instrumentation Performance Report: Merging #13054 will not alter performance.
…fset_tables` (#13056)

#13054 added a nice optimization to `SourcemapBuilder`. During generation of line/offset tables, it reuses a single `Vec` for the column indexes of each line, rather than creating a new `Vec` on each turn of the inner loop. This reduces the number of times that `Vec` may have to grow as column indexes are added to it.

Take this optimization a step further by re-using the same `Vec` across *all* lines. The `columns` `Vec` is not consumed on each line; instead, its contents are copied into a boxed slice each time. The exception is EOF, where we can consume `columns`, as its work is done. This memory-copying was likely happening anyway, as the `Vec<u32>` -> `Box<[u32]>` conversion has to drop the spare capacity of the `Vec`, which will likely cause a reallocation.

Also, avoid using iterators to create the boxed slices. `Vec::clone` followed by `Vec::into_boxed_slice` is a bit more explicit, and so may help the compiler see that it only needs to allocate exactly `columns.len()` slots for the `Box<[u32]>`.

Note: I also tried `columns.drain(..).collect()` instead of `columns.clone().into_boxed_slice()` + `columns.clear()`, but it looks like the `Drain` abstraction doesn't get completely removed by the compiler: https://godbolt.org/z/Trv47j4hP. So I *think* `into_boxed_slice` is probably preferable.
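The buffer-reuse pattern described above can be sketched roughly as follows. This is a simplified illustration, not the actual `SourcemapBuilder` code: the function name `non_ascii_columns_per_line` and the rule of recording only the byte columns of non-ASCII characters are assumptions made for the example.

```rust
/// Hypothetical, simplified stand-in for a line/offset table builder.
/// For each line, records the byte column of every non-ASCII character.
fn non_ascii_columns_per_line(source: &str) -> Vec<Box<[u32]>> {
    let mut tables: Vec<Box<[u32]>> = Vec::new();
    // One scratch buffer shared by all lines; `clear()` keeps its capacity.
    let mut columns: Vec<u32> = Vec::new();
    let mut line_start = 0usize;

    for (offset, ch) in source.char_indices() {
        if ch == '\n' {
            // Copy the contents into a boxed slice for this line,
            // then empty the buffer without releasing its capacity.
            tables.push(columns.clone().into_boxed_slice());
            columns.clear();
            line_start = offset + 1;
        } else if !ch.is_ascii() {
            columns.push((offset - line_start) as u32);
        }
    }

    // At EOF the buffer is no longer needed, so consume it directly
    // instead of cloning it.
    tables.push(columns.into_boxed_slice());
    tables
}
```

The clone is where the per-line copy happens; in exchange, the scratch `Vec` only has to grow until it fits the longest run of columns seen so far, rather than starting from empty on each line.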
No description provided.