@sapphi-red
Member

@sapphi-red sapphi-red commented Sep 24, 2025

I haven't read the code yet. I just threw some prompts at the AI while I was working on something else.

@github-actions github-actions bot added A-minifier Area - Minifier C-performance Category - Solution not expected to change functional behavior, only performance labels Sep 24, 2025
Member Author

sapphi-red commented Sep 24, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

@sapphi-red sapphi-red changed the title perf(minifir): use arena allocated HashMap and HashSet perf(minifier): use arena allocated HashMap and HashSet Sep 24, 2025
@codspeed-hq

codspeed-hq bot commented Sep 24, 2025

CodSpeed Instrumentation Performance Report

Merging #14079 will not alter performance

Comparing 09-24-perf_minifir_use_arena_allocated_hashmap_and_hashset (5ec9201) with main (0a42d7f) [1]

Summary

✅ 33 untouched
⏩ 4 skipped [2]

Footnotes

  1. No successful run was found on 09-28-feat-allocator-add-hashset (d50d949) during the generation of this report, so main (0a42d7f) was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

  2. 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@overlookmotel
Member

overlookmotel commented Sep 24, 2025

If you do want to add HashSet to oxc_allocator crate (not a bad idea), please could you split that into a separate PR? It'd need the const assertions like in oxc_allocator::HashMap to make sure data in the HashSet is non-Drop, to prevent memory leaks.

HashSet itself can also be wrapped in ManuallyDrop to make it non-Drop too, which may be a speed-up if there's lots of HashSets in minifier (e.g. a big Vec<HashSet<T>>).
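
A minimal sketch of the two ideas above, using only the standard library. `NonDropSet` and `ASSERT_T_IS_NOT_DROP` are hypothetical names for illustration, not the actual `oxc_allocator` types, and the real `oxc_allocator::HashMap` assertions may be written differently. The associated constant fails compilation if the element type needs `Drop`, and `ManuallyDrop` makes the wrapper itself non-Drop.

```rust
use std::collections::HashSet;
use std::hash::Hash;
use std::mem::{needs_drop, ManuallyDrop};

/// Hypothetical wrapper (an assumption, not the real `oxc_allocator` type).
/// NOTE: with a heap-backed std `HashSet`, skipping drop leaks the table.
/// In an arena the backing memory is reclaimed when the arena itself is freed,
/// which is why the pattern is only sound there.
pub struct NonDropSet<T>(ManuallyDrop<HashSet<T>>);

impl<T: Hash + Eq> NonDropSet<T> {
    /// Compile-time check: evaluated once per concrete `T` that calls `new`.
    const ASSERT_T_IS_NOT_DROP: () =
        assert!(!needs_drop::<T>(), "elements stored in NonDropSet must not need Drop");

    pub fn new() -> Self {
        // Referencing the constant forces the check for this `T`.
        let () = Self::ASSERT_T_IS_NOT_DROP;
        Self(ManuallyDrop::new(HashSet::new()))
    }

    pub fn insert(&mut self, value: T) -> bool {
        self.0.insert(value)
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }
}
```

With this shape, `NonDropSet::<u32>::new()` compiles, while `NonDropSet::<String>::new()` would be rejected at compile time because `String` needs `Drop`.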

@sapphi-red sapphi-red force-pushed the 09-24-perf_minifir_use_arena_allocated_hashmap_and_hashset branch from 677ae7c to 612064e Compare September 27, 2025 18:12
@sapphi-red sapphi-red changed the base branch from main to graphite-base/14079 September 28, 2025 16:33
@sapphi-red sapphi-red force-pushed the 09-24-perf_minifir_use_arena_allocated_hashmap_and_hashset branch from 612064e to 9386915 Compare September 28, 2025 16:33
@sapphi-red sapphi-red changed the base branch from graphite-base/14079 to 09-28-feat-allocator-add-hashset September 28, 2025 16:33
@sapphi-red
Member Author

> If you do want to add HashSet to oxc_allocator crate (not a bad idea), please could you split that into a separate PR? It'd need the const assertions like in oxc_allocator::HashMap to make sure data in the HashSet is non-Drop, to prevent memory leaks.
>
> HashSet itself can also be wrapped in ManuallyDrop to make it non-Drop too, which may be a speed-up if there's lots of HashSets in minifier (e.g. a big Vec<HashSet<T>>).

Yes, I was planning to split the PR after I verified the perf improvement. That said, it seems this change does not improve perf for the minifier.

Do you think it's still worth having HashSet and HashMap::from_iter_in, or keeping this change, even if there isn't a perf improvement?

@overlookmotel
Member

overlookmotel commented Sep 28, 2025

You are the minifier maestro, not me! Up to you...

It might be worthwhile addressing my feedback on #14211 and seeing if that has any effect on benchmarks. If the iterators you create HashMaps / HashSets from don't have a known length, it might.

Generally speaking, allocating in arena is cheaper than heap. And if you have a ton of little HashMaps / HashSets, you'd expect to see an improvement moving to arena, because they don't have to be dropped individually any more.

But...

  1. Growing an allocation in arena can be more costly than on heap, because the system allocator can sometimes grow it in place, whereas the arena allocator always reallocates, which means copying all the data from the old allocation to the new one.

  2. Sometimes std lib can do better than what we can do in user-land, because it can use nightly features. In particular, it uses specialization on iterators (e.g. different code paths for slice and map iterators, which have a known length, vs other iterators which don't, e.g. filter). That's a major factor in the perf of Vec, but I don't know if it applies so much for HashMap.

I don't know the ins and outs of HashMap really, so I don't know how much these 2 factors come into play in the case of HashMap. Maybe the "grow in place" thing doesn't apply because it'd always need to rehash and move entries around anyway when the allocation grows?
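
To make the "known length" point concrete, here is a small sketch using std's heap-backed `HashMap` (not the arena one). `build_counting_growths` is a hypothetical helper for illustration. It reserves capacity from the iterator's lower `size_hint` bound and counts how often the table's capacity changes while inserting. Each such change is a reallocation plus rehash, which is the step that would be relatively more expensive in an arena.

```rust
use std::collections::HashMap;

/// Hypothetical helper: builds a map from `iter`, pre-sizing it from the
/// iterator's lower size bound, and returns how many times it had to grow.
fn build_counting_growths<I>(iter: I) -> (HashMap<u32, u32>, usize)
where
    I: Iterator<Item = (u32, u32)>,
{
    // `map` adaptors forward the inner hint; `filter` reports a lower bound of 0.
    let (lower, _upper) = iter.size_hint();
    let mut map: HashMap<u32, u32> = HashMap::with_capacity(lower);

    let mut growths = 0;
    let mut capacity = map.capacity();
    for (key, value) in iter {
        map.insert(key, value);
        if map.capacity() != capacity {
            // The table was reallocated and its entries moved.
            growths += 1;
            capacity = map.capacity();
        }
    }
    (map, growths)
}
```

Feeding it `(0..100u32).map(|i| (i, i))` should need no growth at all, while `(0..200u32).filter(|i| i % 2 == 0).map(|i| (i, i))` produces the same number of entries but starts from an empty table and grows several times along the way.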

Sorry, that's not very helpful!

@overlookmotel
Member

Side note... There's an optimization we might be able to make if any of the iterators you create HashMaps / HashSets from are guaranteed to contain unique elements (e.g. it's the keys of another HashMap). from_iter_in has to assume the iterator might contain duplicate keys, which means for every iteration it's doing a "is key already in the map?" check before inserting, which is redundant.

That'd probably be a fairly minor optimization, but maybe worth it if HashMap usage is heavy.
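
A toy illustration of that redundant check, assuming a fixed-size linear-probing table of `u64` keys. `ToySet` and both insert methods are hypothetical and far simpler than a real hash table (which would typically compare hash bits before comparing keys), so it shows only the shape of the saving, not its size.

```rust
/// Toy open-addressing set (hypothetical, for illustration only).
/// Fixed capacity, no growth: never insert more than `cap` keys.
pub struct ToySet {
    slots: Vec<Option<u64>>,
    /// Number of key equality checks performed by `insert`.
    pub key_comparisons: usize,
}

impl ToySet {
    pub fn with_capacity(cap: usize) -> Self {
        // Load factor stays at or below 50%, so probing always finds an empty slot.
        let len = (cap.max(1) * 2).next_power_of_two();
        Self { slots: vec![None; len], key_comparisons: 0 }
    }

    fn start(&self, key: u64) -> usize {
        // Cheap multiplicative hash; masking works because `len` is a power of two.
        (key.wrapping_mul(0x9E37_79B9_7F4A_7C15) >> 32) as usize & (self.slots.len() - 1)
    }

    /// General insert: must compare against every occupied slot it probes.
    pub fn insert(&mut self, key: u64) -> bool {
        let mask = self.slots.len() - 1;
        let mut i = self.start(key);
        loop {
            let slot = self.slots[i];
            match slot {
                None => {
                    self.slots[i] = Some(key);
                    return true;
                }
                Some(existing) => {
                    self.key_comparisons += 1;
                    if existing == key {
                        return false;
                    }
                    i = (i + 1) & mask;
                }
            }
        }
    }

    /// Caller guarantees `key` is not present, so no equality checks are needed.
    pub fn insert_unique_unchecked(&mut self, key: u64) {
        let mask = self.slots.len() - 1;
        let mut i = self.start(key);
        while self.slots[i].is_some() {
            i = (i + 1) & mask;
        }
        self.slots[i] = Some(key);
    }

    pub fn contains(&self, key: u64) -> bool {
        let mask = self.slots.len() - 1;
        let mut i = self.start(key);
        while let Some(existing) = self.slots[i] {
            if existing == key {
                return true;
            }
            i = (i + 1) & mask;
        }
        false
    }
}
```

If the uniqueness guarantee is ever wrong, the unchecked path silently stores a duplicate, so it would only be suitable for sources such as the keys of another map.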

@sapphi-red sapphi-red force-pushed the 09-24-perf_minifir_use_arena_allocated_hashmap_and_hashset branch from 9386915 to 2923f93 Compare October 3, 2025 12:14
@sapphi-red sapphi-red force-pushed the 09-28-feat-allocator-add-hashset branch from b32ba2a to feeba46 Compare October 3, 2025 12:14
@sapphi-red sapphi-red force-pushed the 09-24-perf_minifir_use_arena_allocated_hashmap_and_hashset branch from 2923f93 to 5ec9201 Compare October 3, 2025 12:30
@sapphi-red sapphi-red force-pushed the 09-28-feat-allocator-add-hashset branch from feeba46 to d50d949 Compare October 3, 2025 12:30
@sapphi-red
Member Author

> Generally speaking, allocating in arena is cheaper than heap. And if you have a ton of little HashMaps / HashSets, you'd expect to see an improvement moving to arena, because they don't have to be dropped individually any more.

I see. These HashSets are created at most (N + 1) * M times (N is the number of classes, M is the number of iterations). So I think there's little upside to using arena-allocated HashSets here. I'll leave these for now, unless I find a case where the improvement is observable.

@sapphi-red sapphi-red closed this Oct 3, 2025
@sapphi-red sapphi-red deleted the 09-24-perf_minifir_use_arena_allocated_hashmap_and_hashset branch October 3, 2025 13:27
Labels

A-minifier Area - Minifier C-performance Category - Solution not expected to change functional behavior, only performance

3 participants