Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(semantic): avoid counting nodes #5703

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

overlookmotel
Copy link
Collaborator

@overlookmotel overlookmotel commented Sep 11, 2024

Instead of visiting AST to count AST nodes in SemanticBuilder, just make high estimates of counts for nodes, scopes, symbols and references, based on source text length. These estimates are the maximum number of items theoretically possible for a given source text length. Almost always they'll be vast over-estimates, but should never be an under-estimate.

This ensures AstNodes, ScopeTree and SymbolTable's allocations will not grow during their construction, which is a major performance hit, but avoids the complete AST traversal we were using previously to get accurate counts.

This "estimate high" approach is a viable strategy on 64-bit systems with virtual memory, where:

  1. The memory space is so large that it's almost impossible to exhaust.
  2. Allocating memory only consumes virtual memory, not physical memory.

So use this faster "estimate high" method as standard, and the slower visitor-based counter on 32-bit systems and WASM.

Note: I tried using shrink_to_fit on the Vecs once semantic run is complete, to free up the excess memory. But this caused the Vecs to all reallocate, which negated the performance gains. It was an artefact of the custom global allocator we use in our benchmarking setup. Shrinking allocations does not cause reallocation.

@github-actions github-actions bot added the A-semantic Area - Semantic label Sep 11, 2024
Copy link

graphite-app bot commented Sep 11, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “0-merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

@Boshen Boshen force-pushed the 09-11-perf_semantic_avoid_counting_nodes branch from 23cfb0f to 6502986 Compare September 11, 2024 13:09
@overlookmotel overlookmotel force-pushed the 09-11-perf_semantic_avoid_counting_nodes branch from 6502986 to 228de14 Compare September 11, 2024 17:11
@github-actions github-actions bot added A-minifier Area - Minifier A-transformer Area - Transformer / Transpiler labels Sep 11, 2024
Copy link

codspeed-hq bot commented Sep 11, 2024

CodSpeed Performance Report

Merging #5703 will improve performances by 11.24%

Comparing 09-11-perf_semantic_avoid_counting_nodes (687081d) with main (8ff013a)

Summary

⚡ 10 improvements
✅ 19 untouched benchmarks

Benchmarks breakdown

Benchmark main 09-11-perf_semantic_avoid_counting_nodes Change
codegen[checker.ts] 20.6 ms 18.7 ms +9.97%
minifier[antd.js] 365.2 ms 349.3 ms +4.54%
minifier[typescript.js] 655 ms 629.9 ms +3.99%
semantic[antd.js] 129 ms 116 ms +11.24%
semantic[cal.com.tsx] 46.9 ms 43.9 ms +6.78%
semantic[checker.ts] 76.6 ms 70.1 ms +9.24%
semantic[pdf.mjs] 20.9 ms 19.7 ms +5.93%
transformer[antd.js] 56.3 ms 51.4 ms +9.54%
transformer[checker.ts] 21.5 ms 20.6 ms +4.21%
transformer[pdf.mjs] 8 ms 7.3 ms +10.82%

@overlookmotel overlookmotel force-pushed the 09-11-perf_semantic_avoid_counting_nodes branch from 228de14 to 19ab835 Compare September 11, 2024 17:44
@overlookmotel overlookmotel changed the base branch from main to 09-11-refactor_semantic_move_counts_code_into_counter_module September 11, 2024 17:44
@overlookmotel overlookmotel force-pushed the 09-11-perf_semantic_avoid_counting_nodes branch from 1e9d7ef to 527c80f Compare September 11, 2024 19:24
@overlookmotel overlookmotel changed the base branch from 09-11-refactor_semantic_move_counts_code_into_counter_module to 09-11-feat_index_add_indexvec_shrink_to_ September 11, 2024 19:24
@overlookmotel overlookmotel force-pushed the 09-11-perf_semantic_avoid_counting_nodes branch from 527c80f to df138a7 Compare September 11, 2024 19:37
@overlookmotel
Copy link
Collaborator Author

overlookmotel commented Sep 11, 2024

TODO: Allow passing existing counts into SemanticBuilder, instead of basing counts on source text length.

This is necessary when re-running semantic on an AST which has been transformed/minified. Nodes / scopes / symbols / references counts may have increased in transformation, but the source text has not changed, so estimating based on source text length will be inaccurate. Additionally, the actual counts will be accurate, unlike estimating from source text length.

At present all tests pass but this is an accident because the over-estimates are so large that they obscure this problem. It would be possible to produce panics with the right input.

@DonIsaac DonIsaac force-pushed the 09-11-feat_index_add_indexvec_shrink_to_ branch 2 times, most recently from 3fa7e0f to ca8ea3e Compare September 11, 2024 20:48
@DonIsaac DonIsaac force-pushed the 09-11-perf_semantic_avoid_counting_nodes branch from 033c4d1 to 7702452 Compare September 11, 2024 20:48
@overlookmotel overlookmotel changed the base branch from 09-11-feat_index_add_indexvec_shrink_to_ to graphite-base/5703 September 11, 2024 22:58
@overlookmotel overlookmotel force-pushed the 09-11-perf_semantic_avoid_counting_nodes branch from 7702452 to 843169d Compare September 11, 2024 23:01
@overlookmotel overlookmotel changed the base branch from graphite-base/5703 to main September 11, 2024 23:02
@overlookmotel overlookmotel force-pushed the 09-11-perf_semantic_avoid_counting_nodes branch 2 times, most recently from b9df2d1 to 51ffc54 Compare September 12, 2024 00:35
@overlookmotel
Copy link
Collaborator Author

overlookmotel commented Sep 12, 2024

As an experiment, I hard-coded the exact counts into SemanticBuilder for our benchmark fixtures, so it doesn't over-allocate.

This has little effect on most of semantic benchmarks, but it does noticeably speed things up for semantic[RadixUIAdoptionSection.jsx] (+4%).

So, while over-allocating is not too bad, for small files, it's not as good as accurate counts.

The over-allocating version is massively over-allocating. For RadixUIAdoptionSection.jsx:

Actual Estimated
nodes 392 5041
scopes 2 1261
symbols 11 1259
references 28 1261

My best guess of the reason for the slowdown of over-allocating on small files is that the much larger estimated counts push the allocations up a size class in the system allocator. I believe system allocator serves small allocations from a thread-local pool, rather than taking from the main cross-thread pool. The former is faster as it doesn't involve any synchronisation. This would only make a difference for small files - large files probably produce "large class" allocations either way.

I feel that small files are the most representative of the code Oxc will be processing in the "real world", so the RadixUIAdoptionSection.jsx benchmark is probably the most important one.

Conclusion: Getting accurate counts in the parser will be a better solution if it doesn't cost much in the parser. According to @DavidHancu on Discord cost of compiling counts in parser is about -0.3%. That's pretty cheap.

@overlookmotel overlookmotel force-pushed the 09-11-perf_semantic_avoid_counting_nodes branch from be62ab2 to 6ce1ea9 Compare September 12, 2024 13:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-minifier Area - Minifier A-semantic Area - Semantic A-transformer Area - Transformer / Transpiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant