perf(nm): improve hoist traverse logic and memory usage #6787
Conversation
The workflow failure has nothing to do with this change (related: #6776).
The benchmarks look strange. So
I don't see any significant improvements with this PR on my local machine, neither in performance nor in memory usage. The usual drift in performance and memory usage seems to be higher than the impact of the change. I'm reluctant to merge this PR unless I see real proof that it improves anything, with proper benchmarks that can be reproduced.
You are changing the hoisting algorithm, which runs for only a small fraction of install time, but you are measuring the whole install time. To see how large the hoisting fraction is, you can use
Oops, it looks strange to me too, and I can see my hyperfine CLI invocation is incorrect. I will look into it, thanks.
Thank you for your kind explanation. You can ignore my previous benchmarks, as they were inaccurate; my bad for the confusion. I measured the benchmarks again, and there was no significant improvement.

Benchmark for hoist time:
- yarn-stable
- yarn-after
@WooWan Thank you! This is what I see as well.
I understand; since there's no clear performance improvement, I'll close the PR for now. Thanks.
What's the problem this PR addresses?

This PR improves the performance of getSortedRegularDependencies by reusing a single seenDeps Set across recursive calls instead of creating a new one each time, and by using Map entries for direct key access on large dependency trees with many nested packages.

How did you fix it?

The previous code traversed nodes redundantly by recreating seenDeps for every dependency, causing already-visited nodes to be processed repeatedly. Declaring seenDeps at the function level eliminates the unnecessary traversals and reduces memory allocations while producing the same results as before.
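The change can be sketched as follows. This is a simplified, hypothetical illustration, not Yarn's actual hoister code: `DepNode` and `collectDependencies` are stand-in names, and the real getSortedRegularDependencies operates on Yarn's internal hoister types.

```typescript
type DepNode = { name: string; deps: DepNode[] };

// Before: each recursive call created its own `seen` Set, so a subtree
// shared by several parents was re-walked once per parent.
// After: one `seen` Set declared at the function level is threaded
// through every recursive call, so each node is visited at most once.
function collectDependencies(root: DepNode): string[] {
  const seen = new Set<DepNode>(); // single Set shared by all recursion
  const result: string[] = [];

  const walk = (node: DepNode): void => {
    for (const dep of node.deps) {
      if (seen.has(dep)) continue; // already reached via another parent
      seen.add(dep);
      result.push(dep.name);
      walk(dep);
    }
  };

  walk(root);
  return result.sort();
}
```

On a diamond-shaped graph (two parents depending on the same package), the shared node is pushed and walked exactly once, which is where the traversal and allocation savings come from.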
refers: #1717 (comment)
Performance improvement
This is just a local benchmark, for reference: with a warm cache, about 7% faster.
Memory usage
I used
time
for the benchmark.

System Resource Metrics
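As a hedged sketch of that kind of measurement (assuming GNU `time`; the PR does not show the exact commands used, and macOS ships a BSD variant where `-l` replaces `-v`), wall time and peak memory could be captured like this:

```shell
# GNU time's -v flag writes verbose metrics to stderr, including
# wall-clock time and "Maximum resident set size" (peak RSS in kbytes).
/usr/bin/time -v yarn install 2> metrics.txt || true
grep -E "Elapsed .*wall clock|Maximum resident set size" metrics.txt || true
```

Comparing those two lines between yarn-stable and the patched build is one way to make the memory claim reproducible.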
Checklist