hir_owner_parent optimized to inlined call for non-incremental build #147387
rust-bors[bot] merged 1 commit into rust-lang:main
@bors try @rust-timer queue
Requested reviewer is already assigned to this pull request. Please choose another assignee.
@rustbot ready
Finished benchmarking commit (f98c273): comparison URL.
Overall result: ✅ improvements - no action needed
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (primary -1.0%, secondary 0.2%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (secondary 1.3%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 469.991s -> 471.779s (0.38%)
You posted several PRs with this same idea. Instead of trying all queries, is there a way to address this structurally? I mean: change a macro in the query system to do this automatically?
That was the plan if the optimization worked.
I wonder why it works though.
A drawback of this is that you'll probably need a function pointer, and it won't be possible to replace a query call with an inlined cheap function call.
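To make the "do this in a macro" idea concrete, here is a minimal sketch under stated assumptions. `QueryCx`, `define_fast_query!`, `owner_parent_provider`, and the `query_calls` counter are all hypothetical names invented for illustration; none of this is rustc's actual query macro. The macro receives the provider by name, so the fast path is a direct, statically dispatched call that the compiler is free to inline. If the provider were only reachable through a function pointer stored in a table, as the comment above worries, that direct call would generally not be available.

```rust
use std::cell::Cell;

// Hypothetical context type; not rustc's TyCtxt.
struct QueryCx {
    incremental: bool,
    // Counts how often the (pretend) cached query path is taken.
    query_calls: Cell<usize>,
}

// Hypothetical macro: for each query it emits the query-system entry point
// (`$query`) and an inlined wrapper (`$name`) that bypasses it when the
// build is not incremental.
macro_rules! define_fast_query {
    ($name:ident, $query:ident, $provider:ident, $key:ty => $value:ty) => {
        impl QueryCx {
            // Stand-in for the query path (cache lookup, dependency tracking).
            fn $query(&self, key: $key) -> $value {
                self.query_calls.set(self.query_calls.get() + 1);
                $provider(self, key)
            }

            // Fast path: a direct call to a provider known at compile time.
            #[inline]
            fn $name(&self, key: $key) -> $value {
                if self.incremental {
                    self.$query(key)
                } else {
                    $provider(self, key)
                }
            }
        }
    };
}

// Toy provider; the real computation is unrelated to this arithmetic.
fn owner_parent_provider(_cx: &QueryCx, id: u32) -> u32 {
    id / 2
}

define_fast_query!(hir_owner_parent, hir_owner_parent_q, owner_parent_provider, u32 => u32);
```

With `incremental: false`, calling `hir_owner_parent` never touches `query_calls`; with `incremental: true`, every call goes through `hir_owner_parent_q`.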
acb23dc to 5aae413
r? @cjgillot to confirm that
Description improved. I have added the top of the processed query info table, sorted by lowest 'avg_ns_norm', to the description.
Reminder, once the PR becomes ready for a review, use
a5e8e86 to a5052a0
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
Let's benchmark again to make sure the previous run wasn't a fluctuation.
Finished benchmarking commit (3b635d6): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never
Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage): Results (secondary 1.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Cycles: Results (secondary -2.7%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 475.963s -> 475.656s (-0.06%)
Still a minor improvement. @bors r+ rollup=maybe
…uwer Rollup of 8 pull requests
Successful merges:
- #147387 (hir_owner_parent optimized to inlined call for non-incremental build)
- #150271 (Move struct placeholder pt2)
- #151283 (Suggest ignore returning value inside macro for unused_must_use lint)
- #151565 (Rename, clarify, and document code for "erasing" query values)
- #149482 (thread::scope: document how join interacts with TLS destructors)
- #151827 (Use `Rustc` prefix for `rustc` attrs in `AttributeKind`)
- #151833 (Treat unions as 'data types' in attr parsing diagnostics)
- #151834 (Update `askama` version to `0.15.4`)
Rollup merge of #147387 - azhogin:azhogin/hir_owner_parent_opt, r=petrochenkov
hir_owner_parent optimized to inlined call for non-incremental build
Continuation of #146880 and #147232.
The 'hir_owner_parent' query is renamed to 'hir_owner_parent_q'. An inlined 'hir_owner_parent' function is added to optimize performance for non-incremental builds.
The 'hir_owner_parent' query has a low normalized average execution time (163ns) and good cache_hits (5773) according to Daria's processed statistics. 'source_span', for comparison, has avg_ns_norm = 66ns and cache_hits = 11361.
The optimization may be profitable for queries with a low normalized average execution time (it replaces a cache lookup with an inlined call), and the gain becomes significant when cache_hits is high.

| Query | cache_hits | min_ns | max_ns | avg_ns_norm |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| source_span | 11361 | 18 | 2991 | 66 |
| hir_owner_parent | 5773 | 52 | 1773 | 163 |
| is_doc_hidden | 3134 | 47 | 1111 | 285 |
| lookup_deprecation_entry | 13905 | 36 | 6208 | 287 |
| object_lifetime_default | 5840 | 63 | 4688 | 290 |
| upvars_mentioned | 2575 | 75 | 7722 | 322 |
| intrinsic_raw | 21235 | 73 | 3453 | 367 |

Draft PR to measure performance changes.
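The rename-plus-wrapper shape described above can be illustrated with a toy model. This is only a sketch under assumptions: `Ctxt`, `OwnerId`, `compute_owner_parent`, the `HashMap` cache, and the `cache_lookups` counter are hypothetical stand-ins and do not reflect the PR's actual diff or rustc's query system. It shows the cached path kept under the `_q` name, with an inlined function of the original name that skips the cache when the build is not incremental.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

// Hypothetical stand-ins; none of these are rustc's real types.
type OwnerId = u32;

struct Ctxt {
    /// Whether the build is incremental (assumed to be known up front).
    incremental: bool,
    /// parents[i] is the parent owner of owner i (toy data).
    parents: Vec<OwnerId>,
    /// Toy query cache.
    cache: RefCell<HashMap<OwnerId, OwnerId>>,
    /// Counts how often the cached query path is taken.
    cache_lookups: Cell<usize>,
}

impl Ctxt {
    fn new(incremental: bool, parents: Vec<OwnerId>) -> Self {
        Ctxt {
            incremental,
            parents,
            cache: RefCell::new(HashMap::new()),
            cache_lookups: Cell::new(0),
        }
    }

    /// The cheap underlying computation.
    fn compute_owner_parent(&self, id: OwnerId) -> OwnerId {
        self.parents[id as usize]
    }

    /// Cached path, standing in for the renamed query.
    fn hir_owner_parent_q(&self, id: OwnerId) -> OwnerId {
        self.cache_lookups.set(self.cache_lookups.get() + 1);
        let mut cache = self.cache.borrow_mut();
        let parent = *cache
            .entry(id)
            .or_insert_with(|| self.compute_owner_parent(id));
        parent
    }

    /// Inlined wrapper: direct computation unless the build is incremental.
    #[inline]
    fn hir_owner_parent(&self, id: OwnerId) -> OwnerId {
        if self.incremental {
            self.hir_owner_parent_q(id)
        } else {
            self.compute_owner_parent(id)
        }
    }
}
```

In this model, `Ctxt::new(false, vec![0, 0, 1]).hir_owner_parent(2)` computes the parent directly and leaves `cache_lookups` at zero, while the same call on an incremental context goes through `hir_owner_parent_q` each time.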