
Conversation

Noratrieb
Member

Because it is not really helpful to have downstream instantiations that we won't inline anyway (*)

* This is not quite true when #[inline] is being used as effectively #[likely_unused]

Let's check the performance impact of just doing this, before putting any more thought into it.

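For context, a minimal sketch of the situation the description refers to, assuming a made-up library crate and function name:

```rust
// Hypothetical library crate `mathutil`; names invented for illustration.
// With the current rules, `#[inline]` selects the `LocalCopy` instantiation
// mode: every downstream crate that calls `add_one` codegens its own private
// copy so that LLVM can inline it at the call site. At -Copt-level=0 LLVM
// does no inlining, so those downstream copies are wasted work; the PR
// avoids them in that case.
#[inline]
pub fn add_one(x: u32) -> u32 {
    x + 1
}
```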
@rustbot
Collaborator

rustbot commented Oct 4, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 4, 2025
@rustbot
Collaborator

rustbot commented Oct 4, 2025

r? @fee1-dead

rustbot has assigned @fee1-dead.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@Noratrieb
Member Author

@bors try @rust-timer queue


rust-bors bot added a commit that referenced this pull request Oct 4, 2025
…=<try>

Avoid `LocalCopy` instantiation for `#[inline]` on `-Copt-level=0`

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 4, 2025
@Noratrieb Noratrieb marked this pull request as draft October 4, 2025 20:30
@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 4, 2025

@rust-bors

rust-bors bot commented Oct 4, 2025

☀️ Try build successful (CI)
Build commit: 85bd5e8 (85bd5e831a7047f663636aa92624a501051dbb74, parent: 99ca0ae87ba5571acee116ea83d1f9e88a7bf8d8)


@rust-timer
Collaborator

Finished benchmarking commit (85bd5e8): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|:---|---:|:---|---:|
| Regressions ❌ (primary) | 23.3% | [0.1%, 897.4%] | 48 |
| Regressions ❌ (secondary) | 2.7% | [0.1%, 32.6%] | 43 |
| Improvements ✅ (primary) | -1.4% | [-4.6%, -0.2%] | 20 |
| Improvements ✅ (secondary) | -0.1% | [-0.1%, -0.1%] | 2 |
| All ❌✅ (primary) | 16.0% | [-4.6%, 897.4%] | 68 |

Max RSS (memory usage)

Results (primary 2.0%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:---|---:|:---|---:|
| Regressions ❌ (primary) | 4.4% | [0.8%, 15.6%] | 24 |
| Regressions ❌ (secondary) | 5.8% | [1.4%, 12.4%] | 11 |
| Improvements ✅ (primary) | -2.7% | [-6.8%, -0.8%] | 12 |
| Improvements ✅ (secondary) | -4.4% | [-6.9%, -1.8%] | 14 |
| All ❌✅ (primary) | 2.0% | [-6.8%, 15.6%] | 36 |

Cycles

Results (primary 52.9%, secondary 5.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:---|---:|:---|---:|
| Regressions ❌ (primary) | 61.9% | [2.3%, 923.3%] | 19 |
| Regressions ❌ (secondary) | 8.2% | [2.0%, 26.5%] | 14 |
| Improvements ✅ (primary) | -4.0% | [-4.8%, -3.3%] | 3 |
| Improvements ✅ (secondary) | -3.7% | [-6.6%, -1.1%] | 5 |
| All ❌✅ (primary) | 52.9% | [-4.8%, 923.3%] | 22 |

Binary size

Results (primary 7.6%, secondary 14.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:---|---:|:---|---:|
| Regressions ❌ (primary) | 12.6% | [1.2%, 67.5%] | 55 |
| Regressions ❌ (secondary) | 15.3% | [1.3%, 78.2%] | 27 |
| Improvements ✅ (primary) | -1.9% | [-9.1%, -0.2%] | 29 |
| Improvements ✅ (secondary) | -0.2% | [-0.4%, -0.0%] | 2 |
| All ❌✅ (primary) | 7.6% | [-9.1%, 67.5%] | 84 |

Bootstrap: 471.44s -> 471.725s (0.06%)
Artifact size: 388.29 MiB -> 387.96 MiB (-0.09%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 4, 2025
@saethlin
Member

saethlin commented Oct 5, 2025

The change improves full builds of binaries. We don't have very many of those in the benchmark suite.

The 897% change is an incr-patched benchmark; you probably angered the CGU merging gods and need to use the strategy I explained in this PR: #145910. There are some other libraries that regress, and those are pretty easy to hand-wave away as needing to codegen more items.

I think the rest of the changes, which seem like relatively random perturbations to library build times, come down to the fact that we use instantiation modes as an unprincipled caching system that crate authors are not actively trying to take advantage of: if you perturb the rules for what gets LocalCopy, some crates benefit and some regress.
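To make the mechanism concrete, here is a deliberately simplified sketch of the instantiation-mode choice being discussed. It is illustrative only, not rustc's actual logic, which lives in the compiler and weighs more inputs (generics, monomorphization, codegen-unit partitioning).

```rust
// Illustrative only; rustc's real InstantiationMode lives in
// rustc_middle::mir::mono and the real decision is more involved than this.
#[derive(Debug, PartialEq)]
enum InstantiationMode {
    // Emit one copy in the defining crate and link to it from downstream crates.
    GloballyShared,
    // Emit a private copy in every codegen unit that uses the item,
    // so LLVM can inline it there.
    LocalCopy,
}

// The idea behind this PR: `#[inline]` normally requests LocalCopy, but at
// -Copt-level=0 LLVM will not inline anyway, so the extra copies only cost
// compile time.
fn mode_for_inline_item(opt_level_is_zero: bool) -> InstantiationMode {
    if opt_level_is_zero {
        InstantiationMode::GloballyShared
    } else {
        InstantiationMode::LocalCopy
    }
}

fn main() {
    assert_eq!(mode_for_inline_item(true), InstantiationMode::GloballyShared);
    assert_eq!(mode_for_inline_item(false), InstantiationMode::LocalCopy);
}
```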

@Noratrieb
Member Author

We also need to address how this likely undoes some of the wins from #117727 (comment)

@bjorn3
Member

bjorn3 commented Oct 5, 2025

stdarch depends on `#[inline]` skipping local codegen on wasm. Otherwise the wasm runtime would need to support the simd128 and relaxed-simd instructions to run any wasm module in which the respective SIMD intrinsics aren't GCed away.
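A sketch of the stdarch pattern being referred to (illustrative, not the actual stdarch source): wasm SIMD intrinsics are `#[inline]` and feature-gated, so under the current behaviour they only end up in a module's codegen if something actually calls them.

```rust
// Illustrative sketch, not real stdarch code. Under the current LocalCopy
// behaviour for #[inline], the function below is only codegened in crates
// that call it, so a wasm module that never uses SIMD contains no simd128
// instructions and still runs on runtimes without SIMD support.
#[cfg(target_arch = "wasm32")]
pub mod wasm_simd_sketch {
    use core::arch::wasm32::{i32x4_add, v128};

    #[inline]
    #[target_feature(enable = "simd128")]
    pub unsafe fn add_lanes(a: v128, b: v128) -> v128 {
        i32x4_add(a, b)
    }
}
```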

@saethlin
Member

saethlin commented Oct 5, 2025

stdarch should be using an attribute that's designed for that purpose.

@Noratrieb
Member Author

@bors try @rust-timer queue


rust-bors bot added a commit that referenced this pull request Oct 13, 2025
…=<try>

Avoid `LocalCopy` instantiation for `#[inline]` on `-Copt-level=0`

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 13, 2025
@rust-log-analyzer
Collaborator

The job aarch64-gnu-llvm-20-1 failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)
---- [ui] tests/ui/stats/macro-stats.rs stdout ----
Saved the actual stderr to `/checkout/obj/build/aarch64-unknown-linux-gnu/test/ui/stats/macro-stats/macro-stats.stderr`
diff of stderr:

2 macro-stats MACRO EXPANSION STATS: macro_stats
3 macro-stats Macro Name                         Uses      Lines  Avg Lines      Bytes  Avg Bytes
4 macro-stats -----------------------------------------------------------------------------------
- macro-stats #[derive(Clone)]                      8         64        8.0      1_788      223.5
- macro-stats #[derive(PartialOrd)]                 1         17       17.0        675      675.0
- macro-stats #[derive(Hash)]                       2         17        8.5        577      288.5
+ macro-stats #[derive(Clone)]                      8         64        8.0      1_892      236.5
+ macro-stats #[derive(PartialOrd)]                 1         17       17.0        688      688.0
+ macro-stats #[derive(Hash)]                       2         17        8.5        603      301.5
8 macro-stats q!                                    1         26       26.0        519      519.0
- macro-stats #[derive(Ord)]                        1         15       15.0        503      503.0
- macro-stats #[derive(Default)]                    2         16        8.0        403      201.5
- macro-stats #[derive(Eq)]                         1         11       11.0        325      325.0
- macro-stats #[derive(Debug)]                      1          8        8.0        277      277.0
- macro-stats #[derive(PartialEq)]                  1          9        9.0        267      267.0
+ macro-stats #[derive(Ord)]                        1         15       15.0        516      516.0
+ macro-stats #[derive(Default)]                    2         16        8.0        429      214.5
+ macro-stats #[derive(Eq)]                         1         11       11.0        338      338.0
+ macro-stats #[derive(Debug)]                      1          8        8.0        290      290.0
+ macro-stats #[derive(PartialEq)]                  1          9        9.0        280      280.0
14 macro-stats #[derive(Copy)]                       1          2        2.0         61       61.0
15 macro-stats p!                                    1          3        3.0         32       32.0
16 macro-stats trait_impl_tys!                       1          2        2.0         28       28.0


The actual stderr differed from the expected stderr
To update references, rerun the tests and pass the `--bless` flag
To only update this specific test, also pass `--test-args stats/macro-stats.rs`

error: 1 errors occurred comparing output.
status: exit status: 0
command: env -u RUSTC_LOG_COLOR RUSTC_ICE="0" RUST_BACKTRACE="short" "/checkout/obj/build/aarch64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/tests/ui/stats/macro-stats.rs" "-Zthreads=1" "-Zsimulate-remapped-rust-src-base=/rustc/FAKE_PREFIX" "-Ztranslate-remapped-path-to-local-path=no" "-Z" "ignore-directory-in-diagnostics-source-blocks=/cargo" "-Z" "ignore-directory-in-diagnostics-source-blocks=/checkout/vendor" "--sysroot" "/checkout/obj/build/aarch64-unknown-linux-gnu/stage2" "--target=aarch64-unknown-linux-gnu" "--check-cfg" "cfg(test,FALSE)" "--error-format" "json" "--json" "future-incompat" "-Ccodegen-units=1" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zwrite-long-types-to-disk=no" "-Cstrip=debuginfo" "--emit" "metadata" "-C" "prefer-dynamic" "--out-dir" "/checkout/obj/build/aarch64-unknown-linux-gnu/test/ui/stats/macro-stats" "-A" "unused" "-A" "internal_features" "-A" "unused_parens" "-A" "unused_braces" "-Crpath" "-Cdebuginfo=0" "-Lnative=/checkout/obj/build/aarch64-unknown-linux-gnu/native/rust-test-helpers" "-Zmacro-stats"
stdout: none
--- stderr -------------------------------
macro-stats ===================================================================================
macro-stats MACRO EXPANSION STATS: macro_stats
macro-stats Macro Name                         Uses      Lines  Avg Lines      Bytes  Avg Bytes
macro-stats -----------------------------------------------------------------------------------
macro-stats #[derive(Clone)]                      8         64        8.0      1_892      236.5
macro-stats #[derive(PartialOrd)]                 1         17       17.0        688      688.0
macro-stats #[derive(Hash)]                       2         17        8.5        603      301.5
macro-stats q!                                    1         26       26.0        519      519.0
macro-stats #[derive(Ord)]                        1         15       15.0        516      516.0
macro-stats #[derive(Default)]                    2         16        8.0        429      214.5
macro-stats #[derive(Eq)]                         1         11       11.0        338      338.0
macro-stats #[derive(Debug)]                      1          8        8.0        290      290.0
macro-stats #[derive(PartialEq)]                  1          9        9.0        280      280.0
macro-stats #[derive(Copy)]                       1          2        2.0         61       61.0
macro-stats p!                                    1          3        3.0         32       32.0
macro-stats trait_impl_tys!                       1          2        2.0         28       28.0
macro-stats foreign_item!                         1          1        1.0         21       21.0
macro-stats long_name_that_doesnt_fit_on_one_line!
macro-stats                                       1          1        1.0         18       18.0
macro-stats impl_const!                           1          1        1.0         17       17.0
macro-stats trait_tys!                            1          2        2.0         15       15.0
macro-stats n99!                                  2          2        1.0          4        2.0
macro-stats none!                                 1          1        1.0          4        4.0
macro-stats u32!                                  1          1        1.0          3        3.0
macro-stats long_name_that_fits_on_one_line!     10         10        1.0          0        0.0
macro-stats long_name_that_fits_on_one_line_____! 1          1        1.0          0        0.0
macro-stats long_name_that_fits_on_one_line____!  1          1        1.0          0        0.0
macro-stats long_name_that_fits_on_one_line___!   1          1        1.0          0        0.0
macro-stats long_name_that_fits_on_one_line__!    1          1        1.0          0        0.0
macro-stats long_name_that_fits_on_one_line_!     1          1        1.0          0        0.0
macro-stats #[test]                               1          1        1.0          0        0.0
macro-stats ===================================================================================
------------------------------------------

---- [ui] tests/ui/stats/macro-stats.rs stdout end ----

failures:

@rust-bors

rust-bors bot commented Oct 13, 2025

☀️ Try build successful (CI)
Build commit: 80cb3f7 (80cb3f7bd8b11635e353e418f1031e98281b41a4, parent: ed1d94311e7ed53eabb5667ef577802d88d1aaee)


@rust-timer
Collaborator

Finished benchmarking commit (80cb3f7): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|:---|---:|:---|---:|
| Regressions ❌ (primary) | 5.1% | [0.1%, 47.2%] | 31 |
| Regressions ❌ (secondary) | 0.7% | [0.1%, 12.0%] | 36 |
| Improvements ✅ (primary) | -1.2% | [-4.6%, -0.1%] | 27 |
| Improvements ✅ (secondary) | -0.6% | [-0.8%, -0.3%] | 20 |
| All ❌✅ (primary) | 2.2% | [-4.6%, 47.2%] | 58 |

Max RSS (memory usage)

Results (primary 2.5%, secondary 2.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:---|---:|:---|---:|
| Regressions ❌ (primary) | 4.0% | [0.5%, 11.1%] | 20 |
| Regressions ❌ (secondary) | 2.2% | [1.2%, 4.7%] | 7 |
| Improvements ✅ (primary) | -2.6% | [-5.2%, -0.6%] | 6 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.5% | [-5.2%, 11.1%] | 26 |

Cycles

Results (primary 14.4%, secondary 28.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:---|---:|:---|---:|
| Regressions ❌ (primary) | 19.7% | [2.3%, 64.8%] | 10 |
| Regressions ❌ (secondary) | 28.3% | [18.3%, 33.2%] | 4 |
| Improvements ✅ (primary) | -3.2% | [-3.8%, -2.3%] | 3 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 14.4% | [-3.8%, 64.8%] | 13 |

Binary size

Results (primary 4.1%, secondary 0.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|:---|---:|:---|---:|
| Regressions ❌ (primary) | 7.1% | [0.0%, 66.4%] | 59 |
| Regressions ❌ (secondary) | 1.0% | [0.0%, 20.2%] | 25 |
| Improvements ✅ (primary) | -1.6% | [-8.3%, -0.0%] | 31 |
| Improvements ✅ (secondary) | -0.1% | [-0.1%, -0.0%] | 3 |
| All ❌✅ (primary) | 4.1% | [-8.3%, 66.4%] | 90 |

Bootstrap: 473.894s -> 475.702s (0.38%)
Artifact size: 388.15 MiB -> 388.16 MiB (0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 13, 2025