the8472
Member

@the8472 the8472 commented Oct 4, 2025

I saw a bunch of dead, empty <[core::mem::maybe_uninit::MaybeUninit<T>; N] as core::array::iter::iter_inner::PartialDrop>::partial_drop functions when compiling with more than 1 CGU.

Let's see if we can help optimizations to eliminate stuff earlier.
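A minimal sketch of the idea in the PR title, assuming the guard is a `needs_drop` check (the actual diff is not shown in this conversation). `ArrayIter` is a hypothetical stand-in, not the real `core::array::IntoIter`; it only illustrates skipping the generic drop path when `T` has no drop glue.

```rust
use std::mem::{self, MaybeUninit};
use std::ops::Range;

/// Hypothetical by-value array iterator, used only for illustration.
struct ArrayIter<T, const N: usize> {
    data: [MaybeUninit<T>; N],
    /// Indices of the slots that are still initialized and not yet yielded.
    alive: Range<usize>,
}

impl<T, const N: usize> ArrayIter<T, N> {
    fn new(array: [T; N]) -> Self {
        // Move every element into a MaybeUninit slot.
        let data = array.map(MaybeUninit::new);
        Self { data, alive: 0..N }
    }
}

impl<T, const N: usize> Iterator for ArrayIter<T, N> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // Taking the index out of `alive` guarantees the slot is read only once.
        let i = self.alive.next()?;
        // SAFETY: `i` was in `alive`, so the slot is initialized and unread.
        Some(unsafe { self.data[i].assume_init_read() })
    }
}

impl<T, const N: usize> Drop for ArrayIter<T, N> {
    fn drop(&mut self) {
        // Assumed guard: only enter the generic drop loop when `T` has drop
        // glue. For types like `u32` this condition is a compile-time `false`.
        if mem::needs_drop::<T>() {
            for slot in &mut self.data[self.alive.clone()] {
                // SAFETY: slots in `alive` are initialized and not yet yielded.
                unsafe { slot.assume_init_drop() };
            }
        }
    }
}
```

With the guard in place, `ArrayIter<u32, N>` has nothing to do on drop, while an element type such as `Rc<()>` still has its remaining elements dropped.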

r? ghost

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Oct 4, 2025
@the8472 the8472 force-pushed the nondrop-array-iter branch from 3116ce6 to 7f10971 Compare October 4, 2025 22:34
@the8472
Member Author

the8472 commented Oct 4, 2025

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 4, 2025
only call polymorphic array iter drop machinery when the type requires it
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 4, 2025
@rust-bors

rust-bors bot commented Oct 5, 2025

☀️ Try build successful (CI)
Build commit: 7a5c0c2 (7a5c0c26f80ea669941a7c760c9d0e4ebbfc8a23, parent: 2cb4e7dce84fdebc0279159f1082f92b99299d87)

@rust-timer
Collaborator

Finished benchmarking commit (7a5c0c2): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.6% | [0.6%, 0.6%] | 1 |
| Regressions ❌ (secondary) | 0.9% | [0.8%, 1.0%] | 6 |
| Improvements ✅ (primary) | -0.1% | [-0.1%, -0.1%] | 1 |
| Improvements ✅ (secondary) | -0.3% | [-0.3%, -0.3%] | 2 |
| All ❌✅ (primary) | 0.3% | [-0.1%, 0.6%] | 2 |

Max RSS (memory usage)

Results (primary 0.7%, secondary 2.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.0% | [0.7%, 5.1%] | 4 |
| Regressions ❌ (secondary) | 2.3% | [2.3%, 2.3%] | 1 |
| Improvements ✅ (primary) | -4.0% | [-5.6%, -2.4%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.7% | [-5.6%, 5.1%] | 6 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.1%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.1%] | 6 |
| Regressions ❌ (secondary) | 0.1% | [0.1%, 0.1%] | 1 |
| Improvements ✅ (primary) | -0.2% | [-0.5%, -0.0%] | 14 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.1% | [-0.5%, 0.1%] | 20 |

Bootstrap: 471.593s -> 470.766s (-0.18%)
Artifact size: 388.35 MiB -> 388.32 MiB (-0.01%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 5, 2025
@the8472 the8472 force-pushed the nondrop-array-iter branch from d418872 to 8e79e6d Compare October 5, 2025 09:02
@the8472
Member Author

the8472 commented Oct 5, 2025

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 5, 2025
only call polymorphic array iter drop machinery when the type requires it
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 5, 2025
@rust-bors

rust-bors bot commented Oct 5, 2025

☀️ Try build successful (CI)
Build commit: 61c5b78 (61c5b789d1ab11492050b0106aac2e724cf4a4d9, parent: e2c96cc06bdbdbc6f59c7551194d6a742260d6ff)

@rust-timer

This comment was marked as outdated.

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 5, 2025
@the8472 the8472 force-pushed the nondrop-array-iter branch from 8e79e6d to 7f10971 Compare October 5, 2025 13:18
@the8472 the8472 marked this pull request as ready for review October 5, 2025 13:20
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 5, 2025
@the8472
Member Author

the8472 commented Oct 5, 2025

First perf run looks better, going with that one. Binary sizes are down on average, including rustc artifact sizes.
The wg-grammar instruction-count regression: its history looks like it's just oscillating, so maybe it's noise.

r? @scottmcm

@scottmcm
Member

scottmcm commented Oct 5, 2025

Is there any way we can have a test for the original problem? This feels like the kind of change that a well-meaning person could easily undo later without realizing it.

@the8472
Member Author

the8472 commented Oct 5, 2025

On godbolt it only reproduces when linking a lib crate and then disassembling it. Using --emit=asm doesn't work because it forces a single CGU. https://rust.godbolt.org/z/78fMrGGxc

I could try writing a run-make test but seems a bit overkill.
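For illustration, a hypothetical library crate of the kind described above might look like the sketch below. This is not the contents of the godbolt link, and `sum_by_value` is an invented name. The assumption is that it would be built as a lib crate with more than one codegen unit (rustc's `-C codegen-units` option) and the resulting object code disassembled to look for leftover `partial_drop` symbols. Whether this exact function reproduces the problem is not verified here.

```rust
// lib.rs of a hypothetical reproduction crate (illustrative only).

/// Consumes a fixed-size array by value, which goes through the
/// by-value array iterator (`core::array::IntoIter`).
pub fn sum_by_value(values: [u32; 8]) -> u32 {
    // Calling the trait method explicitly selects the by-value iterator
    // regardless of the crate's edition.
    IntoIterator::into_iter(values).sum()
}
```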

@the8472
Member Author

the8472 commented Oct 5, 2025

But I can add a comment.

@the8472 the8472 force-pushed the nondrop-array-iter branch from 7f10971 to ff91dbd Compare October 5, 2025 19:03
@scottmcm
Member

Sure. With the comment it seems good.

@bors r+

@bors
Collaborator

bors commented Oct 14, 2025

📌 Commit ff91dbd has been approved by scottmcm

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 14, 2025
@bors
Collaborator

bors commented Oct 14, 2025

⌛ Testing commit ff91dbd with merge fb24b04...

@bors
Collaborator

bors commented Oct 14, 2025

☀️ Test successful - checks-actions
Approved by: scottmcm
Pushing fb24b04 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 14, 2025
@bors bors merged commit fb24b04 into rust-lang:master Oct 14, 2025
11 checks passed
@rustbot rustbot added this to the 1.92.0 milestone Oct 14, 2025
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 4b94758 (parent) -> fb24b04 (this PR)

Test differences

Show 2 test diffs

2 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard fb24b04b096a980bffd80154f6aba22fd07cb3d9 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-aarch64-linux: 5951.9s -> 8676.9s (45.8%)
  2. aarch64-apple: 10239.0s -> 13328.0s (30.2%)
  3. dist-x86_64-apple: 8603.9s -> 6737.0s (-21.7%)
  4. x86_64-rust-for-linux: 2774.5s -> 3142.5s (13.3%)
  5. i686-msvc-1: 9395.1s -> 10608.9s (12.9%)
  6. aarch64-msvc-1: 7151.9s -> 6503.8s (-9.1%)
  7. x86_64-mingw-2: 7829.5s -> 7215.6s (-7.8%)
  8. x86_64-gnu-stable: 7439.1s -> 6924.8s (-6.9%)
  9. dist-loongarch64-musl: 5375.2s -> 5046.4s (-6.1%)
  10. aarch64-gnu-debug: 4220.6s -> 4477.2s (6.1%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (fb24b04): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.7%, 0.7%] | 2 |
| Regressions ❌ (secondary) | 0.3% | [0.1%, 1.1%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.1% | [-0.1%, -0.0%] | 2 |
| All ❌✅ (primary) | 0.7% | [0.7%, 0.7%] | 2 |

Max RSS (memory usage)

Results (primary 0.6%, secondary 3.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.6% | [2.0%, 3.4%] | 3 |
| Regressions ❌ (secondary) | 4.3% | [1.1%, 5.8%] | 13 |
| Improvements ✅ (primary) | -2.5% | [-3.7%, -1.3%] | 2 |
| Improvements ✅ (secondary) | -2.3% | [-2.4%, -2.1%] | 3 |
| All ❌✅ (primary) | 0.6% | [-3.7%, 3.4%] | 5 |

Cycles

Results (secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.5% | [1.3%, 5.4%] | 4 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.6% | [-3.9%, -1.6%] | 4 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (primary -0.1%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.1% | [0.0%, 0.3%] | 6 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.2% | [-0.3%, -0.0%] | 21 |
| Improvements ✅ (secondary) | -0.1% | [-0.1%, -0.1%] | 1 |
| All ❌✅ (primary) | -0.1% | [-0.3%, 0.3%] | 27 |

Bootstrap: 475.312s -> 474.864s (-0.09%)
Artifact size: 388.16 MiB -> 388.45 MiB (0.07%)

Labels

merged-by-bors: This PR was explicitly merged by bors.
perf-regression: Performance regression.
S-waiting-on-bors: Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
T-libs: Relevant to the library team, which will review and decide on the PR/issue.
