@lolbinarycat
Contributor

splitting these out should let us avoid quite a few more branches, and hopefully also improve cache locality of the code.

followup to #145851
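
A minimal sketch of the general shape of this kind of split, not the actual rustdoc code. Every name below (`ImplSource`, `build_combined`, `build_local`, `build_external`, `build_split`) is hypothetical and chosen only for illustration. The combined function re-checks the source at every step, while the split version branches once in the caller.

```rust
// Hypothetical stand-ins; none of these names come from rustdoc itself.
#[derive(Clone, Copy)]
enum ImplSource {
    Local,
    External,
}

// Before: one function that re-checks the source at each step.
fn build_combined(source: ImplSource, id: u32) -> String {
    let name = match source {
        ImplSource::Local => format!("local#{id}"),
        ImplSource::External => format!("extern#{id}"),
    };
    let attrs = match source {
        ImplSource::Local => "hir",
        ImplSource::External => "metadata",
    };
    format!("{name} ({attrs})")
}

// After: each path is straight-line code with no per-step branching.
fn build_local(id: u32) -> String {
    format!("local#{id} (hir)")
}

fn build_external(id: u32) -> String {
    format!("extern#{id} (metadata)")
}

// The caller branches exactly once and dispatches to the right path.
fn build_split(source: ImplSource, id: u32) -> String {
    match source {
        ImplSource::Local => build_local(id),
        ImplSource::External => build_external(id),
    }
}

fn main() {
    // Both versions must produce identical output for either source.
    for source in [ImplSource::Local, ImplSource::External] {
        assert_eq!(build_combined(source, 7), build_split(source, 7));
    }
    println!("{}", build_split(ImplSource::Local, 7));
}
```

Whether the split actually pays off depends on how much shared setup code ends up duplicated in both paths, which is something only a perf run can settle.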

@rustbot
Collaborator

rustbot commented Aug 26, 2025

r? @GuillaumeGomez

rustbot has assigned @GuillaumeGomez.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot added labels Aug 26, 2025: S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.), T-rustdoc (Relevant to the rustdoc team, which will review and decide on the PR/issue.)

@lolbinarycat
Contributor Author

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Aug 26, 2025
rustdoc: split build_impl into build_{local,external}_impl

@rustbot added the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) Aug 26, 2025

@rust-bors

rust-bors bot commented Aug 26, 2025

☀️ Try build successful (CI)
Build commit: c00bc1f (c00bc1f8148ea75996f48ef92478ecb1e7e7c9ed, parent: 91ee6a4057ce4bf1ab6d2f932cae497488d67c81)

@rust-timer
Collaborator

Finished benchmarking commit (c00bc1f): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.0% | [0.2%, 3.0%] | 20 |
| Regressions ❌ (secondary) | 1.9% | [0.4%, 3.2%] | 25 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.0% | [0.2%, 3.0%] | 20 |

Max RSS (memory usage)

Results (primary -1.4%, secondary -2.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.7% | [2.7%, 2.7%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -1.7% | [-3.8%, -0.8%] | 14 |
| Improvements ✅ (secondary) | -2.0% | [-4.2%, -1.0%] | 17 |
| All ❌✅ (primary) | -1.4% | [-3.8%, 2.7%] | 15 |

Cycles

Results (primary -2.2%, secondary -2.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -2.2% | [-2.2%, -2.2%] | 1 |
| Improvements ✅ (secondary) | -2.0% | [-2.9%, -1.2%] | 2 |
| All ❌✅ (primary) | -2.2% | [-2.2%, -2.2%] | 1 |

Binary size

Results (secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.0%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Bootstrap: 469.036s -> 466.669s (-0.50%)
Artifact size: 391.17 MiB -> 391.16 MiB (-0.00%)
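
The percentages in the two summary lines above appear to be plain relative changes. A minimal sketch of that arithmetic, assuming the formula is (after - before) / before; the helper name `percent_change` is made up for illustration and is not part of any perf tooling:

```rust
/// Relative change from `before` to `after`, in percent.
/// Negative values mean the "after" measurement is smaller.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Bootstrap time in seconds, taken from the summary line above.
    println!("bootstrap: {:.2}%", percent_change(469.036, 466.669));
    // Artifact size in MiB, taken from the summary line above.
    println!("artifact size: {:.2}%", percent_change(391.17, 391.16));
}
```

The bootstrap change works out to roughly -0.5%, matching the reported figure, while the artifact size change is so small that it rounds to zero at two decimal places.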

@rustbot added the perf-regression label (Performance regression.) and removed the S-waiting-on-perf label (Status: Waiting on a perf run to be completed.) Aug 27, 2025

@lolbinarycat
Contributor Author

because this is built off the previous performance PR, here's the comparison URL that should actually be used: https://perf.rust-lang.org/compare.html?start=05c0f818fe424341111ec6f6ccc79df099d0a142&end=c00bc1f8148ea75996f48ef92478ecb1e7e7c9ed&stat=instructions%3Au&opt=false&debug=false&check=false

quite discouraging actually, and i'm not sure how to intuit this. you would think removing branches would speed things up, at least a little bit.

@GuillaumeGomez
Member

Times are hard to compare. The main comparison is how many instructions were run, so I guess improving branch prediction doesn't reduce that.

@lolbinarycat
Contributor Author

I mean, removing a branching instruction should save an instruction, in theory.

but looking at branch-misses, that metric also regressed.

i think what's happening here is that the tiny bit of duplicated code at the start is getting inlined into a fairly significant number of instructions, including a branch, and that's canceling out any improvements. (i believe the old, now-removed match statements actually branch-predict very well, since they get executed a bunch of times with local impls, then a bunch of times with external impls.)

there might be something i can do to make this worth it, so i'll keep this open for now, but i'm not confident.
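
To illustrate why a branch that sees a long run of local impls followed by a long run of external impls can predict well, here is a toy model: a textbook 2-bit saturating counter. This is only a sketch under that assumption. Real CPU branch predictors are far more elaborate, and the function name `mispredictions` and the 500/500 workload are invented for the example.

```rust
/// Toy 2-bit saturating counter, a textbook model of a per-branch predictor.
/// Returns how many outcomes in the sequence the model would mispredict.
fn mispredictions(outcomes: &[bool]) -> usize {
    // Counter values 0..=1 predict "not taken", 2..=3 predict "taken".
    let mut counter: u8 = 0;
    let mut misses = 0;
    for &taken in outcomes {
        let predicted_taken = counter >= 2;
        if predicted_taken != taken {
            misses += 1;
        }
        // Move the counter toward the actual outcome, saturating at 0 and 3.
        counter = if taken {
            (counter + 1).min(3)
        } else {
            counter.saturating_sub(1)
        };
    }
    misses
}

fn main() {
    // Phased: 500 local impls first, then 500 external ones (true = "is local").
    let phased: Vec<bool> = (0..1000).map(|i| i < 500).collect();
    // Interleaved: local and external impls alternate on every call.
    let interleaved: Vec<bool> = (0..1000).map(|i| i % 2 == 0).collect();

    println!("phased mispredictions: {}", mispredictions(&phased));
    println!("interleaved mispredictions: {}", mispredictions(&interleaved));
}
```

In this model the phased sequence is mispredicted only a handful of times (at the start and at the switch-over), while the alternating sequence is mispredicted on half of the calls. That fits the intuition that the removed matches were already cheap for the predictor.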

@bors
Collaborator

bors commented Sep 27, 2025

☔ The latest upstream changes (presumably #138907) made this pull request unmergeable. Please resolve the merge conflicts.

