Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid string validation in rustc_serialize, check a marker byte instead #91407

Merged
merged 1 commit into from
Dec 8, 2021

Conversation

the8472
Copy link
Member

@the8472 the8472 commented Nov 30, 2021

Since the serialization format isn't self-describing we need a way to detect when encoder and decoder don't match up. But for strings it doesn't have to be utf8 validation, which currently does cost a few percent of performance.
Instead we can use a marker byte at the end to be reasonably sure that we're dealing with a string and it wasn't overwritten in some way.

@the8472
Copy link
Member Author

the8472 commented Nov 30, 2021

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 30, 2021
@bors
Copy link
Contributor

bors commented Nov 30, 2021

⌛ Trying commit 5bd72a00734a32ba84c4e23330deed42eb5891fa with merge ae738380b413f7adb5c0254e2c1dc1430fb4e8dd...

@bors
Copy link
Contributor

bors commented Dec 1, 2021

☀️ Try build successful - checks-actions
Build commit: ae738380b413f7adb5c0254e2c1dc1430fb4e8dd (ae738380b413f7adb5c0254e2c1dc1430fb4e8dd)

@rust-timer
Copy link
Collaborator

Queued ae738380b413f7adb5c0254e2c1dc1430fb4e8dd with parent 207c80f, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (ae738380b413f7adb5c0254e2c1dc1430fb4e8dd): comparison url.

Summary: This change led to large relevant mixed results 🤷 in compiler performance.

  • Large improvement in instruction counts (up to -3.0% on full builds of helloworld)
  • Large regression in instruction counts (up to 4.0% on incr-unchanged builds of deep-vector)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

@rustbot rustbot added perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 1, 2021
@the8472
Copy link
Member Author

the8472 commented Dec 5, 2021

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 5, 2021
@bors
Copy link
Contributor

bors commented Dec 5, 2021

⌛ Trying commit 20daf6d16f9431fde26c0587836867402aa71017 with merge cc72756a7ee0284c9fa1dd21416a46ecd8f46455...

@bors
Copy link
Contributor

bors commented Dec 5, 2021

☀️ Try build successful - checks-actions
Build commit: cc72756a7ee0284c9fa1dd21416a46ecd8f46455 (cc72756a7ee0284c9fa1dd21416a46ecd8f46455)

@rust-timer
Copy link
Collaborator

Queued cc72756a7ee0284c9fa1dd21416a46ecd8f46455 with parent 772d51f, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (cc72756a7ee0284c9fa1dd21416a46ecd8f46455): comparison url.

Summary: This change led to very large relevant mixed results 🤷 in compiler performance.

  • Large improvement in instruction counts (up to -2.8% on full builds of helloworld)
  • Very large regression in instruction counts (up to 5.0% on incr-unchanged builds of inflate)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 6, 2021
@the8472
Copy link
Member Author

the8472 commented Dec 6, 2021

incr-unchanged inflate is most likely incremental verification noise, so it's mostly a win. Adding the sentinel byte makes the win a little smaller compared to just skipping the utf8 validation without replacement.

r? rust-lang/compiler

@the8472 the8472 marked this pull request as ready for review December 6, 2021 10:11
@the8472 the8472 added the perf-regression-triaged The performance regression has been triaged. label Dec 6, 2021
@the8472 the8472 changed the title perf test: check if utf8 validation of serialization is significant Avoid string validation in rustc_serialize, check a marker byte instead Dec 6, 2021
Copy link
Member

@michaelwoerister michaelwoerister left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great find -- thanks, @the8472!

I left a couple of comments below.

compiler/rustc_serialize/src/opaque.rs Outdated Show resolved Hide resolved
compiler/rustc_serialize/src/opaque.rs Show resolved Hide resolved
since the serialization format isn't self-describing we need a way to detect
when encoder and decoder don't match up. but that doesn't have to
be utf8 validation for strings, which does cost a few % of performance.
Instead we can use a marker byte at the end to be reasonably
sure that we're dealing with a string and it wasn't overwritten in some
way.
@the8472 the8472 added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Dec 6, 2021
@michaelwoerister
Copy link
Member

Very nice :) Thanks again, @the8472!
@bors r+

The perf run shows a single regression in incr-unchanged debug builds of inflate -- but it is very unlikely that the change here could cause such a regression, especially while at same time speeding up incr-unchanged check and incr-unchanged opt builds for the same crate. I suspect it is unrelated noise. But even if the regression were real, it is definitely outweighed by the large improvements in so many other crates.

@rustbot label: +perf-regression-triaged

@bors
Copy link
Contributor

bors commented Dec 7, 2021

📌 Commit c640f31 has been approved by michaelwoerister

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 7, 2021
@bors
Copy link
Contributor

bors commented Dec 7, 2021

⌛ Testing commit c640f31 with merge 477fd70...

@bors
Copy link
Contributor

bors commented Dec 8, 2021

☀️ Test successful - checks-actions
Approved by: michaelwoerister
Pushing 477fd70 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 8, 2021
@bors bors merged commit 477fd70 into rust-lang:master Dec 8, 2021
@rustbot rustbot added this to the 1.59.0 milestone Dec 8, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (477fd70): comparison url.

Summary: This change led to large relevant improvements 🎉 in compiler performance.

  • Large improvement in instruction counts (up to -2.9% on full builds of helloworld)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

@rustbot rustbot removed the perf-regression Performance regression. label Dec 8, 2021
the8472 added a commit to the8472/rust that referenced this pull request Dec 9, 2021
rust-lang#91407 changed the serialization format which leads to ICEs for nightly users such as rust-lang#91663 and linked issue.
Bumping the metadata version should lead to the cached files being discarded instead.
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 11, 2021
…Mark-Simulacrum

Bump rmeta version to fix rustc_serialize ICE

rust-lang#91407 changed the serialization format which leads to ICEs for nightly users such as rust-lang#91663 and linked issues. The issue can be solved by running `cargo clean`. But bumping the metadata version should lead to the cached files being discarded, avoiding the issue entirely.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants