Include additional hashes in src/stage0 #142139

Open: 1 commit to merge into base branch master

@erickt
Contributor

erickt commented Jun 7, 2025

This patch changes `bump-stage0` to include:

* The sha256 hash of the channel manifest used to create `src/stage0`.
* The rust and rustfmt git commit in `src/stage0`.
* Hashes of all the artifacts, like the source tarball, in `src/stage0`.

Combined, these changes will allow:

* Projects that bootstrap their own compiler, such as Fuchsia, or users
  of [bootstrap], to build their compilers offline without needing to
  communicate with static.rust-lang.org.

* Auditors to detect if the channel manifest, and all the artifacts
  inside the manifest, were modified after it was used to generate
  `src/stage0`. Furthermore, if they did find modified artifacts, they
  could determine if the Rust Signing Key was compromised by checking if
  any modified file was signed properly.

Finally, it allows regenerating `src/stage0` when both the build date
for rust and the build date for rustfmt are specified, so a maintainer
can regenerate `src/stage0` to verify that nothing changed.

[bootstrap]: https://github.com/dtolnay/bootstrap
[mrustc]: https://github.com/thepowersgang/mrustc
@rustbot
Collaborator

rustbot commented Jun 7, 2025

r? @onur-ozkan

rustbot has assigned @onur-ozkan.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot added the S-waiting-on-review, T-bootstrap, T-infra, and T-release labels Jun 7, 2025
@rustbot
Collaborator

rustbot commented Jun 7, 2025

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@onur-ozkan
Member

r? release

rustbot assigned BoxyUwU and unassigned onur-ozkan Jun 7, 2025
@BoxyUwU
Member

BoxyUwU commented Jun 10, 2025

r? @Mark-Simulacrum

rustbot assigned Mark-Simulacrum and unassigned BoxyUwU Jun 10, 2025
@Mark-Simulacrum
Member

> Projects that bootstrap their own compiler, such as Fuchsia, or users of bootstrap, to build their compilers offline without needing to communicate with static.rust-lang.org.

Can you help me understand how this enables this where it wasn't before? AFAIK, many distros and companies already build the compiler offline -- just providing the stage0 compiler via a pre-step that is online-capable, putting paths to it in our existing bootstrap.toml flags. So what is this actually enabling?

> Auditors to detect if the channel manifest, and all the artifacts inside the manifest, were modified after it was used to generate src/stage0. Furthermore, if they did find modified artifacts, they could determine if the Rust Signing Key was compromised by checking if any modified file was signed properly.

I'm not sure I understand the value of auditing the manifest separately from what's already done. The stage0 file in-tree already verifies all of the downloaded artifacts have stable hashes. We effectively have a trust-on-first-use model here today -- and those checksums are verified by bootstrap.py when downloading from static.

Checking whether the publicly available manifest hasn't changed doesn't seem like it affects rust-lang/rust's build at all. If the goal is an independent auditor of whether static.rust-lang.org artifacts are (or aren't) changing, then that seems fine, but placing that burden on rust-lang/rust's stage0 file doesn't seem right to me.

Can you say more about why this belongs here, especially given that we'd not be verifying the new hashes day to day in CI or similar?
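The trust-on-first-use check described in this comment can be illustrated with a minimal sketch. This is not the code in bootstrap.py; the function names `sha256_hex` and `verify_artifact` and the sample bytes are hypothetical. It only shows the general idea of comparing downloaded bytes against a hash recorded once and pinned in the tree.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_sha256: str) -> bool:
    """Compare downloaded bytes against the hash pinned in the tree."""
    return sha256_hex(data) == pinned_sha256.strip().lower()


# Hypothetical artifact contents standing in for a downloaded tarball.
artifact = b"pretend this is a stage0 tarball"
pinned = sha256_hex(artifact)  # recorded once, at "first use"

ok = verify_artifact(artifact, pinned)               # same bytes as pinned
tampered = verify_artifact(artifact + b"!", pinned)  # bytes changed later
```

Under this model, any later change to the artifact makes the comparison fail, which is the property the in-tree checksums already provide for downloads.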

Mark-Simulacrum added the S-waiting-on-author label and removed the S-waiting-on-review label Jun 12, 2025
@erickt
Contributor Author

erickt commented Jun 13, 2025

Hello Mark! Inlined below.

> Can you help me understand how this enables this where it wasn't before? AFAIK, many distros and companies already build the compiler offline -- just providing the stage0 compiler via a pre-step that is online-capable, putting paths to it in our existing bootstrap.toml flags. So what is this actually enabling?

Sure thing. This comes from our experience with how Fuchsia automatically updates its toolchain. We track both LLVM and Rust tip-of-tree together so we can take advantage of rapid deployment of features and bug fixes, as well as cross-language LTO, PGO, sanitizers, etc. This exposes us to a variety of issues, such as bugs on our side, multiple stage0 updates for a single release (such as with #140887 and #141647), or breakages in LLVM or Rust, any of which could leave us unable to build Rust for days or weeks. So we've automated our toolchain compilation to:

  1. Check out some version of Rust.
  2. Parse src/stage0 to find the stage0 date and version.
  3. Download the channel manifest from static.rust-lang.org and extract the stage0 git commit.
  4. Check whether we've already built the stage0 git commit. If so, build Rust. Otherwise, recurse into step 1 for that commit, then build Rust.
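The steps above can be sketched roughly as follows. This is an illustration only, not Fuchsia's actual automation: `plan_builds` and its parameters are hypothetical names, the `stage0_commit_of` callback stands in for steps 2 and 3 (parsing src/stage0 and resolving the stage0 git commit from the channel manifest), and the recursion is written as a loop.

```python
from typing import Callable, List, Set


def plan_builds(
    commit: str,
    stage0_commit_of: Callable[[str], str],
    already_built: Set[str],
) -> List[str]:
    """Return the commits to build, oldest first, ending with `commit`."""
    chain = [commit]
    seen = {commit}
    current = stage0_commit_of(commit)
    # Walk back through stage0 commits until we reach one we already built.
    while current not in already_built:
        if current in seen:
            raise ValueError(f"cycle in stage0 chain at {current}")
        chain.append(current)
        seen.add(current)
        current = stage0_commit_of(current)
    chain.reverse()
    return chain


# Hypothetical commit names; each maps to the commit of its stage0 compiler.
stage0_map = {"tip": "beta2", "beta2": "beta1", "beta1": "stable0"}
plan = plan_builds("tip", stage0_map.__getitem__, {"stable0"})
short_plan = plan_builds("tip", stage0_map.__getitem__, {"beta2"})
```

Here `plan` lists every missing ancestor before the requested commit, while `short_plan` contains only the requested commit because its stage0 is already available.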

This also has the nice property that it'd be easy to re-bootstrap our toolchain if we discover any issues with our build process, such as a Reflections on Trusting Trust attack. To make this process more secure, it'd be nice if we didn't have to communicate with static.rust-lang.org to find the older stage0 commit.

The other option I considered was that, instead of downloading the channel manifest, we could read src/stage0 and then look up the git commit for the corresponding version tag. But that would only work for old builds, not for unreleased versions. For those we'd have to use the beta branch instead, and then the build wouldn't be reproducible.

> I'm not sure I understand the value of auditing the manifest separately from what's already done. The stage0 file in-tree already verifies all of the downloaded artifacts have stable hashes. We effectively have a trust-on-first-use model here today -- and those checksums are verified by bootstrap.py when downloading from static.
>
> Checking whether the publicly available manifest hasn't changed doesn't seem like it affects rust-lang/rust's build at all. If the goal is an independent auditor of whether static.rust-lang.org artifacts are (or aren't) changing, then that seems fine, but placing that burden on rust-lang/rust's stage0 file doesn't seem right to me.

I agree that bootstrap.py is robust and protects against changes when compiling Rust. This is mainly about supporting projects that bootstrap their own Rust toolchain. For Fuchsia, we mainly just need the git commit. I could see other bootstrappers wanting to work with stage0 source tarballs though, so I wanted to get those hashes somewhere as well. I added the rest for thoroughness.

As for whether this should live in src/stage0, I'd be fine moving it elsewhere if you'd be receptive to that. Maybe we could either download the channel manifests into a src/stage0-history directory, or create a separate file like src/stage0-history.toml that'd look something like:

[[stage0]]
channel_manifest_hash = "abcd..."
git_commit = "bcde..."
...

[[stage0]]
channel_manifest_hash = "1234..."
git_commit = "2345..."
...

This would also have some side benefits:

  • If there were a known bad stage0, we could just remove it from src/stage0-history.toml, rather than each bootstrapper having to implement custom logic to work around a bad version.
  • It could help with disaster recovery if all the data backing static.rust-lang.org ever got deleted.

I'd be happy to make any other changes or discuss further if you'd like.
