Description
@Kobzol and I have been discussing what the stage numbers in bootstrap should be "from first principles" (Zulip thread).
Here's a little diagram showing the core dependency loop: (miri is a stand-in for any tool depending on rustc crates, e.g. clippy, RA)
The "build" edge means "compiler is used to build that crate", a "dep" edge means "crate A is needed to build crate B". Miri of course also has a "dep" edge from the library, but I omitted that as it exists transitively.
Bootstrap needs to cut this loop, and since the recent stage redesign it does this as follows:
This is how `./x build` already behaves today: `build $thing --stage $N` will build "$thing $N" from that diagram.
We can annotate the original diagram with staging offsets on the arrows to cut out the redundancy:
However, it turns out `./x check compiler` currently does not follow the same staging system as `./x build`. Running `./x check compiler --stage 1` will in fact do the equivalent of `./x build compiler --stage 2`! However, `./x check library --stage 1` does match `./x build library --stage 1` as expected. `miri` behaves like `compiler`, i.e., it is also off-by-one in `check`.
I don't know why `check` is off-by-1 from `build` for everything except for `library`, but it'd be great to make this consistent. :) Checking the stage 1 compiler should be using the stage 0 (beta) compiler to check the in-tree sources.
Cc @rust-lang/bootstrap
Bonus content
Here's a version of the above diagram with all "build" edges in it, with transitive "dep" edges made explicit, and with cargo added as a stand-in for a tool that does not depend on compiler crates.
Key invariants:
- Every node has exactly one incoming "build" edge, which always comes from `compiler`.
- Every loop must have at least one "+1" on an edge.
- If we have "compiler --build--> A" and "compiler --build--> B --dep--> A", then the sum of the stage bumps along both paths must be the same. This ensures a crate and its dependencies get built with the same compiler.
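To make the three invariants concrete, here is a minimal sketch that checks them on a small edge list. The offsets in `EDGES` are inferred from the prose of this issue (the diagram itself is not reproduced here), so treat them as an assumption; `EDGES` and `check_invariants` are hypothetical names for illustration and are not part of bootstrap. Only `compiler`, `library` and `cargo` are modeled.

```python
from collections import defaultdict

# (source, target, kind, stage_offset).
# Assumption: offsets are inferred from the text, not copied from the diagram,
# and are taken to be non-negative.
EDGES = [
    ("compiler", "compiler", "build", 1),
    ("compiler", "library", "build", 0),
    ("library", "compiler", "dep", 1),
    ("compiler", "cargo", "build", 1),
    ("library", "cargo", "dep", 1),
]


def check_invariants(edges):
    """Return a list of human-readable violations (empty if all three hold)."""
    errors = []
    nodes = sorted({n for s, t, _, _ in edges for n in (s, t)})

    # Invariant 1: exactly one incoming "build" edge, and it comes from compiler.
    build_offset = {}
    for node in nodes:
        builders = [(s, off) for s, t, kind, off in edges
                    if t == node and kind == "build"]
        if len(builders) == 1 and builders[0][0] == "compiler":
            build_offset[node] = builders[0][1]
        else:
            errors.append(f"{node}: needs exactly one build edge from compiler")

    # Invariant 2: every loop has a "+1", i.e. the offset-0 edges
    # must not form a cycle on their own.
    zero = defaultdict(list)
    for s, t, _, off in edges:
        if off == 0:
            zero[s].append(t)

    def reachable(start):
        # All nodes reachable from `start` via one or more offset-0 edges.
        seen, todo = set(), list(zero[start])
        while todo:
            n = todo.pop()
            if n not in seen:
                seen.add(n)
                todo.extend(zero[n])
        return seen

    for node in nodes:
        if node in reachable(node):
            errors.append(f"{node}: lies on a loop without a +1")

    # Invariant 3: for "B --dep--> A", the bump on "compiler --build--> A" must
    # equal the bump on "compiler --build--> B" plus the bump on the dep edge.
    for s, t, kind, off in edges:
        if kind == "dep" and s in build_offset and t in build_offset:
            if build_offset[t] != build_offset[s] + off:
                errors.append(f"{s} -> {t}: built by different compiler stages")

    return errors
```

With these offsets `check_invariants(EDGES)` reports no violations. Under this model, moving the "+1" in the loop to the other edge, or dropping both "+1"s on cargo's incoming edges, also passes, while putting 0 on both loop edges is rejected.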
I strongly feel that we want the process to be described by some form of diagram like that, to make it all consistent. The degrees of freedom, given the above invariants, are few:
- In the one loop of the diagram, we could put the "+1" on the other edge. But then the sequence of steps that bootstrap would perform for a stage N compiler build would be: compiler 1, library 2, compiler 2, ... There would be no "library 1". That seems very strange.
- "cargo" could avoid the "+1" on both of its incoming edges. But that would mean a stage 1 cargo would be built with stage 1 rustc, i.e. `build --stage 1 cargo` would have to first do a full rustc build (or download the prebuilt binary). That seems entirely unnecessary; I see no benefits from doing this.
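As a rough illustration of the step sequences discussed in these bullets, the sketch below derives the build order for the staging scheme this issue proposes. It assumes that a node at stage N needs each of its sources at stage N minus the edge's offset, and that anything at stage 0 is the downloaded beta rustc+std. `INCOMING` and `build_plan` are hypothetical names, and the offsets are inferred from the text rather than taken from bootstrap's code.

```python
# Incoming edges per node as (source, stage_offset), merging "build" and "dep".
# Assumption: offsets are inferred from the prose of this issue.
INCOMING = {
    "compiler": [("compiler", 1), ("library", 1)],
    "library": [("compiler", 0)],
    "cargo": [("compiler", 1), ("library", 1)],
}


def build_plan(target, stage, incoming=INCOMING):
    """Return the ordered list of (node, stage) steps needed for `target`."""
    if stage < 1:
        raise ValueError("stage must be at least 1 in this model")
    plan = []

    def visit(node, n):
        # Stage 0 is downloaded, and each step is scheduled only once.
        if n == 0 or (node, n) in plan:
            return
        for source, offset in incoming[node]:
            visit(source, n - offset)
        plan.append((node, n))

    visit(target, stage)
    return plan
```

In this model, `build_plan("compiler", 2)` yields compiler 1, library 1, compiler 2, whereas `build_plan("cargo", 1)` contains only the cargo step, since both of its inputs come from the downloaded stage 0 toolchain.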
Given this design I think we can set stage 1 to be the default stage for almost every command (and treat the remaining commands where that does not behave well as bugs). There's no reason to have different default stages for test vs build vs check, or for different profiles, which hopefully means a bunch of complexity can be removed from bootstrap. :)
FWIW, `--stage 0` does not really ever make sense as a concept in this world. IMO that should just always be an error, but @Kobzol suggested that'd break too many things for tools that don't depend on rustc, so maybe it should just be an alias for `--stage 1` (with a warning that we could eventually make an error). We could also shift everything by 1 so that stage 0 becomes the first stage bootstrap builds, but then we'd need to consider the downloaded rustc+std to be "stage -1" and I don't think we want to go there.