Description
New summary:
I think we've all agreed by now that we will focus on the ICH (incremental compilation hash) and revisit the question of a SVH (stable version hash, used to detect ABI-incompatible changes) at some future date. The "original summary" (below) goes into some of the discussion.
I've thus repurposed this issue to target addressing the shortcomings of the existing hash for the purposes of incremental compilation. The following checklist is derived from @michaelwoerister's comment:
- incorporate results of name resolution pass, since name resolution is not tracked
- incorporate spans, when needed (Incremental compilation: Be smart about hashing spans #33888)
- ignore nested items, since the use of items is tracked independently
- note that
visit_stmt()
andvisit_decl()
by themselves should not change the hash, since they could just represent a nested item (which we either want to ignore completely or handle in a position independent way).
- note that
- the
CaptureClause
of closures should not be ignored -
walk_generics()
invisit_variant()
andvisit_variant_data()
are redundant; this is already handled byvisit_item()
- explicitly hashing the label names in
ExprLoop
,ExprBreak
, andExprAgain
is redundant,visit_name()
is called for those anyway (as already done forExprWhile
)
There are also optional refinements:
- do not hash the actual names of local variables etc, but just map them to unique identifiers or something like that; seems challenging to get something that is both deterministic, however, and also independent from the name itself.
Original summary:
In #32016, I (ab)used the SVH to serve as a hash for the purposes of incremental compilation. This is simply wrong, because the SVH intentionally omits some details that do not affect the ABI, but which very much affect whether code needs to be recompiled (e.g., the value of a constant). This needs to be fixed, but in one of two ways:
- Create an alternative incremental compilation hash (ICH), keeping the existing consumers of the SVH.
- Convert the SVH (in place) into the ICH, and make the existing consumers use that.
At this point, there are really very few users of the SVH:
- Each crate stores the SVH in its metadata, which the compiler then uses to detect transitive mismatches, where you compile crate B with some version of crate A, but you are now compiling C (which uses B) with a distinct version of crate A in scope.
- Debuginfo uses it for some reason or other.
- There is no third use. (I think)
The second use is unnecessary. The first use is interesting. It's unclear how flexible this check is. But if we used the ICH for this purpose, it would basically be equivalent to the SVH today (the SVH today is looser, as I wrote in the beginning, but effectively any change will cause the SVH to change, with very few exceptions).
Moreover, I claim that even if, someday, we wanted to modify the check to be true ABI compatibility, we probably wouldn't want a single SVH hash. We'd probably want each function hash to include information about the way the argument types are represented, so that we can detect mismatches on a fine-grained level, rather than just "some fn, which you may not even call, changed". But I'm not sure.
One problem with using the ICH everywhere, though, is that we must be careful about endianness. We'd prefer if cross-compiling did not affect the ICH.
I guess I've wasted more time writing this paragraph than it really merits. End of the day, the choice is:
- Keep the SVH around, weird as it is.
- Kill the SVH and just ICH, which should be stricter.