Description
For a while it has been understood that the mem::uninitialized API is broken. Originally, the intuitive understanding of this API was that it produced a fixed, arbitrary value. However (as extensively discussed elsewhere), uninitialized memory is not a “fixed, arbitrary value”, and for nearly all types in Rust it is instantaneous undefined behavior for a value to be uninitialized.
What’s worse, even initialized values can be insta-UB. Rust uses its understanding of valid bit patterns to perform layout optimizations whereby invalid values can be repurposed as enum tags, which is how Option<&T> can be only a single word. Thus even mem::uninitialized’s sibling mem::zeroed is insta-UB when used with types like &T.
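The layout optimization described above can be observed directly by comparing sizes. The following is a small sketch; the helper name option_is_free is made up for illustration and is not part of the standard library.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical helper: true when wrapping `T` in `Option` costs no extra
/// space, i.e. the compiler found an invalid bit pattern of `T` to reuse
/// as the `None` tag.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // References can never be null, so the null pattern encodes `None`.
    assert!(option_is_free::<&u8>());
    // `NonZeroU32` forbids zero, so zero encodes `None`.
    assert!(option_is_free::<NonZeroU32>());
    // Every bit pattern of `u32` is valid, so a separate tag is needed.
    assert!(!option_is_free::<u32>());
}
```

The same invalid patterns that make Option free for &u8 and NonZeroU32 are the ones that mem::zeroed would produce for those types.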
As a result, mem::uninitialized was deprecated and replaced with mem::MaybeUninit, which avoids the problems of the former. In addition, both mem::zeroed and mem::uninitialized were altered such that they will attempt to detect (and panic) when used on certain types: the former on types that must not be zero, and the latter on any types with invalid (defined) values.
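For readers unfamiliar with the replacement API, here is a minimal sketch of the MaybeUninit pattern. The function name make_pair and the values written are illustrative only.

```rust
use std::mem::MaybeUninit;

/// Illustrative only: reserves storage without initializing it, writes a
/// value, and only then treats the storage as a `(u32, bool)`.
fn make_pair() -> (u32, bool) {
    // No `(u32, bool)` value exists yet; the wrapper is allowed to hold
    // uninitialized bytes, so nothing invalid has been created.
    let mut slot = MaybeUninit::<(u32, bool)>::uninit();

    // Initialize the storage in place.
    slot.write((7, true));

    // SAFETY: `slot` was fully initialized by the `write` call above.
    unsafe { slot.assume_init() }
}

fn main() {
    assert_eq!(make_pair(), (7, true));
}
```

The key difference from mem::uninitialized is that the possibly-uninitialized state lives behind a wrapper type, and a plain value of the inner type is produced only after initialization.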
However, implementing these panic checks caused a great deal of breakage (which arguably is desirable for safety, although still extremely disruptive), and to reduce disruption the check is conservative instead of exhaustive (#66151). Unfortunately, while improving the coverage of these checks will still leave mem::zeroed perfectly usable, mem::uninitialized will be rendered all but unusable, as essentially no type can ever be in an uninitialized state.
This is a problem for legacy crates that were never migrated away from mem::uninitialized. However, there is a solution that allows these legacy crates to compile while also avoiding the problem of invalid uninitialized values: mem::uninitialized can initialize with a valid value. This may seem contrary to the original intent of the API, but consider that the only reason to avoid initialization is performance, and that the choice is now between “my code doesn’t compile”, “my code contains undefined behavior”, and “my code is slower”; the last is the most desirable outcome of the three.
This raises the question: what value to initialize with? PR #87032 proposed the simplest option, which was to replace the innards of mem::uninitialized with mem::zeroed. However, zero is the value that is most often used for niche optimizations, so this would still reject a lot of code.
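To illustrate why zero is a poor universal filler, the sketch below inspects the bits of Option<NonZeroU32>. The function names none_bits and some_bits are made up for this example.

```rust
use std::mem;
use std::num::NonZeroU32;

/// Illustrative only: the raw bits the compiler uses for `None`.
fn none_bits() -> u32 {
    let none: Option<NonZeroU32> = None;
    // SAFETY: both types are four bytes with no padding, and every bit
    // pattern is a valid `u32`.
    unsafe { mem::transmute::<Option<NonZeroU32>, u32>(none) }
}

/// Illustrative only: the raw bits used for `Some(n)`.
fn some_bits(n: NonZeroU32) -> u32 {
    // SAFETY: same reasoning as in `none_bits`.
    unsafe { mem::transmute::<Option<NonZeroU32>, u32>(Some(n)) }
}

fn main() {
    let five = NonZeroU32::new(5).expect("5 is nonzero");
    // `Some(5)` is stored as plain 5, so zero is the only pattern left
    // over to represent `None`.
    assert_eq!(some_bits(five), 5);
    assert_eq!(none_bits(), 0);
}
```

Because zero is already claimed as the None tag, an all-zero NonZeroU32 is not a valid value, and a zero-filling mem::uninitialized would still have to reject that type.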
But there is a more desirable alternative. Because Rust understands what values are invalid for a type (it must, in order to perform niche optimizations), it should also understand which values are valid for a type. An intrinsic could be added to the compiler which, given a type, produces an arbitrary valid value of that type. This intrinsic could be used within mem::uninitialized, and the existing panic check could be removed. This would allow all code in the wild still using mem::uninitialized to compile, and would also avoid all insta-UB related to validity invariants.
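As a rough user-space approximation of the idea, the sketch below models the proposed intrinsic as a trait. Everything here is hypothetical: the trait ArbitraryValid, its method, and the function uninitialized_compat are invented for illustration. A real intrinsic would live in the compiler and cover every type automatically, whereas this sketch only covers the few types it lists by hand.

```rust
use std::num::NonZeroU32;

/// HYPOTHETICAL stand-in for the proposed compiler intrinsic: yields some
/// valid value of the type, with no promise about which one.
trait ArbitraryValid: Sized {
    fn arbitrary_valid() -> Self;
}

impl ArbitraryValid for u32 {
    fn arbitrary_valid() -> Self {
        0 // every bit pattern is valid, so zero is fine here
    }
}

impl ArbitraryValid for bool {
    fn arbitrary_valid() -> Self {
        false
    }
}

impl ArbitraryValid for NonZeroU32 {
    fn arbitrary_valid() -> Self {
        // Zero is the niche for this type, so pick a non-zero value.
        NonZeroU32::new(1).expect("1 is nonzero")
    }
}

impl ArbitraryValid for &'static str {
    fn arbitrary_valid() -> Self {
        "" // a non-null, well-formed reference
    }
}

/// HYPOTHETICAL replacement body for `mem::uninitialized`: slower than
/// leaving memory untouched, but it never produces an invalid value.
fn uninitialized_compat<T: ArbitraryValid>() -> T {
    T::arbitrary_valid()
}

fn main() {
    let n: NonZeroU32 = uninitialized_compat();
    let s: &'static str = uninitialized_compat();
    assert_eq!(n.get(), 1);
    assert!(s.is_empty());
}
```

Callers must not rely on the particular values chosen here; the point is only that each result satisfies its type's validity invariant, so no panic check is needed.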