Description
Overview
Currently, the invariants enforced by Ptr
are used strictly as lower bounds. We require code to prove soundness in order to increase the strictness of an invariant, for example to go from any alignment to Aligned
:
Lines 779 to 781 in 1b037c8
...but we permit the opposite direction without restriction:
Line 936 in 1b037c8
For alignment, this is sound, but it is not sound for validity as recently illustrated by #2226.
This issue tracks updating Ptr
's validity modeling to make it fully sound.
Proof of Concept
Currently, we model all invariants as lower bounds. Conceptually, an invariant describes what is known about a Ptr
or its referent at a particular point in the program. Making stronger assertions (e.g., asserting that a Ptr
with Any
validity actually refers to a bit-valid value, and thus upgrading its validity to Valid
) is not sound in the general case. By contrast, making weaker assertions is always sound because it simply entails "forgetting" what is known (e.g., going from Valid
to Any
).
This works for alignment, but does not work for aliasing (see #1889) or validity. Consider, for example, adding two seemingly innocuous methods to Ptr
:
forget_valid
modifies the validity invariant on an existingPtr
to beAny
as_maybe_uninit
turns aPtr<T>
into aPtr<MaybeUninit<T>, (..., ..., Valid)>
(sinceMaybeUninit
has no validity requirements)
Taken together, these can be used to produce UB:
Lines 946 to 974 in a3d4387
Indeed, Miri detects this UB:
test pointer::ptr::test_forget_valid ... error: Undefined Behavior: constructing invalid value: encountered 0x02, but expected a boolean
--> src/pointer/ptr.rs:974:5
|
974 | assert_eq!(b, true);
| ^^^^^^^^^^^^^^^^^^^ constructing invalid value: encountered 0x02, but expected a boolean
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: BACKTRACE on thread `pointer::ptr::test_forget_valid`:
= note: inside `pointer::ptr::test_forget_valid` at ~/.rustup/toolchains/nightly-2024-10-09-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/macros/mod.rs:46:22: 46:31
note: inside closure
--> src/pointer/ptr.rs:945:23
|
944 | #[test]
| ------- in this procedural macro expansion
945 | fn test_forget_valid() {
| ^
= note: this error originates in the macro `assert_eq` which comes from the expansion of the attribute macro `test` (in Nightly builds, run with -Z macro-backtrace for more info)
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace
error: aborting due to 1 previous error; 1 warning emitted
error: test failed, to rerun pass `--lib`
Mental model
As described above, we have historically thought of a ptr: Ptr<T, Valid>
as encoding the knowledge that ptr
's referent is a bit-valid T
. Naively, I should be able to convert this to mu: Ptr<MaybeUninit<T>, Valid>
regardless of t
's validity invariant; after all, all byte sequences are valid instances of MaybeUninit<T>
. However, if we rephrase this in terms of references, that's clearly invalid! Going from &mut T
to &mut MaybeUninit<T>
is unsound, as the latter reference can be used to write arbitrary bytes to a memory location which is referenced as a &mut T
.
This highlights that we need to not only consider the current type (i.e., the T
in Ptr<T>
), but also the types of all references/Ptr
s from which the current Ptr
is reborrowed (its "ancestors", to coin a phrase). If we do this, then we can re-conceptualize the invariant as not only providing knowledge about the current value of the Ptr
's referent, but also a constraint about what values are permitted to be written to the referent. In particular, all values written to the referent must be valid for all ancestor references/Ptr
s.
Note that this only affects validity, but not aliasing or alignment. Aliasing suffers from a similar but different problem, described in #1889. Alignment is a property of a particular Ptr
(and not of its ancestors) and so it really is just a matter of knowledge, not a constraint (in other words, "forgetting" that a Ptr
is aligned can never cause problems).
Thus, we can re-conceptualize the validity invariant in the following way:
- A
Ptr<T, Validity>
has typeT
and validityValidity
- A pair of type and validity,
(T, Validity)
, defines a set of valid values,V
. For example,(bool, Valid)
defines the set{0x00, 0x01}
, and(bool, AsInitialized)
defines the set{0x00, ..., 0x255}
. Note that changing the type and changing the validity can both affectV
. - Code is permitted to assume that only values in
V
will be present in the referent - Code must promise to only write values in
V
to the referent - Code may only change
V
in a way that doesn't violate the preceding two requirements. As mentioned above, this implicates both invariant changes and type transmutations.