Skip to content

Overhaul Ptr's validity invariant modeling #1866

Closed
@joshlf

Description

@joshlf

Overview

Currently, the invariants enforced by Ptr are used strictly as lower bounds. We require code to prove soundness in order to increase the strictness of an invariant, for example to go from any alignment to Aligned:

zerocopy/src/pointer/ptr.rs

Lines 779 to 781 in 1b037c8

pub(crate) unsafe fn assume_alignment<A: Alignment>(
self,
) -> Ptr<'a, T, (I::Aliasing, A, I::Validity)> {

...but we permit the opposite direction without restriction:

pub fn forget_aligned(self) -> Ptr<'a, T, (I::Aliasing, Any, I::Validity)> {

For alignment, this is sound, but it is not sound for validity as recently illustrated by #2226.

This issue tracks updating Ptr's validity modeling to make it fully sound.

Proof of Concept

Currently, we model all invariants as lower bounds. Conceptually, an invariant describes what is known about a Ptr or its referent at a particular point in the program. Making stronger assertions (e.g., asserting that a Ptr with Any validity actually refers to a bit-valid value, and thus upgrading its validity to Valid) is not sound in the general case. By contrast, making weaker assertions is always sound because it simply entails "forgetting" what is known (e.g., going from Valid to Any).

This works for alignment, but does not work for aliasing (see #1889) or validity. Consider, for example, adding two seemingly innocuous methods to Ptr:

  • forget_valid modifies the validity invariant on an existing Ptr to be Any
  • as_maybe_uninit turns a Ptr<T> into a Ptr<MaybeUninit<T>, (..., ..., Valid)> (since MaybeUninit has no validity requirements)

Taken together, these can be used to produce UB:

zerocopy/src/pointer/ptr.rs

Lines 946 to 974 in a3d4387

use core::mem::MaybeUninit;
impl<'a, T, I: Invariants> Ptr<'a, T, I> {
/// Forgets that `self`'s referent is a bit-valid `T`.
fn forget_valid(self) -> Ptr<'a, T, (I::Aliasing, I::Alignment, Any)> {
// SAFETY: `Any` is less restrictive than any validity invariant.
unsafe { self.assume_invariants() }
}
fn as_maybe_uninit(self) -> Ptr<'a, MaybeUninit<T>, (I::Aliasing, I::Alignment, Valid)> {
// SAFETY: `MaybeUninit<T>` has the same size and `UnsafeCell`s as
// `T`. This is a provenance-preserving cast.
let ptr = unsafe { self.cast_unsized(|ptr| ptr.cast::<MaybeUninit<T>>()) };
// SAFETY: `MaybeUninit<T>` has the same alignment as `T`.
let ptr = unsafe { ptr.assume_alignment::<I::Alignment>() };
// SAFETY: `MaybeUninit<T>` has no validity requirements.
unsafe { ptr.assume_valid() }
}
}
let mut b = true;
let ptr = Ptr::from_mut(&mut b);
let ptr = ptr.forget_valid();
let ptr = ptr.as_maybe_uninit();
let mu = ptr.as_mut();
*mu = crate::transmute!(2u8);
assert_eq!(b, true);

Indeed, Miri detects this UB:

test pointer::ptr::test_forget_valid ... error: Undefined Behavior: constructing invalid value: encountered 0x02, but expected a boolean
   --> src/pointer/ptr.rs:974:5
    |
974 |     assert_eq!(b, true);
    |     ^^^^^^^^^^^^^^^^^^^ constructing invalid value: encountered 0x02, but expected a boolean
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: BACKTRACE on thread `pointer::ptr::test_forget_valid`:
    = note: inside `pointer::ptr::test_forget_valid` at ~/.rustup/toolchains/nightly-2024-10-09-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/macros/mod.rs:46:22: 46:31
note: inside closure
   --> src/pointer/ptr.rs:945:23
    |
944 | #[test]
    | ------- in this procedural macro expansion
945 | fn test_forget_valid() {
    |                       ^
    = note: this error originates in the macro `assert_eq` which comes from the expansion of the attribute macro `test` (in Nightly builds, run with -Z macro-backtrace for more info)

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to 1 previous error; 1 warning emitted

error: test failed, to rerun pass `--lib`

Mental model

As described above, we have historically thought of a ptr: Ptr<T, Valid> as encoding the knowledge that ptr's referent is a bit-valid T. Naively, I should be able to convert this to mu: Ptr<MaybeUninit<T>, Valid> regardless of t's validity invariant; after all, all byte sequences are valid instances of MaybeUninit<T>. However, if we rephrase this in terms of references, that's clearly invalid! Going from &mut T to &mut MaybeUninit<T> is unsound, as the latter reference can be used to write arbitrary bytes to a memory location which is referenced as a &mut T.

This highlights that we need to not only consider the current type (i.e., the T in Ptr<T>), but also the types of all references/Ptrs from which the current Ptr is reborrowed (its "ancestors", to coin a phrase). If we do this, then we can re-conceptualize the invariant as not only providing knowledge about the current value of the Ptr's referent, but also a constraint about what values are permitted to be written to the referent. In particular, all values written to the referent must be valid for all ancestor references/Ptrs.

Note that this only affects validity, but not aliasing or alignment. Aliasing suffers from a similar but different problem, described in #1889. Alignment is a property of a particular Ptr (and not of its ancestors) and so it really is just a matter of knowledge, not a constraint (in other words, "forgetting" that a Ptr is aligned can never cause problems).

Thus, we can re-conceptualize the validity invariant in the following way:

  • A Ptr<T, Validity> has type T and validity Validity
  • A pair of type and validity, (T, Validity), defines a set of valid values, V. For example, (bool, Valid) defines the set {0x00, 0x01}, and (bool, AsInitialized) defines the set {0x00, ..., 0x255}. Note that changing the type and changing the validity can both affect V.
  • Code is permitted to assume that only values in V will be present in the referent
  • Code must promise to only write values in V to the referent
  • Code may only change V in a way that doesn't violate the preceding two requirements. As mentioned above, this implicates both invariant changes and type transmutations.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions