Design for unsafe fields polyfill #1931

Open

Sometimes, struct (or enum or union) fields have safety invariants:

struct EvenUsize {
    // INVARIANT: `n` is even.
    n: usize,
}

However, Rust has no mechanism to require that reads from or writes to such fields happen inside an unsafe block, the way calling an unsafe function requires an unsafe block or implementing an unsafe trait requires an unsafe impl. I propose that we support this behavior outside the language, allowing a user to write:

struct EvenUsize {
    // INVARIANT: `n` is even.
    #[unsafe]
    n: usize,
}

The #[unsafe] attribute (implemented as a proc macro attribute) modifies the type of n to something like Unsafe<usize>, although as we'll see in a moment, it needs to be a tad more complex than that.

Design take 1: Unsafe<T>

Design take 1 is currently prototyped in #1929.

Let's start with the basic design, though:

#[repr(transparent)]
pub struct Unsafe<T>(T);

An Unsafe<T> is a type whose constructors and accessors are all unsafe to call. We can't prevent safe code from moving an Unsafe<T> around, but we can prevent it from reading or modifying the value inside. Thus, for example, we might imagine the following constructor for EvenUsize, calling the unsafe Unsafe::new constructor:

impl EvenUsize {
    /// Constructs a new `EvenUsize`.
    ///
    /// Returns `None` if `n` is odd.
    pub fn new(n: usize) -> Option<EvenUsize> {
        if n % 2 != 0 {
            return None;
        }
        // SAFETY: We just confirmed that `n` is even.
        let n = unsafe { Unsafe::new(n) };
        Some(EvenUsize { n })
    }
}
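To make the shape of the wrapper concrete, here is a minimal sketch of what Unsafe<T> and a reader on EvenUsize might look like. Only Unsafe::new appears above; the get and get_mut accessors and the half method are hypothetical names chosen for illustration, and the prototype in #1929 may differ.

```rust
/// Sketch of the wrapper: every way in or out of the value is `unsafe`.
#[repr(transparent)]
pub struct Unsafe<T>(T);

impl<T> Unsafe<T> {
    /// # Safety
    /// `value` must uphold the invariant documented on the field.
    pub unsafe fn new(value: T) -> Self {
        Unsafe(value)
    }

    /// Hypothetical accessor (name is an assumption).
    ///
    /// # Safety
    /// The caller takes responsibility for how the value is used.
    pub unsafe fn get(&self) -> &T {
        &self.0
    }

    /// Hypothetical accessor (name is an assumption).
    ///
    /// # Safety
    /// The invariant must hold again once the borrow ends.
    pub unsafe fn get_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

pub struct EvenUsize {
    // INVARIANT: `n` is even.
    n: Unsafe<usize>,
}

impl EvenUsize {
    /// Returns `None` if `n` is odd.
    pub fn new(n: usize) -> Option<EvenUsize> {
        if n % 2 != 0 {
            return None;
        }
        // SAFETY: We just confirmed that `n` is even.
        Some(EvenUsize { n: unsafe { Unsafe::new(n) } })
    }

    /// Hypothetical reader: even a read has to go through `unsafe`.
    pub fn half(&self) -> usize {
        // SAFETY: We only read the value; the invariant is untouched.
        let n = unsafe { *self.n.get() };
        n / 2
    }
}
```

With this shape, a plain `self.n + 1` no longer type-checks, so every use of the field has to pass through an unsafe block with its own SAFETY comment.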

While this gets us a step in the right direction, it has a soundness hole: nothing stops safe code from swapping the Unsafe<usize> fields of two different types:

struct OddUsize {
    // INVARIANT: `n` is odd.
    #[unsafe]
    n: usize,
}

Safe code with mutable access to both an EvenUsize and an OddUsize could swap their inner Unsafe<usize>s (for example, with core::mem::swap), leaving both invariants violated without a single unsafe block.
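The hole can be reproduced in a few lines. This is a sketch under the assumption that the wrapper has the unsafe new and (hypothetical) get methods, and that the swapping code lives in the same module as the two structs, so it can name the fields.

```rust
use core::mem;

#[repr(transparent)]
pub struct Unsafe<T>(T);

impl<T> Unsafe<T> {
    /// # Safety
    /// `value` must uphold the invariant documented on the field.
    pub unsafe fn new(value: T) -> Self {
        Unsafe(value)
    }

    /// Hypothetical accessor (name is an assumption).
    ///
    /// # Safety
    /// The caller takes responsibility for how the value is used.
    pub unsafe fn get(&self) -> &T {
        &self.0
    }
}

pub struct EvenUsize {
    // INVARIANT: `n` is even.
    n: Unsafe<usize>,
}

pub struct OddUsize {
    // INVARIANT: `n` is odd.
    n: Unsafe<usize>,
}

/// Entirely safe code: both fields have type `Unsafe<usize>`, so the swap
/// type-checks and silently breaks both invariants.
pub fn break_invariants(even: &mut EvenUsize, odd: &mut OddUsize) {
    mem::swap(&mut even.n, &mut odd.n);
}
```

After break_invariants runs, the EvenUsize holds an odd value and the OddUsize holds an even one, even though no unsafe block was written.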

Design take 2: Unsafe<T, F>

To prevent this from happening, we can instead give Unsafe a second type parameter. In the definition below, T names the type that contains the field and F is the type of the field itself:

#[repr(transparent)]
pub struct Unsafe<T, F>(PhantomData<T>, F);

The #[unsafe] attribute would then modify the code to something like:

struct OddUsize {
    n: Unsafe<OddUsize, usize>,
}

This prevents swapping fields between different types, but it doesn't prevent swapping between two fields of the same struct:

struct Pair {
    // INVARIANT: `m` is even.
    #[unsafe]
    m: usize,

    // INVARIANT: `n` is odd.
    #[unsafe]
    n: usize,
}

m and n both have type Unsafe<Pair, usize>, so safe code can still swap them.
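Both halves of this can be sketched together: the extra parameter makes the wrappers in EvenUsize and OddUsize distinct types, while the two fields of Pair still share one. The get accessor and the swap_fields helper are hypothetical names used only for illustration.

```rust
use core::marker::PhantomData;
use core::mem;

/// `T` is the containing type, `F` is the field's own type.
#[repr(transparent)]
pub struct Unsafe<T, F>(PhantomData<T>, F);

impl<T, F> Unsafe<T, F> {
    /// # Safety
    /// `field` must uphold the invariant documented on the field.
    pub unsafe fn new(field: F) -> Self {
        Unsafe(PhantomData, field)
    }

    /// Hypothetical accessor (name is an assumption).
    ///
    /// # Safety
    /// The caller takes responsibility for how the value is used.
    pub unsafe fn get(&self) -> &F {
        &self.1
    }
}

pub struct EvenUsize {
    // INVARIANT: `n` is even.
    n: Unsafe<EvenUsize, usize>,
}

pub struct OddUsize {
    // INVARIANT: `n` is odd.
    n: Unsafe<OddUsize, usize>,
}

pub struct Pair {
    // INVARIANT: `m` is even.
    m: Unsafe<Pair, usize>,
    // INVARIANT: `n` is odd.
    n: Unsafe<Pair, usize>,
}

/// Still compiles: both fields are `Unsafe<Pair, usize>`.
pub fn swap_fields(p: &mut Pair) {
    mem::swap(&mut p.m, &mut p.n);
    // By contrast, swapping across types is now rejected:
    // mem::swap(&mut even.n, &mut odd.n); // error: mismatched types
}
```

The cross-type swap is now a compile error, but swap_fields is still safe code that leaves m odd and n even.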

Design take 3: Unsafe<T, F, NAME>

To prevent this from happening, we can instead design Unsafe to also include the name of the field in its type:

#[repr(transparent)]
pub struct Unsafe<T, F, const NAME: &'static str>(PhantomData<T>, F);

Unfortunately, &str is not supported as the type of a const parameter on stable Rust right now, so we'll instead have to use a u64 which is the hash of the name:

#[repr(transparent)]
pub struct Unsafe<T, F, const NAME_HASH: u64>(PhantomData<T>, F);

This is ugly, but the user will never see this code, as it will be generated by the proc macro attribute. The example above would expand to:

struct Pair {
    m: Unsafe<Pair, usize, {hash("m")}>,
    n: Unsafe<Pair, usize, {hash("n")}>,
}
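The expansion above leaves hash unspecified. As a sketch, it could be any const fn from &str to u64; the version below uses 64-bit FNV-1a purely as an assumption for illustration, and the get accessor is again a hypothetical name. Any such hash can collide in principle, in which case two differently named fields would end up with the same type.

```rust
use core::marker::PhantomData;

/// Illustrative choice of hash (64-bit FNV-1a); the actual function the
/// proc macro would use is not specified above.
pub const fn hash(name: &str) -> u64 {
    let bytes = name.as_bytes();
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    let mut i = 0;
    while i < bytes.len() {
        h ^= bytes[i] as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        i += 1;
    }
    h
}

#[repr(transparent)]
pub struct Unsafe<T, F, const NAME_HASH: u64>(PhantomData<T>, F);

impl<T, F, const NAME_HASH: u64> Unsafe<T, F, NAME_HASH> {
    /// # Safety
    /// `field` must uphold the invariant documented on the field.
    pub unsafe fn new(field: F) -> Self {
        Unsafe(PhantomData, field)
    }

    /// Hypothetical accessor (name is an assumption).
    ///
    /// # Safety
    /// The caller takes responsibility for how the value is used.
    pub unsafe fn get(&self) -> &F {
        &self.1
    }
}

pub struct Pair {
    // INVARIANT: `m` is even.
    m: Unsafe<Pair, usize, { hash("m") }>,
    // INVARIANT: `n` is odd.
    n: Unsafe<Pair, usize, { hash("n") }>,
}
```

Because hash("m") and hash("n") evaluate to different constants, p.m and p.n now have different types, and core::mem::swap(&mut p.m, &mut p.n) no longer type-checks.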

This gets us most of the way there. It's still possible to swap between instances of the same type (e.g., given p: Pair and q: Pair, to swap p.m and q.m). I'm not yet sure how to prevent this from happening, but it's a pretty small hole.
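One case where the remaining hole seems to matter is an invariant that relates a field to other state in the same instance. The Prefix type below is a hypothetical example of that, not something from the prototype, and the literal 1 stands in for the hash of the field name.

```rust
use core::marker::PhantomData;
use core::mem;

#[repr(transparent)]
pub struct Unsafe<T, F, const NAME_HASH: u64>(PhantomData<T>, F);

impl<T, F, const NAME_HASH: u64> Unsafe<T, F, NAME_HASH> {
    /// # Safety
    /// `field` must uphold the invariant documented on the field.
    pub unsafe fn new(field: F) -> Self {
        Unsafe(PhantomData, field)
    }

    /// Hypothetical accessor (name is an assumption).
    ///
    /// # Safety
    /// The caller takes responsibility for how the value is used.
    pub unsafe fn get(&self) -> &F {
        &self.1
    }
}

/// Hypothetical type whose invariant ties `len` to `data`.
pub struct Prefix {
    data: Vec<u8>,
    // INVARIANT: `len <= data.len()`.
    len: Unsafe<Prefix, usize, 1>, // `1` is a stand-in for hash("len")
}

/// Safe code: `p.len` and `q.len` have exactly the same type, so the swap
/// compiles, yet `len` may now exceed `data.len()` in one of the values.
pub fn swap_lens(p: &mut Prefix, q: &mut Prefix) {
    mem::swap(&mut p.len, &mut q.len);
}
```

When the invariant only constrains the field's own value, as with EvenUsize, swapping the same field between two instances preserves it, which is why this hole is narrower than the earlier ones.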
