Safe memory zeroing

(note: I use the terms _safe_ and _valid_ below with the precise meanings specified here: https://github.com/rust-rfcs/unsafe-code-guidelines/blob/master/reference/src/glossary.md#validity-and-safety-invariant)

# Motivation

See https://github.com/rust-lang/rust/issues/53491. In a nutshell: `std::mem::zeroed()` is dangerous - running this on miri ([playground]()):

```rust
fn main() {
    let _x: &'static str = unsafe { 
        std::mem::zeroed() 
    };  // Probably instant UB
}
```

produces an error of the form: "type validation failed: encountered NULL reference". In practice, code this blatant is caught in code review, but that stops being easy once a struct has multiple fields and one has to manually verify that every field is valid when all of its bits are zero. Layer a couple of user-defined types on top of each other, and a small private change to one of them further down the stack can easily give code using `mem::zeroed` instant undefined behavior.
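To make the layering problem concrete, here is a minimal sketch. The types `Handle`, `Session` and `Config` and the function `zeroed_config` are hypothetical names made up for illustration. The code is sound as written; the comment marks the kind of private change that would silently break it.

```rust
use std::mem;

// Hypothetical innermost type, e.g. owned by a dependency.
// Its field is a raw pointer, for which null is a valid value.
#[repr(C)]
pub struct Handle {
    ptr: *const u8,
    // A later private change to `ptr: &'static u8` would still compile,
    // but would make `zeroed_config` below undefined behavior.
}

#[repr(C)]
pub struct Session {
    handle: Handle,
    flags: u32,
}

#[repr(C)]
pub struct Config {
    session: Session,
    retries: u8,
}

/// Sound only as long as every field, at every nesting level,
/// accepts the all-zeros bit-pattern.
pub fn zeroed_config() -> Config {
    // SAFETY: all fields are currently integers or raw pointers.
    unsafe { mem::zeroed() }
}
```

Nothing in the signature of `zeroed_config` ties it to the definition of `Handle`, which is why the compiler cannot flag the breaking change.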

When this happens, right now, having a test suite and running it on miri is the only way we have to detect that. However, C FFI is one of the main usages of `mem::zeroed` and `miri` has very limited support for that. So even if you have a good test suite, `miri` won't help you here.

This RFC provides a solution that catches these errors at compile-time, allowing type users to zero-initialize types using safe Rust code reliably and allowing type authors to specify that doing this is a part of the type's API that they are committed to support (where changing this would be an API breaking change).

# User-level explanation

## Alternative: Zeroed trait like Default

(note: the trait name `Zeroed` is yet to be bikeshedded - I think it would be better to agree on the approach and the semantics, and when that has consensus, we can bikeshed the name at the end).

(note: I think I prefer the marker trait + const fn approach explained below)

Add a `std::zeroed::Zeroed` trait, similar to `Default`, that denotes that the zero bit-pattern is a _valid_ bit-pattern of the type and that this bit-pattern is _safe_ to use. This trait is `unsafe` to implement - implementing it for `&T` would make `Zeroed::zeroed` have undefined behavior. 

```rust
// in libcore:
mod zeroed {
    /// A trait for types whose all-zeros bit-pattern is valid and safe.
    pub unsafe trait Zeroed {
        /// Instantiates a value with all bytes equal to zero.
        fn zeroed() -> Self where Self: Sized {
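            // SAFETY: implementors guarantee that all-zeros is valid and safe.
            // (`into_inner` was renamed to `assume_init` before `MaybeUninit` stabilized.)
            unsafe { core::mem::MaybeUninit::<Self>::zeroed().assume_init() }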
        }
    }
}
```

Implement `Zeroed` in `core` for all libcore types for which this is the case: integers, raw pointers, etc. - do not implement it for references, `NonZero{...}`, etc.
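A minimal sketch of what those implementations might look like, written as ordinary user code rather than as part of libcore. The trait here is a local stand-in for the proposed one, the helper macro `impl_zeroed` is a hypothetical name, and the exact set of types (in particular `bool`, floats and arrays) is an assumption that goes beyond the "integers, raw pointers, etc." listed above.

```rust
use core::mem::MaybeUninit;

/// Local stand-in for the proposed `Zeroed` trait.
pub unsafe trait Zeroed {
    /// Instantiates a value with all bytes equal to zero.
    fn zeroed() -> Self
    where
        Self: Sized,
    {
        // SAFETY: implementors promise that all-zero bits are valid and safe.
        unsafe { MaybeUninit::<Self>::zeroed().assume_init() }
    }
}

// Hypothetical helper macro to avoid repeating the empty impl.
macro_rules! impl_zeroed {
    ($($t:ty),* $(,)?) => {
        $(unsafe impl Zeroed for $t {})*
    };
}

impl_zeroed!(u8, u16, u32, u64, u128, usize);
impl_zeroed!(i8, i16, i32, i64, i128, isize);
impl_zeroed!(f32, f64, bool);

// Thin raw pointers: the all-zeros pattern is the null pointer.
unsafe impl<T> Zeroed for *const T {}
unsafe impl<T> Zeroed for *mut T {}

// Arrays are zeroable when their element type is.
unsafe impl<T: Zeroed, const N: usize> Zeroed for [T; N] {}

// Deliberately no impls for `&T`, `&mut T`, `NonZeroU32`, and similar types.
```

With these in scope, `u32::zeroed()` and `<[i16; 4]>::zeroed()` are safe calls, while `<&u8>::zeroed()` fails to compile because `&u8` does not implement the trait.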

Add a custom derive `Zeroed` that can be used to implement this trait for user-defined types without writing `unsafe` Rust (e.g. if all the fields of a struct implement `Zeroed`). If the struct cannot derive `Zeroed`, that should produce a compile-time error. Whether the all-zeros bit-pattern is a valid and safe bit-pattern for a type is an API contract between the author of the type and its users. This is why opting in explicitly, instead of using an auto trait, feels like a better solution to the problem.

```rust
use core::num::NonZeroU32;

/// A type that is valid to zero-initialize,
/// but not safe - this type does not derive Zeroed
struct Foo(u32);
impl Foo {
    pub fn new() -> Self { Self(1) }
    pub fn foo(&self) -> NonZeroU32 {
        // If this type was Zeroed, safe Rust code could
        // invoke undefined behavior
        unsafe { NonZeroU32::new_unchecked(self.0) }
    }
}

/// A type that is valid and safe to zero-initialize
#[derive(Zeroed)]
struct Bar(u32);
impl Bar {
    // bar is unsafe because the type can be zeroed
    // (Safety: call me only if self.0 != 0)
    pub unsafe fn bar(&self) -> NonZeroU32 {
        unsafe { NonZeroU32::new_unchecked(self.0) }
    }
    pub fn bar2(&self) -> NonZeroU32 {
        NonZeroU32::new(self.0).unwrap() // panics if self.0 == 0
    }
}

/// This produces a compilation error, since this type is not valid to zero initialize
#[derive(Zeroed)]
struct Baz(u32, &'static str);
// ERROR: self.1 (`&'static str`) is not Zeroed
```
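One way the derive could work is sketched below: it emits a compile-time check that every field type implements `Zeroed`, followed by the `unsafe impl`. This expansion is an assumption for illustration, not a specified design; the helper `assert_field_is_zeroed` is a hypothetical name, and the trait is a local stand-in so that the snippet is self-contained.

```rust
use core::mem::MaybeUninit;

/// Local stand-in for the proposed `Zeroed` trait.
pub unsafe trait Zeroed {
    fn zeroed() -> Self
    where
        Self: Sized,
    {
        // SAFETY: implementors promise that all-zero bits are valid and safe.
        unsafe { MaybeUninit::<Self>::zeroed().assume_init() }
    }
}

unsafe impl Zeroed for u32 {}
unsafe impl Zeroed for u8 {}

/// The user writes `#[derive(Zeroed)]` on this type...
pub struct Bar(u32, u8);

// ...and the derive might expand to something like the following.
const _: () = {
    // Compiles only if `T` implements `Zeroed`.
    const fn assert_field_is_zeroed<T: Zeroed>() {}

    // One check per field type. A `&'static str` field would be
    // rejected here with an unsatisfied trait bound error.
    assert_field_is_zeroed::<u32>();
    assert_field_is_zeroed::<u8>();
};

// SAFETY: every field type was checked above to implement `Zeroed`.
unsafe impl Zeroed for Bar {}
```

Because the check lives next to the type definition, a later change that adds a non-`Zeroed` field breaks the build of the defining crate rather than the behavior of its users.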

To upgrade code that previously was using `mem::zeroed()` to `Zeroed`, one changes:

```rust
let x: Foo = unsafe { mem::zeroed() };
```

to

```rust
// potentially adding a: use std::zeroed::Zeroed;
let x = Foo::zeroed();
```

We should probably add `Zeroed` to `std::prelude::v1`.

After this change we can deprecate `std::mem::zeroed` with a deprecation warning "use `std::zeroed::Zeroed` instead". 

An RFC for this feature would probably leave this as an unresolved question, but we should probably turn that deprecation warning into an error in the next edition. That is, in crates that opt into the newer edition (e.g. `edition = "2021"`), using `std::mem::zeroed` would be a hard error carrying the deprecation message. `libcore` will contain `mem::zeroed` forever, so that Rust code using older editions can still use it, but we probably want a mechanism to ban its use in code that opts into a newer edition.

# Alternatives

## auto trait

We could make `Zeroed` an `auto`-trait, but then I don't see how it could be used to denote that a zeroed value is _safe_ to use - we could still use it to denote that the value is _valid_. This has two consequences:

* The zeroing function (call it `zeroinit`) would need to be `unsafe`, since the resulting value might not be safe to use. That is, just because the zero bit-pattern does not cause undefined behavior instantaneously does not mean that the safe methods of the type do not rely on the pre-condition that the bit-pattern is not all zeros. The author of the type might not want to provide a way to safely construct a value with such a bit-pattern, and it would be bad for the author to have to opt out of this `auto` trait to maintain _safety_.

* Being able to zero-initialize a type is something that users of the type should be able to rely on. Once the type author commits to providing this API, it should be at least automatically noticeable when a change in the type breaks this API. With the non-auto trait + derive this happens automatically. With an `auto`-trait type authors would need to add a test for this (e.g. `fn foo() -> T { Zeroed::zeroed() }` or similar).

## marker trait

We could also make `Zeroed` a marker trait, and have some function like: 

```rust
const fn zeroed2<T: marker::Zeroed>() -> T {
    // SAFETY: `T: Zeroed` guarantees that all-zeros is valid and safe.
    unsafe { core::mem::MaybeUninit::<T>::zeroed().assume_init() }
}
```

This approach has the advantage that `zeroed2` is a `const fn`. The disadvantage is that we can't call this function `mem::zeroed` because such a function already exists, so we'd have to either put it somewhere else or call it something else (e.g. `mem::zeroinit`?).

There is an RFC (https://github.com/rust-rfcs/const-eval/pull/8) that would solve this problem by allowing us to write:

```rust
pub trait Zeroed {
    #[default_method_body_is_const]
    fn zeroed() -> Self where Self: Sized { ... }
}
```

to indicate that the default method impl is const, and then allowing the `Zeroed` derive to emit:

```rust
impl const Zeroed for $id { ... }
```

to add a const impl. However, the `Zeroed::zeroed` trait + trait method approach gives users the flexibility of adding their own `zeroed` implementations, and this is not a flexibility that I think we want. Paying for this flexibility might not be a good idea.

The simplicity of an unsafe to implement marker trait + a `const fn` somewhere in libcore is definitely appealing.
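A minimal sketch of the marker trait + `const fn` variant, to show what it buys in const contexts. The trait is a local stand-in, `zeroinit` is just one of the candidate names mentioned above, and `Counters` is a hypothetical user type. It assumes a toolchain where `MaybeUninit::zeroed` is usable in `const fn` (Rust 1.75 or later).

```rust
use core::mem::MaybeUninit;

/// Stand-in for the proposed marker trait: no methods, `unsafe` to implement.
pub unsafe trait Zeroed {}

unsafe impl Zeroed for u32 {}
unsafe impl Zeroed for u64 {}
unsafe impl<T: Zeroed, const N: usize> Zeroed for [T; N] {}

/// Safe, const zero-initialization for any `T: Zeroed`.
pub const fn zeroinit<T: Zeroed>() -> T {
    // SAFETY: `T: Zeroed` promises that all-zero bits are valid and safe.
    unsafe { MaybeUninit::<T>::zeroed().assume_init() }
}

/// Hypothetical user type that opts in manually.
#[derive(Debug, PartialEq)]
pub struct Counters {
    hits: u64,
    misses: u64,
    buckets: [u32; 4],
}

// SAFETY: all fields are integers or arrays of integers.
unsafe impl Zeroed for Counters {}

// Usable in statics and constants because `zeroinit` is a `const fn`.
pub static INITIAL: Counters = zeroinit();
pub const EMPTY_BUCKETS: [u32; 4] = zeroinit();
```

Since the trait has no methods, implementors cannot substitute their own notion of "zeroed": the only way to obtain the value is the single `const fn`.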
