Skip to content

Minimum guarantees regarding UnsafeCell #495

Closed

Description

As part of google/zerocopy#251, we're trying to teach zerocopy to reason in more precise ways about UnsafeCells. I'm making this issue as sort of an omnibus issue to clarify a number of questions regarding UnsafeCells. As always with zerocopy, I'm only interested in what can currently be promised, not what might be promised in the future.

In particular, my questions concern when it's sound for different code to disagree on whether a given memory region is covered by an UnsafeCell.

Constraints

Based on Miri's behavior and APIs that already exist in the standard library, we can observe some constraints on what code is sound/unsound. In particular:

  • Any time Miri treats code as unsound, I take that to mean that we cannot guarantee that it is sound even if it might be decided to be sound at some point in the future
  • Just because Miri treats code as sound, it may not be guaranteed to be sound
  • If a stable API exists which implies that code is sound, then it probably is guaranteed to be sound

It's unsound (under stacked borrows) to have multiple active references which "disagree" about whether a memory region is covered by an UnsafeCell

Status: can't assume sound

Prior art: #455, #303

As soon as there are multiple references which are "active"* or "not shadowed" or whatever the proper terminology is, and these references disagree about whether a given memory region is covered by an UnsafeCell, this is insta-UB according to stacked borrows even if the references are never used for anything. For example:

let u = 1usize;
let u_ref = &u;
let _c_ref = unsafe { &*(u_ref as *const usize as *const Cell<usize>) }; // Insta-UB

*Previously, this post used the term "live", but @CAD97 pointed out that "active" is the more accurate term.

Miri output
error: Undefined Behavior: trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
 --> src/main.rs:6:27
  |
6 |     let _c_ref = unsafe { &*(u_ref as *const usize as *const Cell<usize>) };
  |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |                           |
  |                           trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
  |                           this error occurs as part of retag at alloc809[0x0..0x8]
  |
  = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
  = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1716> was created by a SharedReadOnly retag at offsets [0x0..0x8]
 --> src/main.rs:6:30
  |
6 |     let _c_ref = unsafe { &*(u_ref as *const usize as *const Cell<usize>) };
  |                              ^^^^^
  = note: BACKTRACE (of the first span):
  = note: inside `main` at src/main.rs:6:27: 6:74

It's still unsound even if the overlap is only "partial"

Status: can't assume sound

In this example, only some of the bytes go from covered by an UnsafeCell to not covered.

#[repr(C)]
struct Foo(Cell<u32>, u32);

let u = 1u64;
let u_ref = &u;
let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) }; // Insta-UB
Miri output
error: Undefined Behavior: trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
 --> src/main.rs:9:27
  |
9 |     let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) };
  |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |                           |
  |                           trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
  |                           this error occurs as part of retag at alloc809[0x0..0x8]
  |
  = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
  = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1716> was created by a SharedReadOnly retag at offsets [0x0..0x8]
 --> src/main.rs:9:30
  |
9 |     let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) };
  |                              ^^^^^
  = note: BACKTRACE (of the first span):
  = note: inside `main` at src/main.rs:9:27: 9:64

It's no longer unsound if the UnsafeCell region is of zero size

Status: might be able to guarantee sound

In other words, it's not the mere presence of UnsafeCell that matters, but rather the bytes that are "covered" by UnsafeCells. For this reason, ZST UnsafeCells have no effect; this code is accepted by Miri:

// `Cell<()>` is a ZST
#[repr(C)]
struct Foo(Cell<()>, u64);

let u = 1u64;
let u_ref = &u;
let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) };

It is sound to have multiple references disagree about what type a memory region has so long as those references agree on UnsafeCells

Status: probably guaranteed sound

For example, in both of the following cases, while the types are not the same, the byte ranges covered by UnsafeCells are the same.

let u = 1usize;
let u_ref = &u;
// Convert from `&usize` to `&isize`. Neither type has `UnsafeCell`s.
let _i_ref = unsafe { &*(u_ref as *const usize as *const isize) };
    
let cu = Cell::new(1usize);
let cu_ref = &cu;
// Convert from `&Cell<usize>` to `&Cell<isize>`. Both types have
// `UnsafeCell`s covering the same byte ranges.
let _ci_ref = unsafe { &*(cu_ref as *const Cell<usize> as *const Cell<isize>) };

It is sound to have multiple references disagree about UnsafeCells so long as only one of them is "active"

Status: guaranteed sound

In this example, I believe that what's happening is that u_mut becomes inaccessible so long as _c_mut exists, so there's no way to exercise the UnsafeCell disagreement. Not only is this accepted by Miri in practice, it's also codified in the UnsafeCell methods get_mut (stable) and from_mut (unstable nightly feature).

let mut u = 1usize;
let u_mut = &mut u;
let _c_mut = unsafe { &mut *(u_mut as *mut usize as *mut Cell<usize>) };

It is sound to read a non-UnsafeCell location as an UnsafeCell via ptr::read

Status: unclear

I'm not sure how to think about why this is sound, and whether it's guaranteed to remain sound.

let u = 1usize;
let u_ref = &u;
let _c = unsafe { ptr::read(u_ref as *const usize as *const Cell<usize>) };

It is sound to read an UnsafeCell location as a non-UnsafeCell via ptr::read

Status: unclear

I'm even less sure how to think about this one - my intuition tells me that Miri should reject this, but it doesn't.

let c = Cell::new(1usize);
let c_ref = &c;
let _u = unsafe { ptr::read(c_ref as *const Cell<usize> as *const usize) };

Guarantees

I would like to be able to rely on the following guarantees. For each guarantee, my questions are:

  • Is this already documented somewhere as being guaranteed?
  • If not, would it be acceptable (at least to T-opsem) to add documentation to guarantee it?

Sound given UnsafeCell agreement

In particular: it is sound to create references to the same memory which disagree about type so long as they agree about which bytes are covered by UnsafeCell. It is sound to use such references to perform loads and stores (assuming those loads and stores do not violate the referent types' bit validity).

These analyses only consider active references

For the purposes of the previous soundness guarantee, only "active" (i.e., non-shadowed) references are considered.* This allows us to prove that code which reborrows a &mut T (with no UnsafeCells) as a &mut U (with UnsafeCells) or vice versa (e.g., UnsafeCell::get_mut) is sound.

I am aware that the formal definition of "active-ness" is up in the air, and subject to models like stacked borrows and tree borrows. I am assuming that there is some way we can formally define borrows so that we can get some minimum guarantees about this section despite not having completely agreed on the full model.

It is sound to read a non-UnsafeCell location as an UnsafeCell via ptr::read

Given &T where T doesn't contain any UnsafeCells, it is sound to convert &T to *const U (where U does contain UnsafeCells) and use ptr::read to read a U out of that memory location (assuming that this doesn't violate U's bit validity, read uninitialized bytes, read unaligned memory, etc etc).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions