Skip to content

UnsafeCell access APIs #521

Closed
Closed
@pitaj

Description

@pitaj

Proposal

Problem statement

In the wild, the following patterns are pretty common:

// Setting value inside `UnsafeCell`
unsafe { *my_unsafe_cell.get() = new_value; }
// Getting reference to value inside `UnsafeCell`
let value = unsafe { &*my_unsafe_cell.get(); }

These operation involve raw pointer dereferences. Many users of Rust prefer to avoid raw pointers if at all possible, do to the numerous invariants that must be upheld to avoid UB (non-null, aligned, plus initialized in the case of assignment).

If you already have &UnsafeCell, why should you be required to go through a raw pointer just to assign to or take a reference to the value within?

Motivating examples or use cases

Consider this example from once_cell, which splits the .get() from the assignment:

    /// Safety: synchronizes with store to value via SeqCst read from state,
    /// writes value only once because we never get to INCOMPLETE state after a
    /// successful write.
    #[cold]
    pub(crate) fn initialize<F, E>(&self, f: F) -> Result<(), E>
    where
        F: FnOnce() -> Result<T, E>,
    {
        let mut f = Some(f);
        let mut res: Result<(), E> = Ok(());
        let slot: *mut Option<T> = self.value.get();
        initialize_or_wait(
            &self.queue,
            Some(&mut || {
                let f = unsafe { f.take().unwrap_unchecked() };
                match f() {
                    Ok(value) => {
                        unsafe { *slot = Some(value) };
                        true
                    }
                    Err(err) => {
                        res = Err(err);
                        false
                    }
                }
            }),
        );
        res
    }

If you were to audit the lower unsafe block in this example:

unsafe { *slot = Some(value) };

The safety comment on the function itself handles the synchronization, but you would have to look at context to see that it is derived from the UnsafeCell, and therefore fulfills the following invariants:

  • non-null
  • aligned
  • value within is initialized

If safety comments were required, you might see a comment like this:

// SAFETY: Pointer is valid and old value is initialized, because it comes from the `UnsafeCell` above.

Which would be unnecessary if just an &UnsafeCell reference could be used instead.

Solution sketch

impl<T> UnsafeCell<T> {
    /// Replace the value in this `UnsafeCell` and return the old value.
    ///
    /// # SAFETY
    /// - Allowing calls to race with any other access
    ///   to the wrapped value is Undefined Behavior.
    /// - Calling this while any other references to the
    ///   wrapped value are alive is Undefined Behavior.
    unsafe fn replace(&self, new_value: T) -> T;
    /// Get a shared reference to the wrapped value.
    ///
    /// # SAFETY
    /// - Calling this while any mutable references to the
    ///   wrapped value are alive is Undefined Behavior.
    /// - Mutating the wrapped value while the returned
    ///   reference is alive is Undefined Behavior.
    unsafe fn get_ref(&self) -> &T;
}

The above added APIs allow us to rewrite the once_cell code like this:

    pub(crate) fn initialize<F, E>(&self, f: F) -> Result<(), E>
    where
        F: FnOnce() -> Result<T, E>,
    {
        let mut f = Some(f);
        let mut res: Result<(), E> = Ok(());
        let slot: &UnsafeCell<Option<T>> = &self.value; // Just a reference
        initialize_or_wait(
            &self.queue,
            Some(&mut || {
                let f = unsafe { f.take().unwrap_unchecked() };
                match f() {
                    Ok(value) => {
                        unsafe { slot.replace(Some(value)) }; // Just an unsafe function call
                        true
                    }
                    Err(err) => {
                        res = Err(err);
                        false
                    }
                }
            }),
        );
        res
    }

And more generally, users can use references instead of raw pointers in more places:

  • &UnsafeCell<T> does not currently have a pointer equivalent
  • &UnsafeCell<MaybeUninit<T>> roughly equivalent to NonNull<T>
  • Option<&UnsafeCell<MaybeUninit<T>>> roughly equivalent to *mut T

Alternatives

Do nothing and people continue to just use the raw pointer operations.

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ACP-acceptedAPI Change Proposal is accepted (seconded with no objections)T-libs-apiapi-change-proposalA proposal to add or alter unstable APIs in the standard libraries

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions