As I rub against the boundaries of unsafe code and undefined behaviour more and more, it's becoming less "obvious" to me what is or isn't allowed. To demonstrate this, I've whipped up a few examples of safe and unsafe code that do basically the same thing. In my mental model some of them are clearly defined and some are clearly undefined, but at this point I really have no idea whether that model is right.
a is clearly defined, as it uses no unsafe code. We temporarily reborrow the mutable reference into a different variable. At any given point there's clear ownership of the value.
fn a() -> u32 {
    // totally safe mutable aliasing
    let a = &mut 1u32;
    {
        let b = &mut *a;
        *b += 1;
    }
    *a += 1;
    *a
}
b does the exact same thing, but through a *mut instead of an &mut. Ownership is still "clear" to the compiler, but we mutate through the *mut, and then later through the &mut. It wouldn't be unreasonable to consider this undefined behaviour: we mutated something "owned" by an &mut through something other than that &mut, and then worked with the value through the &mut.
fn b() -> u32 {
    // same mutable aliasing, but with a raw ptr
    let a = &mut 1u32;
    unsafe {
        let b = a as *mut _;
        *b += 1;
    }
    *a += 1;
    *a
}
c does the exact same thing as b, but explicitly constructs an &'static mut to mutate through unsafely. Here we have created two &muts to the same value, which in my mental model is clearly undefined behaviour: you cannot have two &muts to the same value.
fn c() -> u32 {
    // same unsafe aliasing, but by explicitly making an &mut from a *mut
    let a = &mut 1u32;
    unsafe {
        let b = &mut *((&mut *a) as *mut _);
        *b += 1;
    }
    *a += 1;
    *a
}
d is basically the same as a, but we've added a box in the way. This adds a raw pointer between the &mut and the actual data. Semantically the data is still "owned" by the &mut, but is the *mut in between important or just an implementation detail? Regardless, this is all safe, so it must be defined.
fn d() -> u32 {
    // totally safe mutable aliasing, but through a box
    // (and therefore a raw ptr)
    let mut a = Box::new(1u32);
    {
        let b = &mut *a;
        *b += 1;
    }
    *a += 1;
    *a
}
e is the same as d, but we've added a *mut again. This time we mutate the data inside the box while the box is owned. However, the box is a *mut, so really we've just mutated data behind a raw pointer with another raw pointer. Now it really matters whether the box's representation is defined or not! Critically, I believe that the definedness of this case affects whether DList is sound. It mixes boxes, *muts, and &muts pretty freely, so what is or isn't allowed is important.
fn e() -> u32 {
    // same mutable aliasing, but with another raw ptr
    let mut a = Box::new(1u32);
    unsafe {
        let b = (&mut *a) as *mut _;
        *b += 1;
    }
    *a += 1;
    *a
}
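For comparison, here is a rough sketch of a variant that never uses the box while a raw pointer to its contents is live. It assumes the only goal is to mutate the boxed value through a *mut and then carry on using the box; the name e_into_raw is made up for illustration. It hands the allocation over with Box::into_raw and takes it back with Box::from_raw, so only one handle is in use at any time. It says nothing about whether e itself is defined.

```rust
// Hypothetical variant of `e`: give up the Box before using the raw pointer,
// then rebuild the Box once the raw access is finished.
fn e_into_raw() -> u32 {
    let a = Box::new(1u32);

    // Box::into_raw consumes the Box and returns the pointer it owned.
    let raw: *mut u32 = Box::into_raw(a);
    unsafe {
        // Only the raw pointer exists here, so nothing else can alias it.
        *raw += 1;
    }

    // Box::from_raw takes ownership back; the allocation is freed on drop.
    let mut a = unsafe { Box::from_raw(raw) };
    *a += 1;
    *a
}
```

The cost is that the box is unusable between the two calls, which may or may not fit a structure like DList.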
Finally, f is a special case of unsafe mutable aliasing. Here we construct a *mut to a subfield of a composite structure. Then we capture the whole structure with an &mut. We mutate the subfield through the *mut while the whole struct is owned by the &mut, but then only use that ownership to mutate a different subfield. At no point do we read the "unsafely" mutated field through the &mut. Then we relinquish ownership to the "parent" owner which, presumably, must assume that all fields may have been mutated since it loaned the structure out. Is this defined behaviour? I honestly have no clue.
fn f() -> u32 {
    // mutable aliasing with a raw ptr, but on an unaccessed field
    // until ownership is "returned" to the parent
    let mut x = (1u32, 10u32);
    let b = (&mut x.1) as *mut _;
    {
        let a = &mut x;
        unsafe {
            *b += 1;
        }
        a.0 += 1;
    }
    x.0 + x.1
}
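As a point of reference for f, here is a minimal safe sketch that performs the same two field updates. The name f_safe is made up for illustration. Instead of borrowing the whole tuple, it borrows the two fields separately, which the borrow checker accepts because the fields are disjoint. It doesn't answer whether f is defined; it only shows what the safe version of the same mutation looks like.

```rust
// Hypothetical safe counterpart to `f`: borrow each field on its own
// rather than the whole tuple, so no raw pointer is needed.
fn f_safe() -> u32 {
    let mut x = (1u32, 10u32);
    {
        // Two simultaneous &mut borrows are allowed because they refer to
        // distinct fields of `x`.
        let (first, second) = (&mut x.0, &mut x.1);
        *second += 1;
        *first += 1;
    }
    x.0 + x.1
}
```

The difference from f is that nothing here ever holds an &mut to the whole of x while the second field is being written.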