
What memory is the Global allocator allowed to access #534

Closed
@jwong101

Consider the following code:

#[no_mangle]
pub fn src(mut x: Vec<&mut u8>) -> u8 {
    let mut y = 0;
    {
        let mut x = std::mem::take(&mut x);
        // needed for the optimization otherwise LLVM gets confused with the potential reallocation
        unsafe { std::hint::assert_unchecked(x.len() < x.capacity()); }
        x.push(&mut y);
        // x gets deallocated here and `GlobalAlloc` can potentially use the pointer to `y`
    }
    y
}

Right now src() gets optimized to return 0 unconditionally after freeing the memory backing x. Furthermore, the write of the mutable reference gets optimized out, since the compiler assumes that the deallocator doesn't read the freed memory block without first overwriting it.

I believe you could justify these semantics by specifying that alloc::dealloc overwrites the underlying buffer with undef/poison before handing it to the global allocator. However, this should probably be documented (unless it's "obvious" that the Global allocator can't read the contents of a freed block, since that's probably an expectation that most people would have).
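To make the question concrete, here is a minimal sketch of the kind of allocator being asked about: one whose dealloc inspects the contents of the block it is handed. The names PeekingAlloc, seen, and demo are hypothetical and only for illustration. The allocator is called directly rather than registered with #[global_allocator], so this does not reproduce the optimization above; it only shows what "accessing the freed block" would look like on the allocator side.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator: forwards to `System`, but peeks at each block on free.
pub struct PeekingAlloc {
    /// Running sum of the first byte of every freed block (illustration only).
    pub seen: AtomicUsize,
}

unsafe impl GlobalAlloc for PeekingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Assumption: the caller initialized the first byte. If `dealloc`
        // semantically poisons the buffer first, this read is exactly the
        // access that would no longer be allowed.
        if layout.size() > 0 {
            let first = unsafe { ptr.read() };
            self.seen.fetch_add(first as usize, Ordering::Relaxed);
        }
        unsafe { System.dealloc(ptr, layout) }
    }
}

/// Allocates one byte, writes 7 into it, frees it, and reports what
/// `dealloc` observed.
pub fn demo() -> usize {
    let a = PeekingAlloc { seen: AtomicUsize::new(0) };
    let layout = Layout::new::<u8>();
    unsafe {
        let p = a.alloc(layout);
        assert!(!p.is_null());
        p.write(7);
        a.dealloc(p, layout);
    }
    a.seen.load(Ordering::Relaxed)
}
```

Called directly like this, the store of 7 is observed by dealloc. The open question is whether the same read is permitted when the call goes through the Global allocator and the optimizer has dropped the preceding store.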

Furthermore, consider the example from rust-lang/rust#130853. I'm not sure how to justify the original transformation with Stacked or Tree Borrows. However, I was wondering whether the following is justifiable:

// use a mutable reference to prevent the MIR opt from happening
#[no_mangle]
pub fn src(x: &mut &u8) -> impl Sized {
    let y = **x;
    let mut z = Box::new(0);
    // a bunch of code that operates on the `Box`, however, 
    // nothing else can potentially access the underlying `u8`
    // that's behind the double reference besides the `__rust_alloc` call.
    

    // optimizable to `true`?
    **x == y
}

Currently, LLVM doesn't perform this second optimization. However, it does perform it if you manually set System to be the global allocator: https://rust.godbolt.org/z/a77PWjeKE (see footnote 1). This is due to this line, which is used by LLVM's GVN pass.
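For reference, "manually setting System to be the global allocator" can be sketched as below, using the standard #[global_allocator] attribute. The static name GLOBAL and the function reload_after_alloc are hypothetical; the function is a cut-down variant of the example above that returns bool instead of impl Sized so the result can be checked, and black_box stands in for "a bunch of code that operates on the Box".

```rust
use std::alloc::System;

// Route `Box`, `Vec`, etc. through the system allocator explicitly.
#[global_allocator]
static GLOBAL: System = System;

/// Loads through the double reference, allocates, then reloads and compares.
pub fn reload_after_alloc(x: &mut &u8) -> bool {
    let y = **x;
    let z = Box::new(0u8);
    // Keep the allocation alive and opaque to the optimizer.
    std::hint::black_box(&z);
    **x == y
}
```

With any allocator that leaves the caller's memory alone, the function returns true at runtime. Whether the compiler may fold the comparison to true without looking inside the allocator is the question this issue asks.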

TLDR: is the implementor of the global allocator required to not modify references that are "visible" (see footnote 2) in code that invokes the global allocator methods? I realize that this definition is kinda scuffed, but that's how LLVM explains its assumptions about malloc/calloc/free:

inaccessiblemem: This refers to accesses to memory which is not accessible by the current module (before return from the function – an allocator function may return newly accessible memory while only accessing inaccessible memory itself). Inaccessible memory is often used to model control dependencies of intrinsics.

The default access kind (specified without a location prefix) applies to all locations that haven’t been specified explicitly, including those that don’t currently have a dedicated location kind (e.g. accesses to globals or captured pointers).

Footnotes

  1. You also get the malloc -> calloc transformation for types other than these hardcoded ones if you set System to be the global allocator manually.

  2. This probably means that allocator methods aren't allowed to be inlined if the optimizer wants to make assumptions about code that invokes them.
