What memory is the Global allocator allowed to access

Consider the following code:

```rust
#[no_mangle]
pub fn src(mut x: Vec<&mut u8>) -> u8 {
    let mut y = 0;
    {
        let mut x = std::mem::take(&mut x);
        // Needed for the optimization; otherwise LLVM gets confused by the potential reallocation.
        unsafe { std::hint::assert_unchecked(x.len() < x.capacity()); }
        x.push(&mut y);
        // `x` gets deallocated here, and `GlobalAlloc` can potentially use the pointer to `y`.
    }
    y
}
```

Right now, `src()` is optimized to return `0` unconditionally after freeing the memory backing `x`. Furthermore, the store of the mutable reference into the buffer is optimized out, since the compiler assumes that the deallocator doesn't access the freed memory block without first overwriting it.
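To make the observed result concrete, here is a rough sketch of what the optimized function behaves like. The name `tgt` is hypothetical (it is not in the generated code), and this is only my reading of the transformation described above, not actual compiler output:

```rust
/// Hypothetical `tgt`: approximately what `src` behaves like after optimization.
pub fn tgt(mut x: Vec<&mut u8>) -> u8 {
    // The buffer is still taken out of `x` and freed, but the reference to the
    // local `y` is never stored into it before the deallocation.
    drop(std::mem::take(&mut x));
    // Nothing is assumed to write to `y` during deallocation, so the result
    // is the constant 0.
    0
}
```

Unlike `src`, this sketch has no `assert_unchecked` precondition, so it is only equivalent for inputs where `len() < capacity()` holds.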

I believe these semantics could be justified by specifying that `alloc::dealloc` overwrites the underlying buffer with undef/poison before handing it to the global allocator. However, this should probably be documented in the [docs](https://doc.rust-lang.org/stable/std/alloc/trait.GlobalAlloc.html#safety) (unless it's "obvious" that the global allocator can't do that, since that's probably an expectation that most people would have).


Furthermore, consider the example from https://github.com/rust-lang/rust/issues/130853. I'm not sure how to justify the original transformation with Stacked Borrows or Tree Borrows. However, I was wondering whether the following is justifiable:

```rust
// use a mutable reference to prevent the MIR opt from happening
#[no_mangle]
pub fn src(x: &mut &u8) -> impl Sized {
    let y = **x;
    let mut z = Box::new(0);
    // a bunch of code that operates on the `Box`; however,
    // nothing else can potentially access the underlying `u8`
    // that's behind the double reference besides the `__rust_alloc` call.

    // optimizable to `true`?
    **x == y
}
```

Currently, LLVM doesn't perform this second optimization by default. However, it does perform it if you manually set `System` to be the global allocator: https://rust.godbolt.org/z/a77PWjeKE [^1]. This is due to this [line](https://github.com/llvm/llvm-project/blob/c029702f82a494053b23f10886fdc319751cd193/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L1005) in `BasicAliasAnalysis.cpp`, which is used by LLVM's GVN pass.
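For reference, "manually setting `System`" means a declaration like the one in the sketch below. The function `reload_is_unchanged` is a hypothetical variant of the example above that returns `bool`, so the result can be checked directly. Whether the comparison is folded to `true` depends on the compiler version and flags, so the sketch makes no claim about the generated code:

```rust
use std::alloc::System;

// Route all heap allocations directly to the platform allocator.
#[global_allocator]
static GLOBAL: System = System;

/// Hypothetical variant of the example above that returns `bool`.
pub fn reload_is_unchanged(x: &mut &u8) -> bool {
    // First load through the double reference.
    let y = **x;
    // An allocation sits between the two loads.
    let z = Box::new(0u8);
    // Second load: can the allocator have changed what `**x` reads?
    let same = **x == y;
    drop(z);
    same
}
```

With `System`, the function returns `true` at runtime either way; the open question is only whether the optimizer may assume that for an arbitrary `GlobalAlloc` implementation.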

TLDR: is the implementor of the global allocator required not to modify references that are "visible"[^2] in code that invokes the global allocator methods? I realize that that definition is kinda scuffed, but that's how LLVM explains its assumptions about `malloc`/`calloc`/`free`:


> inaccessiblemem: This refers to accesses to memory which is not accessible by the current module (before return from the function – an allocator function may return newly accessible memory while only accessing inaccessible memory itself). Inaccessible memory is often used to model control dependencies of intrinsics.

> The default access kind (specified without a location prefix) applies to all locations that haven’t been specified explicitly, including those that don’t currently have a dedicated location kind (e.g. accesses to globals or captured pointers).
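One way to picture an allocator that stays within these assumptions is the sketch below. `CountingAlloc` and its counters are hypothetical names chosen for illustration. The allocator forwards to `System` and only touches its own bookkeeping; it never reads or writes the contents of the blocks it hands out or receives, nor anything reachable from the caller's references. Whether a static counter like this still counts as "inaccessible" once the allocator is inlined is a separate question (see footnote 2).

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator for illustration: forwards to `System` and counts calls.
pub struct CountingAlloc;

// Private bookkeeping: the only memory this allocator reads or writes itself.
static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);
static DEALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        DEALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        // The block's contents are never inspected before it is handed back.
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Number of successful allocations so far.
pub fn alloc_calls() -> usize {
    ALLOC_CALLS.load(Ordering::Relaxed)
}

/// Number of deallocations so far.
pub fn dealloc_calls() -> usize {
    DEALLOC_CALLS.load(Ordering::Relaxed)
}
```

An allocator that instead read the freed block in `dealloc`, or wrote through a pointer it found there, would be exactly the case this issue is asking about.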

[^1]: You also get the `malloc` -> `calloc` transformation for types other than these [hardcoded](https://github.com/rust-lang/rust/blob/14f303bc1430a78ddaa91b3e104bbe4c0413184e/library/alloc/src/vec/is_zero.rs) ones if you set `System` to be the global allocator manually.

[^2]: This probably means that allocator methods aren't allowed to be inlined if the optimizer wants to make assumptions about code that invokes them.

