There is no documented way to copy a pointer byte-for-byte in LLVM IR #142141

Open
@RalfJung

In Rust, it is possible to copy a pointer byte-for-byte:

use std::mem::MaybeUninit;

fn main() {
    let x = 0;
    let ptr1 = &x as *const i32;
    let mut ptr2 = std::ptr::null::<i32>();

    // Copy the pointer one (possibly uninitialized) byte at a time.
    for i in 0..std::mem::size_of::<*const i32>() {
        let ptr1_byte = (&raw const ptr1).cast::<MaybeUninit<u8>>().wrapping_add(i);
        let ptr2_byte = (&raw mut ptr2).cast::<MaybeUninit<u8>>().wrapping_add(i);
        unsafe { *ptr2_byte = *ptr1_byte }; // critical assignment
    }

    assert_eq!(unsafe { *ptr2 }, x);
}

However, to my knowledge, there is currently no way to compile this Rust code into LLVM IR that is documented to work properly. What we currently do is compile the critical assignment to a load + store at type i8, and we rely on this preserving the provenance of the bytes being copied. This might seem "intuitively fine", but to my knowledge Alive2 actually says that this code has UB, so we clearly have a problem here.

This is a significant gap in the expressiveness of LLVM: as a low-level IR, it should be able to express the semantics of memcpy using regular "user-land code", i.e. without using the builtin. It's also a problem for Rust programmers writing low-level code as they have to resort to solutions that are outside of what is documented in the LangRef and that are considered UB by widely-used LLVM tooling.
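To make the contrast between "user-land code" and "the builtin" concrete, here is a minimal sketch of the two ways a pointer value might be copied in Rust. The helper names (`bytewise_copy`, `copy_ptr_via_loop`, `copy_ptr_via_builtin`) are hypothetical and only for illustration. The first path is a hand-written byte loop, i.e. the pattern whose LLVM-level semantics are in question. The second uses `std::ptr::copy_nonoverlapping`, which the Rust documentation describes as an untyped copy equivalent to C's `memcpy`.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// Hypothetical helper: a "user-land" memcpy written as a byte-wise loop.
/// Safety: `src` and `dst` must be valid for `len` bytes and must not overlap.
unsafe fn bytewise_copy(dst: *mut MaybeUninit<u8>, src: *const MaybeUninit<u8>, len: usize) {
    for i in 0..len {
        // One single-byte load followed by one single-byte store per iteration.
        unsafe { *dst.add(i) = *src.add(i) };
    }
}

/// Copies a pointer value using the hand-written loop above.
fn copy_ptr_via_loop(src: &*const i32) -> *const i32 {
    let mut dst = ptr::null::<i32>();
    unsafe {
        bytewise_copy(
            (&raw mut dst).cast::<MaybeUninit<u8>>(),
            (src as *const *const i32).cast::<MaybeUninit<u8>>(),
            size_of::<*const i32>(),
        );
    }
    dst
}

/// Copies a pointer value using the library's memcpy-like primitive.
fn copy_ptr_via_builtin(src: &*const i32) -> *const i32 {
    let mut dst = ptr::null::<i32>();
    unsafe {
        ptr::copy_nonoverlapping(
            (src as *const *const i32).cast::<u8>(),
            (&raw mut dst).cast::<u8>(),
            size_of::<*const i32>(),
        );
    }
    dst
}

fn main() {
    let x = 42;
    let p = &x as *const i32;
    // Both copies should yield a pointer that can still be used to read `x`.
    assert_eq!(unsafe { *copy_ptr_via_loop(&p) }, x);
    assert_eq!(unsafe { *copy_ptr_via_builtin(&p) }, x);
}
```

Both functions are expected to behave identically at the Rust level. The concern raised here is only about whether the LLVM IR produced for the loop variant is documented to preserve the provenance of the copied pointer.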

I am aware of two solutions that have been previously proposed for this:

I'm aware that this is a hard problem, but it seems worth tracking it with a number at least. :)
Cc @nunoplopes @nikic
