Description
On wasm32-unknown-emscripten and wasm32-wasi, rustc implements the C ABI for some unions incorrectly, i.e., differently from Clang. Minimized example:
#[repr(C)]
pub union U {
    a: u32,
    b: u32,
}

#[no_mangle]
pub extern "C" fn unwrap_union(u: U) -> u32 {
    unsafe { u.a }
}

#[no_mangle]
pub extern "C" fn make_union() -> U {
    U { a: 0 }
}
I expected to see this happen: the resulting wasm code should pass and return the union indirectly, i.e. by pointers, as described in the C ABI document and implemented in Clang (compiler explorer).
Instead, this happened: the union is passed and returned as a single scalar (i32). See the previous compiler explorer link, and I also see it locally for wasm32-wasi (too lazy to install a whole emscripten toolchain):
$ rustc +nightly -O cabi-union.rs --crate-type=cdylib --target wasm32-wasi
$ wasm-tools print cabi_union.wasm
(module $cabi_union.wasm
  (type (;0;) (func (param i32) (result i32)))
  (type (;1;) (func (result i32)))
  (type (;2;) (func))
  (func $unwrap_union (;0;) (type 0) (param i32) (result i32)
    local.get 0
  )
  (func $make_union (;1;) (type 1) (result i32)
    i32.const 0
  )
  (func $dummy (;2;) (type 2))
  (func $__wasm_call_dtors (;3;) (type 2)
    call $dummy
    call $dummy
  )
  (func $unwrap_union.command_export (;4;) (type 0) (param i32) (result i32)
    local.get 0
    call $unwrap_union
    call $__wasm_call_dtors
  )
  (func $make_union.command_export (;5;) (type 1) (result i32)
    call $make_union
    call $__wasm_call_dtors
  )
  (table (;0;) 1 1 funcref)
  (memory (;0;) 16)
  (global $__stack_pointer (;0;) (mut i32) i32.const 1048576)
  (export "memory" (memory 0))
  (export "unwrap_union" (func $unwrap_union.command_export))
  (export "make_union" (func $make_union.command_export))
  (@producers
    (language "Rust" "")
    (processed-by "rustc" "1.78.0-nightly (2bf78d12d 2024-02-18)")
    (processed-by "clang" "16.0.4 (https://github.com/llvm/llvm-project ae42196bc493ffe877a7e3dff8be32035dea4d07)")
  )
)
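For comparison, here is a rough hand-written sketch of what "indirectly" means at the source level. It is not what rustc emits and not a proposed fix. The function names `unwrap_union_indirect` and `make_union_indirect` are made up for illustration. The assumption, based on the C ABI document, is that an indirectly passed argument arrives as a pointer to a caller-made copy, and an indirectly returned value is written through a pointer that the caller supplies as a leading parameter.

```rust
#[repr(C)]
pub union U {
    pub a: u32,
    pub b: u32,
}

// Hypothetical pointer-based equivalents of the two functions in the
// minimized example. Add `#[no_mangle]` (as in that example) to export
// them from a cdylib and inspect the resulting wasm.

/// Indirect argument: the parameter is a pointer to the union, not its value.
///
/// # Safety
/// `u` must point to a valid, initialized `U`.
pub unsafe extern "C" fn unwrap_union_indirect(u: *const U) -> u32 {
    unsafe { (*u).a }
}

/// Indirect return: the caller allocates the slot and passes its address;
/// the function has no direct result.
///
/// # Safety
/// `out` must be valid for writing one `U`.
pub unsafe extern "C" fn make_union_indirect(out: *mut U) {
    unsafe { out.write(U { a: 0 }) }
}
```

Note that under this assumption `unwrap_union` would keep the wasm type `(param i32) (result i32)` either way, with the `i32` being a pointer rather than the value, so the type of `make_union`, `(result i32)` versus `(param i32)`, is the easier one to tell apart in a dump.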
The definition of "singleton" union in the C ABI document ("recursively contains just a single scalar value") may be considered ambiguous, but clearly Clang interprets it differently from rustc, so something will have to give. I have not tried to exhaustively explore the cases in which they differ, so the above example may not be the only one.
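As a starting point for such an exploration, the following sketch lists a few union shapes that sit on different sides of the "single scalar" wording. All type and function names are hypothetical, and no outcome is claimed for any of them under either compiler. The idea is only to compile each with rustc and the matching C declaration with Clang, then compare the wasm signatures.

```rust
// One field: a singleton under any reading of the definition.
#[repr(C)]
pub union OneField {
    pub a: u32,
}

// Two fields of the same scalar type: the case from the minimized example.
#[repr(C)]
pub union TwoSameFields {
    pub a: u32,
    pub b: u32,
}

// Two fields of different scalar types with the same size.
#[repr(C)]
pub union MixedScalars {
    pub i: u32,
    pub f: f32,
}

// A single field that is itself a single-scalar struct ("recursively").
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Wrapper {
    pub x: u32,
}

#[repr(C)]
pub union NestedSingleton {
    pub w: Wrapper,
}

// One by-value probe per shape; add `#[no_mangle]` to export them.
pub extern "C" fn take_one_field(u: OneField) -> u32 {
    unsafe { u.a }
}

pub extern "C" fn take_two_same(u: TwoSameFields) -> u32 {
    unsafe { u.a }
}

pub extern "C" fn take_mixed(u: MixedScalars) -> u32 {
    unsafe { u.i }
}

pub extern "C" fn take_nested(u: NestedSingleton) -> u32 {
    unsafe { u.w.x }
}
```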
Compare and contrast #71871 - as discussed there, the emscripten and wasi targets have long since been fixed to match Clang's ABI, with only wasm32-unknown-unknown lagging behind. However, it seems that the fixed C ABI on emscripten and wasi targets is still incorrect in some cases around unions.
cc @curiousdannii, who encountered this in a real project (rust-lang/cc-rs#954)
Meta
rustc +nightly --version --verbose:
rustc 1.78.0-nightly (2bf78d12d 2024-02-18)
binary: rustc
commit-hash: 2bf78d12d33ae02d10010309a0d85dd04e7cff72
commit-date: 2024-02-18
host: x86_64-unknown-linux-gnu
release: 1.78.0-nightly
LLVM version: 18.1.0
(Also happens on 1.76 stable.)