Description
When compiling the following code:
// externs-c.c
unsigned char myarr[10]={1,2,3,4,5,6,7,8,9,10};
unsigned char (*implicitvar)[10]=&myarr;
unsigned char (*explicitvar)[10]=&myarr;
// externs-rust.rs
#![feature(linkage)]
extern {
    static implicitvar: *const [u8; 10];
    // Should have no effect: external linkage is the default in an extern block
    #[linkage = "external"]
    static explicitvar: *const [u8; 10];
}
fn as_option(p: *const [u8; 10]) -> Option<&'static [u8; 10]> {
    unsafe { std::mem::transmute(p) }
}
fn main() {
    println!("implicitvar = {:?}", as_option(implicitvar));
    println!("explicitvar = {:?}", as_option(explicitvar));
}
using
clang -c externs-c.c && rustc externs-rust.rs -C link-args=./externs-c.o
running ./externs
will output something like the following:
implicitvar = Some([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
explicitvar = Some([168, 4, 122, 85, 85, 85, 0, 0, 0, 0])
Wat.
Taking a look at the IR:
; externs-c.ll
@myarr = global [10 x i8] c"\01\02\03\04\05\06\07\08\09\0A", align 1
@implicitvar = global [10 x i8]* @myarr, align 8
@explicitvar = global [10 x i8]* @myarr, align 8
; externs-rust.ll
@implicitvar = external global [10 x i8]*
@explicitvar = external global [10 x i8]
@_rust_extern_with_linkage_explicitvar = internal global [10 x i8]* @explicitvar
So rustc removes a layer of indirection: it declares the symbol as static explicitvar: [u8;10] and adds a new variable, static _rust_extern_with_linkage_explicitvar: *const [u8;10]=&explicitvar. All mentions of explicitvar in the Rust source code are then replaced with _rust_extern_with_linkage_explicitvar. As a result, the C declaration and the Rust declaration of the same symbol no longer have the same type! To get “correct” behavior in the example above, you would need to declare static explicitvar: *const *const [u8;10] instead.
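To make the extra level concrete, here is a minimal sketch that models the situation with local variables instead of linked symbols, so it runs on stable Rust without `#![feature(linkage)]`. The names `symbol`, `wrapper`, and `read_through_wrapper` are made up for illustration; they stand in for the C symbol and for `_rust_extern_with_linkage_explicitvar`.

```rust
/// Follows both levels of indirection: wrapper -> symbol storage -> array.
///
/// # Safety
/// `wrapper` must point to a valid pointer, which must point to a valid array.
unsafe fn read_through_wrapper(wrapper: *const *const [u8; 10]) -> [u8; 10] {
    unsafe { **wrapper }
}

fn main() {
    let myarr: [u8; 10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    // Models the C symbol `explicitvar`: storage that holds a pointer to `myarr`.
    let symbol: *const [u8; 10] = &myarr;
    // Models the synthesized wrapper: its value is the ADDRESS of the symbol.
    let wrapper: *const *const [u8; 10] = &symbol;

    // With the doubled pointer type, the array is recovered correctly.
    let recovered = unsafe { read_through_wrapper(wrapper) };
    println!("recovered = {:?}", recovered);

    // The wrapper does not point at the array itself, which is why treating
    // it as `*const [u8; 10]` reads the pointer's own bytes instead.
    assert_ne!(wrapper as usize, &myarr as *const [u8; 10] as usize);
}
```

This only mirrors the pointer shapes involved; it does not reproduce the linker behavior itself.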
This weird asymmetry between the types associated with symbols in Rust and in C is a source of great confusion and can easily lead to bugs. In the example above, we just read 2 bytes past the end of an 8-byte pointer by interpreting its storage as a 10-byte array.
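The bytes printed for explicitvar above are consistent with that reading. As a rough check, assuming a 64-bit little-endian target (the `align 8` in the IR suggests 8-byte pointers), the first 8 bytes decode to a plausible address and the remaining 2 are whatever happened to follow the pointer in memory. The helper names below are invented for this sketch.

```rust
use std::mem::size_of;

/// Bytes printed for `explicitvar` in the report.
const OBSERVED: [u8; 10] = [168, 4, 122, 85, 85, 85, 0, 0, 0, 0];

/// Interprets the first 8 bytes as a little-endian 64-bit address.
fn leading_address(bytes: &[u8; 10]) -> u64 {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(raw)
}

/// Bytes read beyond an 8-byte pointer when it is treated as `[u8; 10]`.
/// `u64` stands in for the pointer so the result does not depend on the host.
fn overread_on_64_bit() -> usize {
    size_of::<[u8; 10]>() - size_of::<u64>()
}

fn main() {
    println!("address   = {:#x}", leading_address(&OBSERVED)); // 0x5555557a04a8
    println!("over-read = {} bytes", overread_on_64_bit()); // 2 bytes
}
```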
This behavior was introduced in #12556 (see also #11978); the rationale was weak linkage and the fact that some pointers can't be null in the Rust type system. While that is true, I don't think it is sufficient reason to add this layer of indirection. I think the layer of indirection should be removed completely. For weak linkage, a restriction can be added to allow only zeroable types.
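As an illustration of what "zeroable" could mean here (this is one reading, not an existing compiler rule): raw pointers and `Option<&T>` both have an all-zero value meaning "nothing there", whereas a bare `&T` does not. The sketch below also replaces the `transmute` in `as_option` with the standard `as_ref` method on raw pointers; `MYARR` is a stand-in for the C array.

```rust
use std::mem::size_of;

// Stand-in for the C array `myarr`.
static MYARR: [u8; 10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

/// Converts a possibly-null raw pointer into an optional reference.
///
/// # Safety
/// If non-null, `p` must point to a valid `[u8; 10]` that lives for `'static`.
unsafe fn as_option(p: *const [u8; 10]) -> Option<&'static [u8; 10]> {
    unsafe { p.as_ref() }
}

fn main() {
    // `Option<&T>` uses the null pointer as `None`, so it is pointer-sized.
    assert_eq!(size_of::<Option<&[u8; 10]>>(), size_of::<*const [u8; 10]>());

    // A null pointer maps to `None`; a valid one maps to `Some`.
    let missing = unsafe { as_option(std::ptr::null()) };
    let present = unsafe { as_option(&MYARR) };
    println!("missing = {:?}", missing);
    println!("present = {:?}", present);
}
```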