Specifying linkage on externs silently removes indirection #31508

Open
@jethrogb


When compiling the following code:

// externs-c.c
unsigned char myarr[10]={1,2,3,4,5,6,7,8,9,10};
unsigned char (*implicitvar)[10]=&myarr;
unsigned char (*explicitvar)[10]=&myarr;
// externs-rust.rs
#![feature(linkage)]

extern {
    static implicitvar: *const [u8;10];
    // Should have no effect, external linkage is the default in an extern block
    #[linkage="external"]
    static explicitvar: *const [u8;10];
}

fn as_option(p: *const [u8;10]) -> Option<&'static [u8;10]> {
    unsafe{std::mem::transmute(p)}
}

fn main() {
    println!("implicitvar = {:?}",as_option(implicitvar));
    println!("explicitvar = {:?}",as_option(explicitvar));
}

using

clang -c externs-c.c -o externs.o && rustc externs-rust.rs -o externs -C link-args=./externs.o

running ./externs will output something like the following:

implicitvar = Some([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
explicitvar = Some([168, 4, 122, 85, 85, 85, 0, 0, 0, 0])

Wat.

Taking a look at the IR:

; externs-c.ll
@myarr = global [10 x i8] c"\01\02\03\04\05\06\07\08\09\0A", align 1
@implicitvar = global [10 x i8]* @myarr, align 8
@explicitvar = global [10 x i8]* @myarr, align 8
; externs-rust.ll
@implicitvar = external global [10 x i8]*
@explicitvar = external global [10 x i8]
@_rust_extern_with_linkage_explicitvar = internal global [10 x i8]* @explicitvar

So Rust removes a layer of indirection: it declares the external symbol as static explicitvar: [u8;10] and adds a new internal variable static _rust_extern_with_linkage_explicitvar: *const [u8;10]=&explicitvar. Every mention of explicitvar in Rust source code is then replaced with _rust_extern_with_linkage_explicitvar. As a result, the symbol explicitvar has one type in C and a different type in Rust! To get “correct” behavior in the example above, you would need to declare static explicitvar: *const *const [u8;10] instead.

This weird asymmetry between the types associated with symbols in Rust and in C is a source of great confusion and can easily lead to bugs. In the example above, we just read 2 bytes past the end of an 8-byte pointer by interpreting its storage as a 10-byte array.
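The mismatch can be imitated on a stable compiler without any linkage attribute. The following is only a sketch: `Slot`, `read_symbol_as_array` and `read_through_pointer` are hypothetical names that do not appear in rustc, and `Slot` is padded so that the 10-byte read stays inside the object instead of running past the pointer as it does in the real repro.

```rust
// Hypothetical stand-in for the C variable `explicitvar`: one pointer,
// followed by padding so a 10-byte read from its address stays in bounds.
#[repr(C)]
struct Slot {
    ptr: *const [u8; 10],
    _pad: [u8; 8],
}

/// Models the surprising behavior: the address of the symbol itself is
/// treated as the address of the array, so the pointer's own bytes are read.
fn read_symbol_as_array(slot: &Slot) -> [u8; 10] {
    let symbol_addr = slot as *const Slot as *const [u8; 10];
    // In bounds: `Slot` is at least 12 bytes, and `[u8; 10]` has alignment 1.
    unsafe { *symbol_addr }
}

/// Models the `*const *const [u8;10]` declaration: the symbol address is a
/// pointer to the C pointer, which must be loaded before reaching the array.
fn read_through_pointer(slot: &Slot) -> [u8; 10] {
    let symbol_addr = slot as *const Slot as *const *const [u8; 10];
    unsafe { **symbol_addr }
}

fn main() {
    let myarr: [u8; 10] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    let slot = Slot { ptr: &myarr, _pad: [0; 8] };

    // Prints the raw bytes of the pointer (values vary per run).
    println!("symbol as array = {:?}", read_symbol_as_array(&slot));
    // Prints the contents of `myarr`.
    println!("through pointer = {:?}", read_through_pointer(&slot));
}
```

The first line of output has the same shape as the explicitvar line in the report: the leading bytes are an address, not array data.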

This weird behavior was introduced in #12556 (see also #11978), the rationale being weak linkage and the fact that some pointers can't be null in the Rust type system. While that is true, I don't think it's sufficient rationale to add this layer of indirection. I think the layer of indirection should be removed completely. For weak linkage, a restriction can be added to allow only zeroable types.
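As an illustration of what "zeroable" could mean here (this is one reading of the proposal, not something rustc implements under that name), raw pointers and Option<&T> both have a valid all-zero value. The sketch below reuses as_option from the repro, marked unsafe, and checks that a null pointer maps to None; `Arr` and `MYARR` are names introduced only for this example.

```rust
use std::mem::{size_of, transmute};
use std::ptr;

type Arr = [u8; 10];

/// Same idea as `as_option` in the repro.
///
/// Safety: `p` must be null or point to an array that lives for 'static.
unsafe fn as_option(p: *const Arr) -> Option<&'static Arr> {
    // Relies on the guaranteed layout of Option<&T>: pointer-sized, null = None.
    transmute(p)
}

fn main() {
    static MYARR: Arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

    // Option<&T> adds no tag: it is exactly as large as a raw pointer.
    assert_eq!(size_of::<Option<&Arr>>(), size_of::<*const Arr>());

    // An unresolved weak symbol would read as all zeroes, i.e. a null pointer.
    println!("null  -> {:?}", unsafe { as_option(ptr::null()) });
    println!("valid -> {:?}", unsafe { as_option(&MYARR) });
}
```

A plain &T has no zero value, which is why it would fall outside such a restriction.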

Labels: A-codegen (Area: Code generation), A-linkage (Area: linking into static, shared libraries and binaries), C-bug (Category: This is a bug.), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.), requires-nightly (This issue requires a nightly compiler in some way.)
