Compiled executable fails to launch when built with AVX and LTO enabled #44056

Closed
@yvt

A generated executable occasionally fails to launch when built with the rustc options -Ctarget-feature=+avx -Copt-level=3 -Clto.

I tried this code:

fn main(){}

Compiled with the following shell script:

#!/bin/sh
rustc main.rs -Ctarget-feature=+avx -C opt-level=3 -Clto -g

When I ran the generated executable main repeatedly, the program stalled in 5 out of 100 runs: it neither terminated nor produced any output, and it never even entered the main function.

When I ran the executable from lldb, I could see that EXC_BAD_ACCESS had occurred because the program attempted to load a 32-byte block from an unaligned memory address using vmovdqa (which requires the operand address to be 32-byte aligned).

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x0000000100000bf6 main`main + 518
main`main:
->  0x100000bf6 <+518>: vmovdqa (%rax), %ymm0
    0x100000bfa <+522>: movl   $0x1, %ecx
    0x100000bff <+527>: vmovq  %rcx, %xmm1
    0x100000c04 <+532>: vmovdqa %ymm1, (%rax)
(lldb) register read
General Purpose Registers:
       rax = 0x0000000100300470
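
As a quick sanity check on the register dump, the following minimal sketch tests the value of rax against the 32-byte alignment that vmovdqa needs for a ymm operand. The helper name `is_aligned` is hypothetical and only used for illustration; the address is copied from the lldb output above.

```rust
/// Hypothetical helper: returns true when `addr` is a multiple of `align`,
/// where `align` must be a power of two.
fn is_aligned(addr: u64, align: u64) -> bool {
    assert!(align.is_power_of_two());
    addr & (align - 1) == 0
}

fn main() {
    // Value of %rax copied from the lldb register dump.
    let rax: u64 = 0x0000_0001_0030_0470;

    // The low bits are 0x70: a multiple of 16, but not of 32.
    println!("16-byte aligned: {}", is_aligned(rax, 16));
    println!("32-byte aligned: {}", is_aligned(rax, 32));
}
```

The address is 16-byte aligned but not 32-byte aligned, which is consistent with the general-protection fault reported for the aligned 32-byte load.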

Meta

rustc --version --verbose:

rustc 1.21.0-nightly (469a6f9bd 2017-08-22)
binary: rustc
commit-hash: 469a6f9bd9aef394c5cff6b3bc41b8c520f9515b
commit-date: 2017-08-22
host: x86_64-apple-darwin
release: 1.21.0-nightly
LLVM version: 4.0

The output of sample (a tool that comes with macOS) when the program is stalled:

Call graph:
    2721 Thread_15178881   DispatchQueue_1: com.apple.main-thread  (serial)
      2721 start  (in libdyld.dylib) + 1  [0x7fffa220d235]
        2721 0x0
          2721 _sigtramp  (in libsystem_platform.dylib) + 26  [0x7fffa241cb3a]
            2721 std::sys::imp::stack_overflow::imp::signal_handler  (in main) + 125  [0x105c58b7d]  mem.rs:609

Analysis

The offending instruction appears to be part of libcore::ptr::swap_nonoverlapping_bytes, which is called from libstd::thread::local::LocalKey::init, which in turn runs while the runtime is being initialized.

#[inline]
unsafe fn swap_nonoverlapping_bytes(x: *mut u8, y: *mut u8, len: usize) {
    // <snip>
    #[cfg_attr(not(any(target_os = "emscripten", target_os = "redox",
                       target_endian = "big")),
               repr(simd))]
    struct Block(u64, u64, u64, u64);
    // <snip>
        // Swap a block of bytes of x & y, using t as a temporary buffer
        // This should be optimized into efficient SIMD operations where available
        copy_nonoverlapping(x, t, block_size); // <--- HERE
    // <snip>
}

After the optimization, this call to the intrinsic function copy_nonoverlapping is translated into the following LLVM instruction:

%t.0.copyload.i.i.i.i.i.i.i.i.i = load <4 x i64>, <4 x i64>* bitcast ({ { { i64, [32 x i8] } }, { { i1 } }, { { i1 } }, [6 x i8] }* @_ZN3std10sys_common11thread_info11THREAD_INFO7__getit5__KEY17h80e4cdc49b84860aE to <4 x i64>*), align 32, !dbg !3742, !noalias !3762

Because the load is annotated with align 32, it is lowered to the following aligned-move x86_64 instruction:

vmovdqa (%rax), %ymm0
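
To illustrate the alignment assumption involved, here is a minimal sketch, not the actual libcore code and not necessarily how this issue was resolved. It assumes a hypothetical stand-in `Block` type declared with `#[repr(align(32))]` to mimic the alignment of a 256-bit SIMD vector, and a hypothetical helper `load_block`. It uses `std::ptr::read_unaligned`, which makes no alignment promise about the source pointer, so the compiler cannot assume 32-byte alignment for that load.

```rust
use std::mem;
use std::ptr;

// Hypothetical stand-in for the `Block` type in the excerpt above.
// `align(32)` mimics the alignment of a 256-bit SIMD vector type.
#[repr(C, align(32))]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Block([u64; 4]);

/// Hypothetical helper: loads one `Block` from the start of `bytes`
/// without assuming anything about the alignment of the slice.
fn load_block(bytes: &[u8]) -> Block {
    assert!(bytes.len() >= mem::size_of::<Block>());
    // SAFETY: the length was checked above, every bit pattern is a valid
    // `Block`, and `read_unaligned` has no alignment requirement.
    unsafe { ptr::read_unaligned(bytes.as_ptr() as *const Block) }
}

fn main() {
    // A byte buffer only guarantees 1-byte alignment; skipping the first
    // byte gives a source that is generally not 32-byte aligned.
    let buf = [7u8; 33];
    let block = load_block(&buf[1..]);

    println!("align_of::<Block>() = {}", mem::align_of::<Block>());
    println!("{:?}", block);
}
```

By contrast, dereferencing a `*const Block` directly tells the compiler that the address is 32-byte aligned, which is the kind of promise the `align 32` annotation on the load above encodes.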

Labels

A-codegen: Area: Code generation
C-bug: Category: This is a bug.
I-crash: Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics.
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.