Skip to content

Miscompilation of SIMD when crossing target_feature boundaries #55059

Closed
@raphlinus

Description

@raphlinus

This is a reduced example of a problem I've run into trying to make safe SIMD wrappers. The idea here is to have a newtype that can only be constructed when the capability is dynamically detected. However, the compiler seems to get confused about calling conventions when calling into code with target_feature enabled from code that doesn't.

#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

#[target_feature(enable = "avx")]
unsafe fn avx_mul(a: __m256, b: __m256) -> __m256 {
    _mm256_mul_ps(a, b)
}

#[target_feature(enable = "avx")]
unsafe fn avx_store(p: *mut f32, a: __m256) {
    _mm256_storeu_ps(p, a)
}

#[target_feature(enable = "avx")]
unsafe fn avx_setr(a: f32, b: f32, c: f32, d: f32, e: f32, f: f32, g: f32, h: f32) -> __m256 {
    _mm256_setr_ps(a, b, c, d, e, f, g, h)
}

#[target_feature(enable = "avx")]
unsafe fn avx_set1(a: f32) -> __m256 {
    _mm256_set1_ps(a)
}

struct Avx(__m256);

fn mul(a: Avx, b: Avx) -> Avx {
    unsafe { Avx(avx_mul(a.0, b.0)) }
}

fn set1(a: f32) -> Avx {
    unsafe { Avx(avx_set1(a)) }
}

fn setr(a: f32, b: f32, c: f32, d: f32, e: f32, f: f32, g: f32, h: f32) -> Avx {
    unsafe { Avx(avx_setr(a, b, c, d, e, f, g, h)) }
}

unsafe fn store(p: *mut f32, a: Avx) {
    avx_store(p, a.0);
}

pub fn main() {
    let mut result = [0.0f32; 8];
    let a = mul(setr(0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0), set1(0.25));
    unsafe { store(result.as_mut_ptr(), a)}
    println!("{:?}", result);
}

(Playground)

Output:

[0.0, 5.0, 12.0, 21.0, 0.0, 0.0, 0.0, 0.0]

Errors:

   Compiling playground v0.0.1 (file:///playground)
    Finished release [optimized] target(s) in 0.59s
     Running `target/release/playground`

In a debug build, the answer is [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75] as expected. Notice that the first 3 values of the miscompiled version are [0, 1, 2, 3] * [4, 5, 6, 7], suggesting that the halves are getting scrambled (and this is confirmed by looking at the generated asm).

Also, this just crashes on Windows.

Same miscompilation happens if I move the Avx() newtype wrapper up into the top four functions.

It's possible I don't understand the rules for what's safe to do in SIMD. If that's the case, the limitations on passing values across function boundaries should be documented.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions