Description
Consider the following input IR:
target triple = "x86_64-unknown-linux-gnu"
define void @test1() "target-features"="+avx" {
  call void @test2()
  ret void
}

define internal void @test2() {
  call i64 @test3(<4 x i64> <i64 0, i64 1, i64 2, i64 3>)
  ret void
}

define i64 @test3(<4 x i64> %arg) noinline {
  %v = extractelement <4 x i64> %arg, i64 2
  ret i64 %v
}
Running this through opt -inline produces:
define void @test1() #0 {
  %1 = call i64 @test3(<4 x i64> <i64 0, i64 1, i64 2, i64 3>)
  ret void
}

define i64 @test3(<4 x i64> %arg) #1 {
  %v = extractelement <4 x i64> %arg, i64 2
  ret i64 %v
}

attributes #0 = { "target-features"="+avx" }
attributes #1 = { noinline }
This inlining is allowed because the target features of test1 are a superset of those of test2. However, the call to test3() now lives in test1, so the backend will lower it using ymm registers (because test1 has the avx target feature), while test3 itself expects its argument in xmm registers (because it does not have the avx target feature). Caller and callee therefore disagree on how the vector argument is passed.
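The reasoning above can be sketched as a small, self-contained model. This is not LLVM's actual implementation: the `FeatureSet` type and all three function names are hypothetical, and the sketch assumes that the only ABI-relevant feature for a <4 x i64> argument is avx. It shows that the superset rule alone accepts the inlining, even though the call to test3 moves from a call site without avx to one with it.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Toy model of a function's "target-features" attribute (hypothetical type,
// not an LLVM data structure).
using FeatureSet = std::set<std::string>;

// Superset rule described above: inlining is permitted when every feature
// of the callee is also enabled in the caller.
bool featuresAllowInlining(const FeatureSet &caller, const FeatureSet &callee) {
  return std::includes(caller.begin(), caller.end(),
                       callee.begin(), callee.end());
}

// Assumption for this sketch: a <4 x i64> argument is passed in ymm registers
// only when "avx" is enabled, so both sides of a call agree on the ABI only
// if they agree on "avx".
bool vectorCallAbiMatches(const FeatureSet &callSite, const FeatureSet &target) {
  return callSite.count("avx") == target.count("avx");
}

// Walks through the example: returns true when inlining test2 into test1 is
// permitted by the superset rule, yet leaves a call whose ABI no longer matches.
bool inliningIntroducesAbiMismatch() {
  const FeatureSet test1{"avx"};
  const FeatureSet test2{};
  const FeatureSet test3{};
  // Before inlining, the call to test3 sits in test2: neither side has avx.
  assert(vectorCallAbiMatches(test2, test3));
  // After inlining, the same call sits in test1, which has avx.
  return featuresAllowInlining(test1, test2) &&
         !vectorCallAbiMatches(test1, test3);
}
```

In this model, `inliningIntroducesAbiMismatch()` returns true for the three functions in the example, which is the situation the miscompile arises from.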
Godbolt: https://llvm.godbolt.org/z/M95svT6qY
I previously started a discussion about this on the mailing list, but did not get much response: https://groups.google.com/g/llvm-dev/c/g_6THpxasjA
(Downstream reports of this miscompile are rust-lang/rust#79865 and rust-lang/rust#91839.)