Description
For a test:
[MethodImpl(MethodImplOptions.NoInlining)]
public static Vector128Wrapped TestReturnVector128WrappedPromoted()
{
    var a = new Vector128Wrapped
    {
        f = Vector128<short>.Zero
    };
    return a;
}
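For anyone trying to reproduce this, here is a minimal self-contained sketch of the test. The definition of `Vector128Wrapped` is not shown in this issue, so the single-field struct below is an assumption based on the `f` field and the 16-byte size in the dumps; `ReturnWrappedVectorSketch` and `ReturnsZero` are hypothetical names added only for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

// Assumed shape: the issue only shows a field named `f` holding a
// Vector128<short>, so this single-field wrapper is a guess.
public struct Vector128Wrapped
{
    public Vector128<short> f;
}

// Hypothetical container class, not part of the original test.
public static class ReturnWrappedVectorSketch
{
    // NoInlining keeps the return in its own method so the JIT has to
    // materialize the struct return according to the ABI.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Vector128Wrapped TestReturnVector128WrappedPromoted()
    {
        var a = new Vector128Wrapped
        {
            f = Vector128<short>.Zero
        };
        return a;
    }

    // Hypothetical helper: true when the returned field is still all zeros.
    public static bool ReturnsZero()
    {
        Vector128Wrapped result = TestReturnVector128WrappedPromoted();
        return result.f.Equals(Vector128<short>.Zero);
    }
}
```

Calling `ReturnWrappedVectorSketch.ReturnsZero()` only checks the returned value; the difference described in this issue shows up in the JIT dump and the generated code, not necessarily in the result.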
The native arm64 JIT uses:
N001 ( 3, 2) [000015] ------------ t15 = LCL_FLD simd16 V00 loc0 [+0]
* simd16 V00.f (offs=0x00) -> V03 tmp1
/--* t15 simd16
N002 ( 4, 3) [000016] ------------ * RETURN simd16 $1c1
and generates:
IN0003: 000018 ldr q16, [fp,#16] // [V00 loc0]
IN0004: 00001C mov v0.16b, v16.16b
whereas the altjit uses:
N001 ( 3, 2) [000015] -c-----N---- t15 = LCL_VAR struct<Vector128Wrapped, 16>(P) V00 loc0
* simd16 V00.f (offs=0x00) -> V03 tmp1 $1c0
/--* t15 struct
N002 ( 4, 3) [000016] ------------ * RETURN struct $1c1
and generates:
IN0003: 000018 ldr x0, [fp,#16] // [V00 loc0]
IN0004: 00001C ldr x1, [fp,#24] // [V00 loc0+0x08]
That was found in #37745.
The issue is probably in this VM code:
runtime/src/coreclr/src/vm/class.cpp
Lines 1655 to 1668 in 255eea0
We run that code on native arm64, but it is disabled for altjit.
This issue blocks an optimization that, with the current implementation, would fail under altjit.
category:correctness
theme:altjit
skill-level:intermediate
cost:medium