JIT code optimization inconsistency #51587

Closed
@giladfrid009

Description

When a method contains a type switch and the type is known at JIT compile time, the cases that cannot match get jitted away.

Indeed this happens for the following implementations:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static T AddSwitch(T left, T right)
{       
    return (left, right) switch
    {
        (byte L, byte R) => (T)(object)(byte)(L + R),
        (sbyte L, sbyte R) => (T)(object)(sbyte)(L + R),
        (ushort L, ushort R) => (T)(object)(ushort)(L + R),
        (short L, short R) => (T)(object)(short)(L + R),
        (uint L, uint R) => (T)(object)(uint)(L + R),
        (int L, int R) => (T)(object)(int)(L + R),
        (ulong L, ulong R) => (T)(object)(ulong)(L + R),
        (long L, long R) => (T)(object)(long)(L + R),
        (float L, float R) => (T)(object)(float)(L + R),
        (double L, double R) => (T)(object)(double)(L + R),
        _ => throw new NotSupportedException(typeof(T).Name)
    };
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static T AddIfs(T left, T right)
{       
    if (typeof(T) == typeof(byte)) return (T)(object)(byte)((byte)(object)left + (byte)(object)right);
    if (typeof(T) == typeof(sbyte)) return (T)(object)(sbyte)((sbyte)(object)left + (sbyte)(object)right);
    if (typeof(T) == typeof(ushort)) return (T)(object)(ushort)((ushort)(object)left + (ushort)(object)right);
    if (typeof(T) == typeof(short)) return (T)(object)(short)((short)(object)left + (short)(object)right);
    if (typeof(T) == typeof(uint)) return (T)(object)((uint)(object)left + (uint)(object)right);
    if (typeof(T) == typeof(int)) return (T)(object)((int)(object)left + (int)(object)right);
    if (typeof(T) == typeof(ulong)) return (T)(object)((ulong)(object)left + (ulong)(object)right);
    if (typeof(T) == typeof(long)) return (T)(object)((long)(object)left + (long)(object)right);
    if (typeof(T) == typeof(float)) return (T)(object)((float)(object)left + (float)(object)right);
    if (typeof(T) == typeof(double)) return (T)(object)((double)(object)left + (double)(object)right);

    throw new NotSupportedException(typeof(T).Name);
}

JIT code (for T = int, both methods compile down to the same single lea):

SimpleSimd.NumOps`1[[System.Int32, System.Private.CoreLib]].AddSwitch(Int32, Int32)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: lea eax, [ecx+edx]
    L0006: pop ebp
    L0007: ret

SimpleSimd.NumOps`1[[System.Int32, System.Private.CoreLib]].AddIfs(Int32, Int32)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: lea eax, [ecx+edx]
    L0006: pop ebp
    L0007: ret

The problem is that while AddIfs is properly inlined into its caller, AddSwitch is not.
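To make the call-site difference easier to reproduce, here is a minimal sketch that exercises both methods from a scalar loop. The `NumOps<T>` name comes from the disassembly above; `ScalarOps<T>`, `SumIfs`, and `SumSwitch` are hypothetical names used only for illustration, the type list is cut down to `int` and `double`, and the vectorized part of the original `Sum` is omitted.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class NumOps<T>
{
    // Reduced version of AddSwitch from the issue: only int and double cases.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static T AddSwitch(T left, T right)
    {
        return (left, right) switch
        {
            (int l, int r) => (T)(object)(l + r),
            (double l, double r) => (T)(object)(l + r),
            _ => throw new NotSupportedException(typeof(T).Name)
        };
    }

    // Reduced version of AddIfs from the issue: only int and double cases.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static T AddIfs(T left, T right)
    {
        if (typeof(T) == typeof(int)) return (T)(object)((int)(object)left + (int)(object)right);
        if (typeof(T) == typeof(double)) return (T)(object)((double)(object)left + (double)(object)right);

        throw new NotSupportedException(typeof(T).Name);
    }
}

// Hypothetical callers (not from the issue): a plain scalar loop standing in
// for the tail loop of the original Sum method.
public static class ScalarOps<T>
{
    public static T SumIfs(ReadOnlySpan<T> span)
    {
        T sum = default;
        for (int i = 0; i < span.Length; i++)
            sum = NumOps<T>.AddIfs(sum, span[i]);
        return sum;
    }

    public static T SumSwitch(ReadOnlySpan<T> span)
    {
        T sum = default;
        for (int i = 0; i < span.Length; i++)
            sum = NumOps<T>.AddSwitch(sum, span[i]);
        return sum;
    }
}
```

Both callers return the same value; the difference reported in this issue is only in the generated code, so comparing the disassembly of `SumIfs` and `SumSwitch` for `T = int` should show whether the add is inlined or emitted as a call.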

Version using AddIfs:

SimpleSimd.SimdOps`1[[System.Int32, System.Private.CoreLib]].Sum(System.ReadOnlySpan`1<Int32> ByRef)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: push edi
    L0004: push esi
    L0005: vzeroupper
    L0008: mov eax, [ecx]
    L000a: mov edx, [ecx+4]
    L000d: xor ecx, ecx
    L000f: vxorps ymm0, ymm0, ymm0
    L0013: mov esi, edx
    L0015: sar esi, 0x1f
    L0018: and esi, 7
    L001b: add esi, edx
    L001d: sar esi, 3
    L0020: test esi, esi
    L0022: jle short L0037
    L0024: mov edi, ecx
    L0026: shl edi, 5
    L0029: vmovupd ymm1, [eax+edi]
    L002e: vpaddd ymm0, ymm0, ymm1
    L0032: inc ecx
    L0033: cmp ecx, esi
    L0035: jl short L0024
    L0037: vpmulld ymm0, ymm0, [SimpleSimd.SimdOps`1[[System.Int32, System.Private.CoreLib]].Sum(System.ReadOnlySpan`1<Int32> ByRef)]
    L0040: vphaddd ymm0, ymm0, ymm0
    L0045: vphaddd ymm0, ymm0, ymm0
    L004a: vextractf128 xmm1, ymm0, 1
    L0050: vpaddd xmm0, xmm0, xmm1
    L0054: vmovd esi, xmm0
    L0058: shl ecx, 3
    L005b: cmp ecx, edx
    L005d: jge short L0069
    L005f: mov edi, [eax+ecx*4]
    L0062: add esi, edi
    L0064: inc ecx
    L0065: cmp ecx, edx
    L0067: jl short L005f
    L0069: mov eax, esi
    L006b: vzeroupper
    L006e: pop esi
    L006f: pop edi
    L0070: pop ebp
    L0071: ret

Version using AddSwitch:

SimpleSimd.SimdOps`1[[System.Int32, System.Private.CoreLib]].Sum(System.ReadOnlySpan`1<Int32> ByRef)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: push edi
    L0004: push esi
    L0005: push ebx
    L0006: vzeroupper
    L0009: mov esi, ecx
    L000b: mov edi, [esi]
    L000d: mov ecx, [esi+4]
    L0010: xor ebx, ebx
    L0012: vxorps ymm0, ymm0, ymm0
    L0016: mov edx, ecx
    L0018: sar edx, 0x1f
    L001b: and edx, 7
    L001e: add edx, ecx
    L0020: sar edx, 3
    L0023: test edx, edx
    L0025: jle short L003a
    L0027: mov eax, ebx
    L0029: shl eax, 5
    L002c: vmovupd ymm1, [edi+eax]
    L0031: vpaddd ymm0, ymm0, ymm1
    L0035: inc ebx
    L0036: cmp ebx, edx
    L0038: jl short L0027
    L003a: vpmulld ymm0, ymm0, [SimpleSimd.SimdOps`1[[System.Int32, System.Private.CoreLib]].Sum(System.ReadOnlySpan`1<Int32> ByRef)]
    L0043: vphaddd ymm0, ymm0, ymm0
    L0048: vphaddd ymm0, ymm0, ymm0
    L004d: vextractf128 xmm1, ymm0, 1
    L0053: vpaddd xmm0, xmm0, xmm1
    L0057: vmovd eax, xmm0
    L005b: shl ebx, 3
    L005e: cmp ebx, ecx
    L0060: jge short L0073
    L0062: mov edx, [edi+ebx*4]
    L0065: mov ecx, eax
    L0067: call dword ptr [0x822ec34]
    L006d: inc ebx
    L006e: cmp ebx, [esi+4]
    L0071: jl short L0062
    L0073: vzeroupper
    L0076: pop ebx
    L0077: pop esi
    L0078: pop edi
    L0079: pop ebp
    L007a: ret

The complete example can be found on SharpLab.

Expected behavior:

Both methods get inlined.

Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
