Description
With the example taken from #109379:
AggregateDelegate(Enumerable.Range(0, 100_000).ToArray(), (acc, v) => acc + v, 0);

[MethodImpl(MethodImplOptions.NoInlining)]
public static T AggregateDelegate<T>(T[] ar, Func<T, T, T> func, T seed)
{
    for (int i = 0; i < ar.Length; i++)
    {
        seed = func(seed, ar[i]);
    }
    return seed;
}
Once the fix in #109407 is applied, we end up with the following x64 codegen for the inner loop in the good case:
G_M53109_IG04:  ;; offset=0x002A
       mov      eax, dword ptr [rcx]
       add      r8d, eax
       add      rcx, 4
       dec      edx
       jne      SHORT G_M53109_IG04
which looks great. However, for arm64, we end up with the following codegen:
G_M53109_IG03:  ;; offset=0x0024
       cbz      x19, G_M53109_IG07
       ldr      x3, [x19, #0x18]
       movz     x20, #0x828      // code for C+<>c:<Main>b__0_0(int,int):int:this
       movk     x20, #540 LSL #16
       movk     x20, #0x7FFA LSL #32
       cmp      x3, x20
       bne      G_M53109_IG07
       add      x21, x0, #16
       mov      x0, xzr
       align    [0 bytes for IG04]
       ;; size=36 bbWeight=0.99 PerfScore 7.95
G_M53109_IG04:  ;; offset=0x0048
       ldr      w3, [x21, x0]
       add      x0, x0, #4
       add      w2, w2, w3
       sub      w1, w1, #1
       cbnz     w1, G_M53109_IG04
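where we were unable to fully strength reduce: the load still addresses the array as base plus offset ([x21, x0]) and advances the offset separately, instead of loading through a single advancing pointer as in the x64 loop. The problem is a multi-def CSE that was not put in SSA, which makes IV opts unable to reason about it.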
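For readers less familiar with the transformation, here is a rough source-level sketch of the loop shape IV opts is aiming for, compared with the plain indexed loop. It is only an illustration, not what the JIT does internally: the class and method names are made up for this comment, the delegate call is replaced by a plain addition (as if it had already been devirtualized and inlined), and the second method hand-writes the "advance a pointer, count down to zero" shape seen in the x64 codegen above.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical names, for illustration only.
public static class StrengthReductionSketch
{
    // Indexed form: conceptually, each iteration addresses the element
    // as (array data base + i * sizeof(int)).
    public static int SumIndexed(int[] ar)
    {
        int sum = 0;
        for (int i = 0; i < ar.Length; i++)
        {
            sum += ar[i];
        }
        return sum;
    }

    // Hand-written equivalent of the fully strength-reduced shape:
    // one reference that advances by one element per iteration, plus a
    // down-counting trip count compared against zero.
    public static int SumStrengthReduced(int[] ar)
    {
        int sum = 0;
        int remaining = ar.Length;
        if (remaining == 0)
        {
            return sum;
        }

        ref int p = ref MemoryMarshal.GetArrayDataReference(ar);
        do
        {
            sum += p;                     // load through the current reference
            p = ref Unsafe.Add(ref p, 1); // advance by one int (4 bytes)
            remaining--;
        }
        while (remaining != 0);

        return sum;
    }
}
```

Both methods return the same value for any non-null array. In the arm64 loop above, the down-counting trip count is present, but the address is still formed from two registers, which corresponds to a halfway point between these two shapes.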