Skip to content

Array interface method devirtualization #62457

@AndyAyersMS

Description

@AndyAyersMS

The jit cannot devirtualize interface calls made through arrays, because this aspect of arrays has a special implementation in the runtime. I noticed this a while back but never wrote up anything. I was reminded of this again by #62266.

Consider a simple program like:

using System;
using System.Collections.Generic;

class X {
    public static void Main() {
        IList<int> list = new int[10];
        for (int i = 0; i < 100; i++) {
            foreach (var _ in list) {
                // nothing
            }
        }
    }
}

Here all types are known and we should be able to de-abstract this and come close to the equivalent for loop.

Currently none of the interface calls devirtualize or inline, the method allocates, and the try/finally can't be optimized away, despite the empty Dispose from the enumerator.

Fixing all this requires resolving a number of issues:

  1. resolveVirtualMethodHelper on the jit interface calls CanCastToInterface -- this bypasses special checks in CanCastTo that handle interface methods for arrays.
  2. Once that is fixed, the method that gets devirtualized to is SZArrayHelper.GetEnumerator<T> which is a generic method, not an instance method on a generic class. So the context handling in resolveVirtualMethodHelper needs to be updated here and also over in impDevirtualizeMethod -- currently we always assume class context.
  3. Similar changes are needed in crossgen2 (haven't looked into this yet)
  4. Once that is fixed, the jit can devirtualize the GetEnumerator call, but this happens "too late" to devirtualize the subsequent MoveNext and get_Current calls through the enumerator. The fix here is likely to mark these GetEnumerator methods as intrinsics and extend getSpecialIntrinsicExactReturnType to propagate the right type downstream during importation.
  5. The GetEnumerator call is too big to inline by default. Likely we can just mark this with AggressiveInlining. But pragmatically we might want to boost inlining for methods that return a type (especially a sealed type) that is more derived that the declared return type. Doing so likely requires resolving more tokens during the pre-inline scan, but perhaps we're already doing this for newobj and it might not be too costly.
  6. If we inline (and haven't fixed (4) yet) then we can late devirtualize the MoveNext and Dispose as we see the improved type flowing out of the inlined method -- but not the call to get_Current -- not sure why just yet.
  7. The enumerator is a class, not a struct; if we made it a struct we might have a shot at removing the boxing (rather than proving we can stack allocate a class). But this box would have multiple uses, so we'd also need to tackle JIT: optimizations for multi-use boxes #9118.
  8. It remains to be seen if we actually do inline all those methods and undo the boxing whether we can then enable struct promotion and remove the overhead of iterating via an enumerator.

I will point out that the "equivalent" direct code (with loop non-empty to avoid it being optimized away entirely)

    public static int M_enum() 
    {
        int[] a = new int[1000];
        int result = 0;
        foreach (var v in a)
        {
           result ^= v;
        }
        return result;
    } 

is marginally less efficient than the for loop version:

    public static int M() 
    {
        int[] a = new int[1000];
        int result = 0;
        for (int i = 0; i < a.Length; i++)
        {
           result ^= a[i]; 
        }
        return result;
    }

In the more general case where we don't know the exact collection class, we might hope PGO would get us to an equivalent place. But here there are additional challenges.

  1. The enumerator GDV doesn't allow us to know the enumerator type in the subsequent loop. We ought to notice a correlation between the type of the enumerator returned and the class type seen in the loop interface calls (or the fact that the latter are invariants) and decide to clone the loop based on a type test. That would remove the GDV tests from within the loop.
  2. Ideally then we see that the loop cloning check is partially redundant based on the outcome of the GetEnumerator GDV test and simplify that path further. So, in the "good" case we do one type test to see what kind of collection we have, and then branch to a specialized loop that knows exactly what kind of enumerator we have.

This is perhaps the simplest case of de-abstraction. Other collection types have more complex enumerators.

category:cq
theme:devirtualization
skill-level:expert
cost:small
impact:medium

Metadata

Metadata

Assignees

Labels

area-CodeGen-coreclrCLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Type

No type

Projects

Status

Done

Relationships

None yet

Development

No branches or pull requests

Issue actions