JIT: optimizations for multi-use boxes #9118

Closed
@AndyAyersMS

A fairly common pattern (especially after inlining) is a box that feeds an isinst and then, if the isinst succeeds, an unbox.any. For example:

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

internal class ObjectEqualityComparer<T> : EqualityComparer<T>
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public override bool Equals(T x, T y)
    {
        if (x != null)
        {
            if (y != null) return x.Equals(y);
            return false;
        }
        if (y != null) return false;
        return true;
    }
    
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public override int GetHashCode(T obj) => obj?.GetHashCode() ?? 0;
    
    // Equals method for the comparer itself.
    public override bool Equals(Object obj) =>
        obj != null && GetType() == obj.GetType();

    public override int GetHashCode() =>
        GetType().GetHashCode();
}

class C
{
    public static int Main()
    {
        var comp = new ObjectEqualityComparer<int>();
        bool result = comp.Equals(3, 4);
        return result ? 0 : 100;
    }
}

We get pretty far when optimizing Main here: we can devirtualize the call to Equals, inline it, remove the null checks (since T is a value type), and then inline the inner call to Equals. But along the way we have to box y, because the inner Equals (for T = int, Int32.Equals(object)) takes an object. That inner Equals has the following IL:

IL_0000  03                ldarg.1     
IL_0001  75 f1 00 00 02    isinst       0x20000F1
IL_0006  2d 02             brtrue.s     2 (IL_000a)
IL_0008  16                ldc.i4.0    
IL_0009  2a                ret         
IL_000a  02                ldarg.0     
IL_000b  4a                ldind.i4    
IL_000c  03                ldarg.1     
IL_000d  a5 f1 00 00 02    unbox.any    0x20000F1
IL_0012  fe 01             ceq         
IL_0014  2a                ret         
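
For readers who prefer source to IL, the listing above corresponds roughly to the C# below. This is a sketch rather than the runtime's actual source: `MyInt32` and `_value` are stand-in names, and the real method reads the int through `this` (the `ldind.i4`) instead of through a field.

```csharp
using System;

// Hypothetical stand-in for the inner Equals; MyInt32 and _value are illustrative names.
struct MyInt32
{
    private readonly int _value;

    public MyInt32(int value) { _value = value; }

    public override bool Equals(object obj)
    {
        // isinst + brtrue.s: bail out unless obj is a boxed int.
        if (!(obj is int))
        {
            return false;
        }

        // unbox.any + ceq: the cast re-checks the type before reading the payload.
        return _value == (int)obj;
    }

    public override int GetHashCode() => _value;
}
```

Note that the same type is checked twice, once by the `is` test and once by the cast. When the argument is a freshly created box of a known type, both checks are redundant.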

With the advent of dotnet/coreclr#14420 the JIT now optimizes away the isinst, but the box cleanup optimizations for unbox.any don't fire, because there is usually a temp between the box and its use. So we generate the following code for Main:

G_M4930_IG01:
       57                   push     rdi
       56                   push     rsi
       4883EC28             sub      rsp, 40

G_M4930_IG02:
; ** BOX (y) **
       48B9086014E2FA7F0000 mov      rcx, 0x7FFAE2146008
       E8BB3C835F           call     CORINFO_HELP_NEWSFAST   
       C7400804000000       mov      dword ptr [rax+8], 4
       488BF0               mov      rsi, rax
       4885F6               test     rsi, rsi                 ; gratuitous null check ?
       7504                 jne      SHORT G_M4930_IG03
       33FF                 xor      edi, edi
       EB2D                 jmp      SHORT G_M4930_IG05

G_M4930_IG03:
; * UNBOX.ANY type check
       48BA086014E2FA7F0000 mov      rdx, 0x7FFAE2146008
       483916               cmp      qword ptr [rsi], rdx
       7412                 je       SHORT G_M4930_IG04
; * call helper if type check fails (which it won't)
       488BD6               mov      rdx, rsi
       48B9086014E2FA7F0000 mov      rcx, 0x7FFAE2146008
       E8470B395F           call     CORINFO_HELP_UNBOX

G_M4930_IG04:
       837E0803             cmp      dword ptr [rsi+8], 3
       400F94C7             sete     dil
       400FB6FF             movzx    rdi, dil

G_M4930_IG05:
       85FF                 test     edi, edi
       750C                 jne      SHORT G_M4930_IG07
       B864000000           mov      eax, 100

G_M4930_IG06:
       4883C428             add      rsp, 40
       5E                   pop      rsi
       5F                   pop      rdi
       C3                   ret

G_M4930_IG07:
       33C0                 xor      eax, eax

G_M4930_IG08:
       4883C428             add      rsp, 40
       5E                   pop      rsi
       5F                   pop      rdi
       C3                   ret

If, when optimizing a successful cast, we copy the result to a new, more strongly typed temp (see #9117), we might be able to optimize away the type equality check in the downstream unbox.any (see dotnet/coreclr#14473). And if we are lucky and the box is simple, we might be able to propagate the value being boxed through the box/unbox to its ultimate use, and so not need the unbox at all. But the box itself would remain: a box is difficult to remove unless it is known to be dead and whatever transformation makes it dead explicitly cleans it up.

A couple of ways we could approach this:

  • The optimizer could reason about and propagate boxes, which might trigger the box/unbox.any peephole and turn the result into a simple copy.
  • BOX is just an expression "wrapper", in the spirit of #9056 (JIT: some ideas on high-level representation of runtime operations in IR). So we could allow the inliner to give BOX(y) the same treatment as y and duplicate it within the inlinee body (essentially, generalize the logic in impInlineFetchArg that begins with else if (argInfo.argIsLclVar && !argCanBeModified) so that it also applies to BOX(y)). If we added suitable "reference counting" to boxes to track the duplicates, then optimizing away the last use of the box could trigger the box cleanup. We have this today, but the reference count is implicit and always 1, since we don't duplicate the boxed values.
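
The "reference counting" idea in the second bullet can be pictured with a toy model. This is only an illustration of the bookkeeping, not JIT code: `BoxUseTracker` and its members are hypothetical names, and a real implementation would live in the JIT's IR rather than in a standalone class.

```csharp
using System;

// Toy model only: BoxUseTracker is a hypothetical name, not a JIT data structure.
sealed class BoxUseTracker
{
    // One use exists as soon as the box is created (today's implicit count).
    public int UseCount { get; private set; } = 1;

    public bool IsRemoved { get; private set; }

    // The inliner duplicates BOX(y) at another use site.
    public void AddDuplicate()
    {
        if (IsRemoved) throw new InvalidOperationException("box already removed");
        UseCount++;
    }

    // An optimization (for example a box/unbox.any peephole) eliminates one use.
    // Returns true when this was the last use, so the box itself can be cleaned up.
    public bool ReleaseUse()
    {
        if (IsRemoved) throw new InvalidOperationException("box already removed");
        UseCount--;
        IsRemoved = UseCount == 0;
        return IsRemoved;
    }
}
```

In this model a box duplicated once needs two `ReleaseUse` calls before cleanup fires; with no duplication the count stays at 1, which matches the current behavior described above.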

If all this kicked in, the code for Main above would collapse to simply returning a constant.
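
To make the two end points concrete, here is a hand-written C# approximation. `MainAfterInlining` is roughly the shape left once both Equals calls are inlined (before the isinst is optimized away), and `MainIdeal` is the hoped-for result. The class and method names are made up for illustration; the JIT works on IR, not on source like this.

```csharp
using System;

static class Sketch
{
    // Roughly what remains after devirtualizing and inlining both Equals calls.
    public static int MainAfterInlining()
    {
        int x = 3;
        object boxedY = 4;                 // BOX(y)

        bool result;
        if (boxedY is int)                 // isinst
        {
            result = x == (int)boxedY;     // unbox.any, then ceq
        }
        else
        {
            result = false;
        }

        return result ? 0 : 100;
    }

    // The hoped-for result once the box, type check, and unbox are all removed.
    public static int MainIdeal() => 100;
}
```

Both methods return 100, since 3 != 4 makes `result` false.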

category:cq
theme:importer
skill-level:expert
cost:medium

Labels: Priority:2, area-CodeGen-coreclr, enhancement, optimization, tenet-performance
