Promote (scalar replace) structs with more than 4 fields #6534

Closed
@AndreyAkinshin

Let's look at the following code (based on this StackOverflow question):

using System.Runtime.CompilerServices; // MethodImpl, MethodImplOptions

public struct Cards3
{
    public byte C0, C1, C2;
}

public struct Cards8
{
    public byte C0, C1, C2, C3, C4, C5, C6, C7;
}

class Program
{
    static void Main()
    {           
        Run3();
        Run8();
    }

    private static Cards3[] cards3 = new Cards3[1];
    private static Cards8[] cards8 = new Cards8[1];

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Run3()
    {
        var c = cards3[0];
        return c.C0 - c.C1;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Run8()
    {
        var c = cards8[0];
        return c.C0 - c.C1;
    }
}

Now let's look at the asm code (Windows 10, .NET Framework 4.6.1 (4.0.30319.42000), clrjit-v4.6.1080.0):

; Run3                                                    
            var c = cards3[0];                            
00007FFEDF0A4752  in          al,dx                       
00007FFEDF0A4753  sub         byte ptr [rax-48h],cl       
00007FFEDF0A4756  add         byte ptr [rax],0B0h         
00007FFEDF0A4759  out         72h,al                      
00007FFEDF0A475B  add         al,byte ptr [rax]           
00007FFEDF0A475D  add         byte ptr [rax-75h],cl       
00007FFEDF0A4760  add         byte ptr [rbx+76000878h],al 
00007FFEDF0A4766  adc         al,48h                      
00007FFEDF0A4768  add         eax,10h                     
00007FFEDF0A476B  movzx       edx,byte ptr [rax]          ; !!!
00007FFEDF0A476E  movzx       eax,byte ptr [rax+1]        ; !!!
            return c.C0 - c.C1;                           
00007FFEDF0A4772  sub         edx,eax                     ; !!!
00007FFEDF0A4774  mov         eax,edx                     ; !!!
00007FFEDF0A4776  add         rsp,28h                     
00007FFEDF0A477A  ret                                     
00007FFEDF0A477B  call        00007FFF3EB57BE0            
00007FFEDF0A4780  int         3                           
; Run8                                                    
            var c = cards8[0];                            
00007FFEDF0B49A2  in          al,dx                       
00007FFEDF0B49A3  sub         byte ptr [rbx],dh           
00007FFEDF0B49A5  ror         byte ptr [rax-77h],44h      
00007FFEDF0B49A9  and         al,20h                      
00007FFEDF0B49AB  mov         rax,202902D0088h            
00007FFEDF0B49B5  mov         rax,qword ptr [rax]         
00007FFEDF0B49B8  cmp         dword ptr [rax+8],0         
00007FFEDF0B49BC  jbe         00007FFEDF0B49D8            
00007FFEDF0B49BE  mov         rax,qword ptr [rax+10h]     ; !!!
00007FFEDF0B49C2  mov         qword ptr [rsp+20h],rax     ; !!!
            return c.C0 - c.C1;                           
00007FFEDF0B49C7  movzx       eax,byte ptr [rsp+20h]      ; !!!
00007FFEDF0B49CC  movzx       edx,byte ptr [rsp+21h]      ; !!!
00007FFEDF0B49D1  sub         eax,edx                     
00007FFEDF0B49D3  add         rsp,28h                     
00007FFEDF0B49D7  ret                                     
00007FFEDF0B49D8  call        00007FFF3EB57BE0            
00007FFEDF0B49DD  int         3                           

As you can see, in the Run3 case RyuJIT loads the target bytes (C0, C1) straight from the array element into the edx and eax registers. In the Run8 case it first copies the whole 8-byte struct to the stack (qword ptr [rsp+20h]) and then reloads C0 and C1 from there. Why? The extra store and reload may slightly degrade the performance of an application (see these benchmarks).
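Until larger structs are promoted, a possible source-level workaround is to avoid the struct-typed local altogether. The sketch below is an assumption on my side, not something I have checked in the disassembly: the class name Cards8Workarounds, the methods RunRef and RunFields, and the sample values 7 and 2 are made up for illustration. RunRef needs a C# 7.0 or later compiler for the ref local. All three methods return the same value; whether the JIT actually generates better code for the last two has to be measured.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct Cards8
{
    public byte C0, C1, C2, C3, C4, C5, C6, C7;
}

public static class Cards8Workarounds
{
    // Sample data (hypothetical values chosen for the demo).
    private static readonly Cards8[] cards8 = { new Cards8 { C0 = 7, C1 = 2 } };

    // Shape from the issue: the whole 8-field struct is copied into a local.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int RunCopy()
    {
        var c = cards8[0];
        return c.C0 - c.C1;
    }

    // Variant 1: a ref local aliases the array element, so the source
    // contains no struct copy at all.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int RunRef()
    {
        ref Cards8 c = ref cards8[0];
        return c.C0 - c.C1;
    }

    // Variant 2: read the two fields directly from the array element.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int RunFields()
    {
        Cards8[] cards = cards8;
        return cards[0].C0 - cards[0].C1;
    }

    public static void Main()
    {
        Console.WriteLine(RunCopy());   // 5
        Console.WriteLine(RunRef());    // 5
        Console.WriteLine(RunFields()); // 5
    }
}
```

This only sidesteps the local copy in the source; it does not change the promotion limit this issue is about.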

category:cq
theme:structs
skill-level:expert
cost:large
impact:large

Labels

Priority:2 (Work that is important, but not critical for the release)
area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
enhancement (Product code improvement that does NOT require public API changes/additions)
optimization
tenet-performance (Performance related issue)