Minimal fix for Issue 620 #40535

briansull · 2020-08-07T16:16:40Z

Turns a GT_IND(GT_ADDR(GT_LCL_VAR) into a GT_LCL_FLD when necessary for correctness.

Added test case: Runtime_620.cs
Added lvForceLoadNormalize

briansull · 2020-08-07T16:16:58Z

Fixes #620

briansull · 2020-08-07T16:19:01Z

Crossgen CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for x64 default jit
Total bytes of diff: 574 (0.00% of base) - diff is a regression.
9 total files with Code Size differences (0 improved, 9 regressed), 257 unchanged.
Top method regressions (bytes):
          48 ( 2.76% of base) : System.Private.CoreLib.dasm - Enumerator:MoveNext():bool:this (7 methods)
          27 ( 2.60% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.Symbols.SourceMethodSymbol:EarlyDecodeWellKnownAttribute(byref):Microsoft.CodeAnalysis.VisualBasic.Symbols.VisualBasicAttributeData:this
          24 ( 2.14% of base) : System.CodeDom.dasm - Microsoft.VisualBasic.VBCodeGenerator:QuoteSnippetString(System.String):System.String:this
          24 (22.43% of base) : System.Private.CoreLib.dasm - System.Text.Rune:ToString():System.String:this
          22 ( 2.71% of base) : System.Private.CoreLib.dasm - System.Reflection.MdConstant:GetValue(System.Reflection.MetadataImport,int,System.RuntimeTypeHandle,bool):System.Object
          21 (12.07% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEParameterSymbol:get_HasCallerLineNumberAttribute():bool:this
          21 (350.00% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetInt16WithAllBitsSet():short
          20 (333.33% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetUInt16WithAllBitsSet():ushort
          20 (57.14% of base) : System.Private.CoreLib.dasm - System.Buffers.Binary.BinaryPrimitives:TryReadInt16BigEndian(System.ReadOnlySpan`1[Byte],byref):bool
          20 (16.39% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonReader:GetInt16WithQuotes():short:this
          19 (316.67% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetSByteWithAllBitsSet():byte
          19 (55.88% of base) : System.Private.CoreLib.dasm - System.Buffers.Binary.BinaryPrimitives:TryReadUInt16BigEndian(System.ReadOnlySpan`1[Byte],byref):bool
          19 (15.57% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonReader:GetSByteWithQuotes():byte:this
          19 (15.70% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonReader:GetUInt16WithQuotes():ushort:this
          18 (10.47% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEParameterSymbol:get_HasCallerFilePathAttribute():bool:this
          18 (10.47% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEParameterSymbol:get_HasCallerMemberNameAttribute():bool:this
          18 (300.00% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetByteWithAllBitsSet():ubyte
          18 (72.00% of base) : System.Private.CoreLib.dasm - System.Span`1[SByte][System.SByte]:Fill(byte):this
          18 (15.25% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonReader:GetByteWithQuotes():ubyte:this
          15 ( 7.77% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEParameterSymbol:get_IsCallerLineNumber():bool:this
          15 ( 7.28% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEParameterSymbol:get_IsCallerFilePath():bool:this
          15 ( 6.85% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEParameterSymbol:get_IsCallerMemberName():bool:this
          15 ( 0.50% of base) : System.Private.Xml.dasm - System.Xml.XmlConvert:DecodeName(System.String):System.String

Top method regressions (percentages):
          21 (350.00% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetInt16WithAllBitsSet():short
          20 (333.33% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetUInt16WithAllBitsSet():ushort
          19 (316.67% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetSByteWithAllBitsSet():byte
          18 (300.00% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetByteWithAllBitsSet():ubyte
          18 (72.00% of base) : System.Private.CoreLib.dasm - System.Span`1[SByte][System.SByte]:Fill(byte):this
          20 (57.14% of base) : System.Private.CoreLib.dasm - System.Buffers.Binary.BinaryPrimitives:TryReadInt16BigEndian(System.ReadOnlySpan`1[Byte],byref):bool
          19 (55.88% of base) : System.Private.CoreLib.dasm - System.Buffers.Binary.BinaryPrimitives:TryReadUInt16BigEndian(System.ReadOnlySpan`1[Byte],byref):bool
          24 (22.43% of base) : System.Private.CoreLib.dasm - System.Text.Rune:ToString():System.String:this
          20 (16.39% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonReader:GetInt16WithQuotes():short:this
          19 (15.70% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonReader:GetUInt16WithQuotes():ushort:this
          19 (15.57% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonReader:GetSByteWithQuotes():byte:this
          18 (15.25% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonReader:GetByteWithQuotes():ubyte:this

briansull · 2020-08-07T16:29:41Z

Sample regression:

Source code:

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static ushort GetUInt16WithAllBitsSet()
        {
            ushort value = 0;
            unsafe
            {
                unchecked
                {
                    *((ushort*)&value) = (ushort)0xffff;
                }
            }
            return value;
        }

Original:

; Assembly listing for method System.Numerics.ConstantHelper:GetUInt16WithAllBitsSet():ushort
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
;* V00 loc0         [V00,T00] (  0,  0   )  ushort  ->  zero-ref    ld-addr-op
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+0x00]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M13827_IG01:
						;; bbWeight=1    PerfScore 0.00
G_M13827_IG02:
       mov      eax, 0xFFFF
						;; bbWeight=1    PerfScore 0.25
G_M13827_IG03:
       ret      
						;; bbWeight=1    PerfScore 1.00

; Total bytes of code 6, prolog size 0, PerfScore 1.85, (MethodHash=ae96c9fc) for method System.Numerics.ConstantHelper:GetUInt16WithAllBitsSet():ushort

New version:

; Assembly listing for method System.Numerics.ConstantHelper:GetUInt16WithAllBitsSet():ushort
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
; ReadyToRun compilation
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 loc0         [V00,T00] (  3,  3   )  ushort  ->  [rsp+0x04]   do-not-enreg[F] ld-addr-op
;# V01 OutArgs      [V01    ] (  1,  1   )  lclBlk ( 0) [rsp+0x00]   "OutgoingArgSpace"
;
; Lcl frame size = 8

G_M13827_IG01:
       push     rax
						;; bbWeight=1    PerfScore 1.00
G_M13827_IG02:
       xor      eax, eax
       mov      dword ptr [rsp+04H], eax
       mov      word  ptr [rsp+04H], 0xFFFF
       mov      eax, dword ptr [rsp+04H]
       movzx    rax, ax
						;; bbWeight=1    PerfScore 3.50
G_M13827_IG03:
       add      rsp, 8
       ret      
						;; bbWeight=1    PerfScore 1.25

; Total bytes of code 26, prolog size 1, PerfScore 8.35, (MethodHash=ae96c9fc) for method System.Numerics.ConstantHelper:GetUInt16WithAllBitsSet():ushort

Added test case: Runtime_620.cs Added lvForceLoadNormalize

briansull · 2020-08-07T17:27:44Z

@CarolEidt @dotnet/jit-contrib PTAL

briansull · 2020-08-07T17:45:10Z

runtime (Libraries Build Linux_musl x64 Debug) Failing after 14m

due to #40416

CarolEidt · 2020-08-07T18:49:51Z

src/coreclr/src/jit/morph.cpp

+                            //
+                            // For loads signed/unsigned differences do matter.
+                            //
+                            if (varTypeIsUnsigned(lvaTable[lclNum].lvType) == varTypeIsUnsigned(typ))


Should this be an else if? If not, I don't understand the "otherwise" comment above.

Yes, I will change this to an else if,
I reordered these two during a refactoring

A couple of additional methods had regressions:

21 (350.00% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetInt16WithAllBitsSet():short 19 (316.67% of base) : System.Private.CoreLib.dasm - System.Numerics.ConstantHelper:GetSByteWithAllBitsSet():byte

Additionally the BYTEMARK test IDEA also has a significant performace/codesize regression.
This existed before the change to "else if".

before: IDEA: Iterations/sec: 5542.04724 Index: 84.76411
after: IDEA: Iterations/sec: 3428.97638 Index: 52.44527

No other methods in the benchmarks had code diffs

Did you have a look at the IDEA regression?

Yes, it has many unsafe casts and is a awkward port of the C code:
The regression occurs in
private static void cipher_idea(byte[] xin, byte[] xout, int offset, char[] Z)
Where we inline the MUL and low16 methods, into the inner loop.
This forces x1, x4, t2 and t1 to become GT_LCL_FLD and thus memory accesses.

MUL(ref x1, Z[idx++]); MUL(ref x4, Z[idx++]); MUL(ref t2, Z[idx++]); MUL(ref t1, Z[idx++]); // These were macros in the original C code /* #define low16(x) ((x) & 0x0FFFF) */ private static char low16(int x) { return (char)((x) & 0x0FFFF); } /* #define MUL(x,y) (x=mul(low16(x),y)) */ private static void MUL(ref char x, char y) { x = mul(low16(x), y); }

My other fix does better with a much smaller regression:

before: IDEA: Iterations/sec: 5542.04724 Index: 84.76411 after: IDEA: Iterations/sec: 5244.63054 Index: 80.21521

If we can change/fix the source code that could be another option:

Original:

/* #define MUL(x,y) (x=mul(low16(x),y)) */ private static void MUL(ref char x, char y) { x = mul(low16(x), y); }

Modification:

/* #define MUL(x,y) (x=mul(low16(x),y)) */ private static char MUL(char x, char y) { return mul(low16(x), y); }

I tried with the above code modification and with the change all version produce the same Assembly code
and the execution times are about the same as the original version.

CarolEidt

LGTM - and thanks for adding the test!

CarolEidt · 2020-08-07T23:41:47Z

Thanks @briansull for the analysis. Would you be willing to open an issue to address the regression in .NET 6? I think it would make sense to add the optimization you'd proposed adding in your original fix, as well as possibly suggesting the improvement to the benchmark.

* Minimal fix for Issue 620 Added test case: Runtime_620.cs Added lvForceLoadNormalize * Changed conditional to "else if"

Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Aug 7, 2020

briansull mentioned this pull request Aug 7, 2020

Fix Issue 620 - Incorrect IND(ADDR(LCL_VAR)) folding when types do not match #40059

Closed

Minimal fix for Issue 620

43980f1

Added test case: Runtime_620.cs Added lvForceLoadNormalize

briansull force-pushed the orig_620 branch 2 times, most recently from bceadec to 43980f1 Compare August 7, 2020 17:09

CarolEidt reviewed Aug 7, 2020

View reviewed changes

Changed conditional to "else if"

b449313

CarolEidt approved these changes Aug 7, 2020

View reviewed changes

briansull merged commit e1494e8 into dotnet:master Aug 8, 2020

briansull mentioned this pull request Aug 8, 2020

Incorrect IND(ADDR(LCL_VAR)) folding when types do not match #620

Closed

briansull deleted the orig_620 branch August 8, 2020 19:59

Jacksondr5 pushed a commit to Jacksondr5/runtime that referenced this pull request Aug 10, 2020

Minimal fix for Issue 620 (dotnet#40535)

ed062dd

* Minimal fix for Issue 620 Added test case: Runtime_620.cs Added lvForceLoadNormalize * Changed conditional to "else if"

karelz added this to the 5.0.0 milestone Aug 18, 2020

This was referenced Aug 24, 2020

Alternative fix for folding of *(typ*)&lclVar for small types #40607 #40871

Merged

Test failure: System.Xml.Tests.SameInstanceXslTransformWriter.Variations(param0: \"xslt_mutith_attribute_sets.xsl\", param1: \"xslt_mutith_attribute_sets.xml\") #40607

Closed

AndyAyersMS mentioned this pull request Sep 2, 2020

Performance regression in ByteMark.BenchIDEAEncryption #41677

Closed

echesakov mentioned this pull request Sep 3, 2020

[release/5.0-rc2] Fix for folding of *(typ*)&lclVar for small types #41838

Merged

ghost locked as resolved and limited conversation to collaborators Dec 7, 2020

Uh oh!

Minimal fix for Issue 620 #40535

Minimal fix for Issue 620 #40535

Conversation

briansull commented Aug 7, 2020

Uh oh!

briansull commented Aug 7, 2020

Uh oh!

briansull commented Aug 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

briansull commented Aug 7, 2020

Uh oh!

briansull commented Aug 7, 2020

Uh oh!

briansull commented Aug 7, 2020

Uh oh!

CarolEidt Aug 7, 2020

Choose a reason for hiding this comment

Uh oh!

briansull Aug 7, 2020

Choose a reason for hiding this comment

Uh oh!

briansull Aug 7, 2020

Choose a reason for hiding this comment

Uh oh!

briansull Aug 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

briansull Aug 7, 2020

Choose a reason for hiding this comment

Uh oh!

CarolEidt Aug 7, 2020

Choose a reason for hiding this comment

Uh oh!

briansull Aug 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

briansull Aug 7, 2020

Choose a reason for hiding this comment

Uh oh!

briansull Aug 7, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

briansull Aug 7, 2020

Choose a reason for hiding this comment

Uh oh!

CarolEidt left a comment

Choose a reason for hiding this comment

Uh oh!

CarolEidt commented Aug 7, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

briansull commented Aug 7, 2020 •

edited

Loading

briansull Aug 7, 2020 •

edited

Loading

briansull Aug 7, 2020 •

edited

Loading

briansull Aug 7, 2020 •

edited

Loading