[RISC-V] Optimize comparisons #115039

Merged: 9 commits, May 27, 2025

@tomeksowi tomeksowi commented Apr 25, 2025

Codegen-level optimizations only; there is no inter-node analysis yet.

Part of #84834, cc @dotnet/samsung

@ghost added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Apr 25, 2025
@dotnet-policy-service bot added the community-contribution label (Indicates that the PR has been added by a community member) Apr 25, 2025

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


risc-vv commented Apr 25, 2025

RISC-V Release-CLR-VF2: 9696 / 9751 (99.44%)

=======================
      passed: 9696
      failed: 33
     skipped: 70
      killed: 22
------------------------
  TOTAL libs: 9821
 TOTAL tests: 9821
   REAL time: 2h 9min 29s 806ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: 9bfe1ba3f8b4fb04405981ad0672990484bad026
CI: 2d916d20de463f9bba05ae71b3d1f37d439a8cb1
REPO: tomeksowi/runtime
BRANCH: compare-cleanup
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-QEMU: 701742 / 729040 (96.26%)

=======================
      passed: 701742
      failed: 1673
     skipped: 1533
      killed: 25625
------------------------
  TOTAL libs: 259
 TOTAL tests: 730573
   REAL time: 2h 30min 24s 987ms
=======================

Release-FX-QEMU.md, Release-FX-QEMU.xml, testfx_output.tar.gz

Build information and commands

GIT: 9bfe1ba3f8b4fb04405981ad0672990484bad026
CI: 2d916d20de463f9bba05ae71b3d1f37d439a8cb1
REPO: tomeksowi/runtime
BRANCH: compare-cleanup
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-QEMU: 9703 / 9751 (99.51%)

=======================
      passed: 9703
      failed: 28
     skipped: 70
      killed: 20
------------------------
  TOTAL libs: 9821
 TOTAL tests: 9821
   REAL time: 3h 22min 31s 361ms
=======================

Release-CLR-QEMU.md, Release-CLR-QEMU.xml, testclr_output.tar.gz

Build information and commands

GIT: 9bfe1ba3f8b4fb04405981ad0672990484bad026
CI: 2d916d20de463f9bba05ae71b3d1f37d439a8cb1
REPO: tomeksowi/runtime
BRANCH: compare-cleanup
CONFIG: Release
LIB_CONFIG: Release

@tomeksowi (Contributor, Author) commented

No regressions.

Diffs are based on 170,304 contexts (22,709 MinOpts, 147,595 FullOpts).

MISSED contexts: 1,281 (0.75%)

Overall (-240,156 bytes)

| Collection | Base size (bytes) | Diff size (bytes) | PerfScore in Diffs |
|---|---|---|---|
| linux.riscv64.Checked.mch | 115,677,936 | -240,156 | -0.39% |

MinOpts (-20,236 bytes)

| Collection | Base size (bytes) | Diff size (bytes) | PerfScore in Diffs |
|---|---|---|---|
| linux.riscv64.Checked.mch | 45,050,272 | -20,236 | -0.06% |

FullOpts (-219,920 bytes)

| Collection | Base size (bytes) | Diff size (bytes) | PerfScore in Diffs |
|---|---|---|---|
| linux.riscv64.Checked.mch | 70,627,664 | -219,920 | -0.47% |

Example diffs
linux.riscv64.Checked.mch
-16 (-23.53%) : 126197.dasm - TestBitwiseClearShift.Program:Bic(uint,uint,uint):ubyte (FullOpts)
@@ -24,21 +24,17 @@ G_M29992_IG01:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
 G_M29992_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             not            a1, a1
             and            a0, a1, a0
-            slliw          a0, a0, 0
-            slli           ra, a0, 32
-            srli           ra, ra, 32
-            slli           a1, a2, 32
-            srli           a1, a1, 32
-            xor            a0, ra, a1
+            sext.w         a0, a0
+            subw           a0, a0, a2
             sltiu          a0, a0, 1
-						;; size=36 bbWeight=1 PerfScore 4.50
+						;; size=20 bbWeight=1 PerfScore 2.50
 G_M29992_IG03:        ; bbWeight=1, epilog, nogc, extend
             ld             ra, 8(sp)
             ld             fp, 0(sp)
             addi           sp, sp, 16
             ret						;; size=16 bbWeight=1 PerfScore 7.50
 
-; Total bytes of code 68, prolog size 16, PerfScore 21.00, instruction count 17, allocated bytes for code 68 (MethodHash=da348ad7) for method TestBitwiseClearShift.Program:Bic(uint,uint,uint):ubyte (FullOpts)
+; Total bytes of code 52, prolog size 16, PerfScore 19.00, instruction count 13, allocated bytes for code 52 (MethodHash=da348ad7) for method TestBitwiseClearShift.Program:Bic(uint,uint,uint):ubyte (FullOpts)
 ; ============================================================
 
 Unwind Info:
@@ -49,7 +45,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 17 (0x00011) Actual length = 68 (0x000044)
+  Function Length   : 13 (0x0000d) Actual length = 52 (0x000034)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-16 (-22.22%) : 126198.dasm - TestBitwiseClearShift.Program:BicLSL(uint,uint,uint):ubyte (FullOpts)
@@ -25,21 +25,17 @@ G_M1083_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             slliw          a1, a1, 4
             not            a1, a1
             and            a0, a1, a0
-            slliw          a0, a0, 0
-            slli           ra, a0, 32
-            srli           ra, ra, 32
-            slli           a1, a2, 32
-            srli           a1, a1, 32
-            xor            a0, ra, a1
+            sext.w         a0, a0
+            subw           a0, a0, a2
             sltiu          a0, a0, 1
-						;; size=40 bbWeight=1 PerfScore 5.00
+						;; size=24 bbWeight=1 PerfScore 3.00
 G_M1083_IG03:        ; bbWeight=1, epilog, nogc, extend
             ld             ra, 8(sp)
             ld             fp, 0(sp)
             addi           sp, sp, 16
             ret						;; size=16 bbWeight=1 PerfScore 7.50
 
-; Total bytes of code 72, prolog size 16, PerfScore 21.50, instruction count 18, allocated bytes for code 72 (MethodHash=a71cfbc4) for method TestBitwiseClearShift.Program:BicLSL(uint,uint,uint):ubyte (FullOpts)
+; Total bytes of code 56, prolog size 16, PerfScore 19.50, instruction count 14, allocated bytes for code 56 (MethodHash=a71cfbc4) for method TestBitwiseClearShift.Program:BicLSL(uint,uint,uint):ubyte (FullOpts)
 ; ============================================================
 
 Unwind Info:
@@ -50,7 +46,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 18 (0x00012) Actual length = 72 (0x000048)
+  Function Length   : 14 (0x0000e) Actual length = 56 (0x000038)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-16 (-22.22%) : 126199.dasm - TestBitwiseClearShift.Program:BicLSR(uint,uint,uint):ubyte (FullOpts)
@@ -25,21 +25,17 @@ G_M52261_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             srliw          a1, a1, 12
             not            a1, a1
             and            a0, a1, a0
-            slliw          a0, a0, 0
-            slli           ra, a0, 32
-            srli           ra, ra, 32
-            slli           a1, a2, 32
-            srli           a1, a1, 32
-            xor            a0, ra, a1
+            sext.w         a0, a0
+            subw           a0, a0, a2
             sltiu          a0, a0, 1
-						;; size=40 bbWeight=1 PerfScore 5.00
+						;; size=24 bbWeight=1 PerfScore 3.00
 G_M52261_IG03:        ; bbWeight=1, epilog, nogc, extend
             ld             ra, 8(sp)
             ld             fp, 0(sp)
             addi           sp, sp, 16
             ret						;; size=16 bbWeight=1 PerfScore 7.50
 
-; Total bytes of code 72, prolog size 16, PerfScore 21.50, instruction count 18, allocated bytes for code 72 (MethodHash=9d9d33da) for method TestBitwiseClearShift.Program:BicLSR(uint,uint,uint):ubyte (FullOpts)
+; Total bytes of code 56, prolog size 16, PerfScore 19.50, instruction count 14, allocated bytes for code 56 (MethodHash=9d9d33da) for method TestBitwiseClearShift.Program:BicLSR(uint,uint,uint):ubyte (FullOpts)
 ; ============================================================
 
 Unwind Info:
@@ -50,7 +46,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 18 (0x00012) Actual length = 72 (0x000048)
+  Function Length   : 14 (0x0000e) Actual length = 56 (0x000038)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+0 (0.00%) : 171536.dasm - Generated442:StructConstrainedInterfaceCallsTest() (FullOpts)

No diffs found?

+0 (0.00%) : 171472.dasm - ValueNumberingCheckedCastsOfConstants:g__ConfirmUInt64OneDecrementUnderUInt64MaxValueCastToUInt32Overflows|97_24() (FullOpts)

No diffs found?

+0 (0.00%) : 171440.dasm - ValueNumberingCheckedCastsOfConstants:g__ConfirmUInt32MaxValueCastToInt32Overflows|96_17() (FullOpts)

No diffs found?
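
For readers checking the three `Bic*` diffs above: both instruction sequences compute a 32-bit equality test, and the new one relies on `subw` only looking at the low 32 bits of its operands, so the explicit zero-extensions are unnecessary. Below is a rough Python model of the two sequences, not JIT code. The helper names (`sext_w`, `zext_w`, `sltiu_1`, `eq_before`, `eq_after`) are made up for illustration, and registers are modeled as unsigned 64-bit integers.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sext_w(x):
    """Model sext.w (or slliw rd, rs, 0): sign-extend the low 32 bits to 64 bits."""
    x &= MASK32
    return x | (MASK64 ^ MASK32) if x >> 31 else x


def zext_w(x):
    """Model slli rd, rs, 32 then srli rd, rd, 32: zero-extend the low 32 bits."""
    return ((x << 32) & MASK64) >> 32


def sltiu_1(x):
    """Model sltiu rd, rs, 1: 1 if the unsigned 64-bit value is zero, else 0."""
    return 1 if (x & MASK64) < 1 else 0


def eq_before(a0, a2):
    """Sequence on the '-' side of the diff."""
    a0 = sext_w(a0)          # slliw a0, a0, 0
    ra = zext_w(a0)          # slli ra, a0, 32 ; srli ra, ra, 32
    a1 = zext_w(a2)          # slli a1, a2, 32 ; srli a1, a1, 32
    return sltiu_1(ra ^ a1)  # xor a0, ra, a1 ; sltiu a0, a0, 1


def eq_after(a0, a2):
    """Sequence on the '+' side of the diff."""
    a0 = sext_w(a0)          # sext.w a0, a0
    a0 = sext_w(a0 - a2)     # subw a0, a0, a2 (32-bit subtract, sign-extended)
    return sltiu_1(a0)       # sltiu a0, a0, 1


# Spot check on a few register values, including ones with non-zero upper bits.
SAMPLES = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF,
           0xFFFFFFFF80000000, 0x100000001]
for a in SAMPLES:
    for b in SAMPLES:
        assert eq_before(a, b) == eq_after(a, b)
```

This is only a spot check on a handful of values, not a proof, but it matches the intuition that the 32-bit difference is zero exactly when the low 32 bits of both operands agree.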

Details

Size improvements/regressions per collection

| Collection | Contexts with diffs | Improvements | Regressions | Same size | Improvements (bytes) | Regressions (bytes) |
|---|---|---|---|---|---|---|
| linux.riscv64.Checked.mch | 23,204 | 10,757 | 0 | 12,447 | -240,156 | +0 |

PerfScore improvements/regressions per collection

| Collection | Contexts with diffs | Improvements | Regressions | Same PerfScore | Improvements (PerfScore) | Regressions (PerfScore) | PerfScore Overall in FullOpts |
|---|---|---|---|---|---|---|---|
| linux.riscv64.Checked.mch | 23,204 | 10,670 | 0 | 12,534 | -0.84% | 0.00% | -0.0588% |

Context information

| Collection | Diffed contexts | MinOpts | FullOpts | Missed, base | Missed, diff |
|---|---|---|---|---|---|
| linux.riscv64.Checked.mch | 170,304 | 22,709 | 147,595 | 1,281 (0.75%) | 1,281 (0.75%) |

jit-analyze output

@am11 added the arch-riscv label (Related to the RISC-V architecture) Apr 25, 2025

risc-vv commented May 5, 2025

553be87 is being scheduled for building and testing

GIT: 553be8748e58d0bf3908b9f08f547e32fd027901
REPO: tomeksowi/runtime
BRANCH: compare-cleanup


risc-vv commented May 7, 2025

RISC-V Release-CLR-QEMU: 9063 / 9093 (99.67%)
=======================
      passed: 9063
      failed: 2
     skipped: 597
      killed: 28
------------------------
 TOTAL tests: 9690
VIRTUAL time: 29h 34min 7s 380ms
   REAL time: 42min 50s 344ms
=======================

report.xml, report.md, failures.xml, testclr_details.tar.zst

Build information and commands

GIT: 2512f2e464451896cea138815f4d2cf46721f0a6
CI: fb90762a15dc605159873a3c1988381b6a288350
REPO: tomeksowi/runtime
BRANCH: compare-cleanup
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-QEMU: 284562 / 285637 (99.62%)
=======================
      passed: 284562
      failed: 1069
     skipped: 38
      killed: 6
------------------------
 TOTAL tests: 285675
VIRTUAL time: 28h 11min 31s 692ms
   REAL time: 1h 11min 37s 129ms
=======================

report.xml, report.md, failures.xml, testclr_details.tar.zst

Build information and commands

GIT: 2512f2e464451896cea138815f4d2cf46721f0a6
CI: fb90762a15dc605159873a3c1988381b6a288350
REPO: tomeksowi/runtime
BRANCH: compare-cleanup
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-VF2: 9063 / 9093 (99.67%)
=======================
      passed: 9063
      failed: 2
     skipped: 597
      killed: 28
------------------------
 TOTAL tests: 9690
VIRTUAL time: 10h 58min 43s 841ms
   REAL time: 48min 41s 304ms
=======================

report.xml, report.md, failures.xml, testclr_details.tar.zst

Build information and commands

GIT: 2512f2e464451896cea138815f4d2cf46721f0a6
CI: fb90762a15dc605159873a3c1988381b6a288350
REPO: tomeksowi/runtime
BRANCH: compare-cleanup
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-VF2: 511724 / 513391 (99.68%)
=======================
      passed: 511724
      failed: 1659
     skipped: 38
      killed: 8
------------------------
 TOTAL tests: 513429
VIRTUAL time: 20h 4min 57s 892ms
   REAL time: 2h 8min 54s 982ms
=======================

report.xml, report.md, failures.xml, testclr_details.tar.zst

Build information and commands

GIT: 2512f2e464451896cea138815f4d2cf46721f0a6
CI: fb90762a15dc605159873a3c1988381b6a288350
REPO: tomeksowi/runtime
BRANCH: compare-cleanup
CONFIG: Release
LIB_CONFIG: Release

@tomeksowi tomeksowi marked this pull request as ready for review May 7, 2025 15:28
tomeksowi added 2 commits May 16, 2025 11:36
Conflicts:
	src/coreclr/jit/codegenriscv64.cpp

risc-vv commented May 16, 2025

RISC-V Release-CLR-QEMU: 9072 / 9102 (99.67%)
=======================
      passed: 9072
      failed: 2
     skipped: 597
      killed: 28
------------------------
 TOTAL tests: 9699
VIRTUAL time: 29h 32min 37s 973ms
   REAL time: 42min 19s 191ms
=======================

report.xml, report.md, failures.xml, testclr_details.tar.zst

Build information and commands

GIT: 9c6721c4beacff9bac32599184787bfec7038ae0
CI: 85a71e207aad1e1aa4a72cc7f52d713d9cc79191
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-VF2: 9071 / 9101 (99.67%)
=======================
      passed: 9071
      failed: 2
     skipped: 597
      killed: 28
------------------------
 TOTAL tests: 9698
VIRTUAL time: 11h 10min 24s 279ms
   REAL time: 49min 18s 162ms
=======================

report.xml, report.md, failures.xml, testclr_details.tar.zst

Build information and commands

GIT: 9c6721c4beacff9bac32599184787bfec7038ae0
CI: 85a71e207aad1e1aa4a72cc7f52d713d9cc79191
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-QEMU: 284364 / 285427 (99.63%)
=======================
      passed: 284364
      failed: 1057
     skipped: 38
      killed: 6
------------------------
 TOTAL tests: 285465
VIRTUAL time: 27h 58min 7s 562ms
   REAL time: 1h 10min 27s 532ms
=======================

report.xml, report.md, failures.xml, testclr_details.tar.zst

Build information and commands

GIT: 9c6721c4beacff9bac32599184787bfec7038ae0
CI: 85a71e207aad1e1aa4a72cc7f52d713d9cc79191
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-VF2: 306401 / 308068 (99.46%)
=======================
      passed: 306401
      failed: 1659
     skipped: 38
      killed: 8
------------------------
 TOTAL tests: 308106
VIRTUAL time: 18h 34min 48s 575ms
   REAL time: 2h 14min 21s 126ms
=======================

report.xml, report.md, failures.xml, testclr_details.tar.zst

Build information and commands

GIT: 9c6721c4beacff9bac32599184787bfec7038ae0
CI: 85a71e207aad1e1aa4a72cc7f52d713d9cc79191
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

@tomeksowi (Contributor, Author) commented

@dotnet/jit-contrib could you review?

@clamp03 clamp03 requested a review from jakobbotsch May 26, 2025 02:46
@jakobbotsch jakobbotsch merged commit ec351aa into dotnet:main May 27, 2025
115 of 119 checks passed
@filipnavara (Member) commented

This is triggering errors in outerloop runs because of the test name:

/__w/1/s/src/tests/Common/mergedrunner.targets(18,5): error : This project has an assembly name identical to another project, if this CoreCLRTestLibrary, you should reference $(TestLibraryProjectPath) instead of constructing the path yourself: /__w/1/s/src/tests/JIT/Directed/shift/int32_r.csproj [/__w/1/s/src/tests/JIT/Directed/Directed_2.csproj] [/__w/1/s/src/tests/build.proj]
/__w/1/s/src/tests/Common/mergedrunner.targets(18,5): error : This project has an assembly name identical to another project, if this CoreCLRTestLibrary, you should reference $(TestLibraryProjectPath) instead of constructing the path yourself: /__w/1/s/src/tests/JIT/Directed/shift/int32_ro.csproj [/__w/1/s/src/tests/JIT/Directed/Directed_2.csproj] [/__w/1/s/src/tests/build.proj]

Labels: arch-riscv, area-CodeGen-coreclr, community-contribution
6 participants