[RISC-V] Optimize comparisons #115039
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
RISC-V Release-CLR-VF2: 9696 / 9751 (99.44%)
Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz
RISC-V Release-FX-QEMU: 701742 / 729040 (96.26%)
Release-FX-QEMU.md, Release-FX-QEMU.xml, testfx_output.tar.gz
RISC-V Release-CLR-QEMU: 9703 / 9751 (99.51%)
Release-CLR-QEMU.md, Release-CLR-QEMU.xml, testclr_output.tar.gz
No regressions.
Diffs are based on 170,304 contexts (22,709 MinOpts, 147,595 FullOpts).
MISSED contexts: 1,281 (0.75%)
Overall (-240,156 bytes)
MinOpts (-20,236 bytes)
FullOpts (-219,920 bytes)
Example diffs
linux.riscv64.Checked.mch
-16 (-23.53%) : 126197.dasm - TestBitwiseClearShift.Program:Bic(uint,uint,uint):ubyte (FullOpts)
@@ -24,21 +24,17 @@ G_M29992_IG01: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
G_M29992_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
not a1, a1
and a0, a1, a0
- slliw a0, a0, 0
- slli ra, a0, 32
- srli ra, ra, 32
- slli a1, a2, 32
- srli a1, a1, 32
- xor a0, ra, a1
+ sext.w a0, a0
+ subw a0, a0, a2
sltiu a0, a0, 1
- ;; size=36 bbWeight=1 PerfScore 4.50
+ ;; size=20 bbWeight=1 PerfScore 2.50
G_M29992_IG03: ; bbWeight=1, epilog, nogc, extend
ld ra, 8(sp)
ld fp, 0(sp)
addi sp, sp, 16
ret
;; size=16 bbWeight=1 PerfScore 7.50
-; Total bytes of code 68, prolog size 16, PerfScore 21.00, instruction count 17, allocated bytes for code 68 (MethodHash=da348ad7) for method TestBitwiseClearShift.Program:Bic(uint,uint,uint):ubyte (FullOpts)
+; Total bytes of code 52, prolog size 16, PerfScore 19.00, instruction count 13, allocated bytes for code 52 (MethodHash=da348ad7) for method TestBitwiseClearShift.Program:Bic(uint,uint,uint):ubyte (FullOpts)
; ============================================================
Unwind Info:
@@ -49,7 +45,7 @@ Unwind Info:
E bit : 0
X bit : 0
Vers : 0
- Function Length : 17 (0x00011) Actual length = 68 (0x000044)
+ Function Length : 13 (0x0000d) Actual length = 52 (0x000034)
---- Epilog scopes ----
---- Scope 0
Epilog Start Offset : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-16 (-22.22%) : 126198.dasm - TestBitwiseClearShift.Program:BicLSL(uint,uint,uint):ubyte (FullOpts)
@@ -25,21 +25,17 @@ G_M1083_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
slliw a1, a1, 4
not a1, a1
and a0, a1, a0
- slliw a0, a0, 0
- slli ra, a0, 32
- srli ra, ra, 32
- slli a1, a2, 32
- srli a1, a1, 32
- xor a0, ra, a1
+ sext.w a0, a0
+ subw a0, a0, a2
sltiu a0, a0, 1
- ;; size=40 bbWeight=1 PerfScore 5.00
+ ;; size=24 bbWeight=1 PerfScore 3.00
G_M1083_IG03: ; bbWeight=1, epilog, nogc, extend
ld ra, 8(sp)
ld fp, 0(sp)
addi sp, sp, 16
ret
;; size=16 bbWeight=1 PerfScore 7.50
-; Total bytes of code 72, prolog size 16, PerfScore 21.50, instruction count 18, allocated bytes for code 72 (MethodHash=a71cfbc4) for method TestBitwiseClearShift.Program:BicLSL(uint,uint,uint):ubyte (FullOpts)
+; Total bytes of code 56, prolog size 16, PerfScore 19.50, instruction count 14, allocated bytes for code 56 (MethodHash=a71cfbc4) for method TestBitwiseClearShift.Program:BicLSL(uint,uint,uint):ubyte (FullOpts)
; ============================================================
Unwind Info:
@@ -50,7 +46,7 @@ Unwind Info:
E bit : 0
X bit : 0
Vers : 0
- Function Length : 18 (0x00012) Actual length = 72 (0x000048)
+ Function Length : 14 (0x0000e) Actual length = 56 (0x000038)
---- Epilog scopes ----
---- Scope 0
Epilog Start Offset : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-16 (-22.22%) : 126199.dasm - TestBitwiseClearShift.Program:BicLSR(uint,uint,uint):ubyte (FullOpts)
@@ -25,21 +25,17 @@ G_M52261_IG02: ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
srliw a1, a1, 12
not a1, a1
and a0, a1, a0
- slliw a0, a0, 0
- slli ra, a0, 32
- srli ra, ra, 32
- slli a1, a2, 32
- srli a1, a1, 32
- xor a0, ra, a1
+ sext.w a0, a0
+ subw a0, a0, a2
sltiu a0, a0, 1
- ;; size=40 bbWeight=1 PerfScore 5.00
+ ;; size=24 bbWeight=1 PerfScore 3.00
G_M52261_IG03: ; bbWeight=1, epilog, nogc, extend
ld ra, 8(sp)
ld fp, 0(sp)
addi sp, sp, 16
ret
;; size=16 bbWeight=1 PerfScore 7.50
-; Total bytes of code 72, prolog size 16, PerfScore 21.50, instruction count 18, allocated bytes for code 72 (MethodHash=9d9d33da) for method TestBitwiseClearShift.Program:BicLSR(uint,uint,uint):ubyte (FullOpts)
+; Total bytes of code 56, prolog size 16, PerfScore 19.50, instruction count 14, allocated bytes for code 56 (MethodHash=9d9d33da) for method TestBitwiseClearShift.Program:BicLSR(uint,uint,uint):ubyte (FullOpts)
; ============================================================
Unwind Info:
@@ -50,7 +46,7 @@ Unwind Info:
E bit : 0
X bit : 0
Vers : 0
- Function Length : 18 (0x00012) Actual length = 72 (0x000048)
+ Function Length : 14 (0x0000e) Actual length = 56 (0x000038)
---- Epilog scopes ----
---- Scope 0
Epilog Start Offset : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+0 (0.00%) : 171536.dasm - Generated442:StructConstrainedInterfaceCallsTest() (FullOpts)
No diffs found?
+0 (0.00%) : 171472.dasm - ValueNumberingCheckedCastsOfConstants:g__ConfirmUInt64OneDecrementUnderUInt64MaxValueCastToUInt32Overflows|97_24() (FullOpts)
No diffs found?
+0 (0.00%) : 171440.dasm - ValueNumberingCheckedCastsOfConstants:g__ConfirmUInt32MaxValueCastToInt32Overflows|96_17() (FullOpts)
No diffs found?
Details
Size improvements/regressions per collection
PerfScore improvements/regressions per collection
Context information
jit-analyze output
553be87 is being scheduled for building and testing
RISC-V Release-CLR-QEMU: 9063 / 9093 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-FX-QEMU: 284562 / 285637 (99.62%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-CLR-VF2: 9063 / 9093 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-FX-VF2: 511724 / 513391 (99.68%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
Conflicts: src/coreclr/jit/codegenriscv64.cpp
2512f2e to 9c6721c
RISC-V Release-CLR-QEMU: 9072 / 9102 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-CLR-VF2: 9071 / 9101 (99.67%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-FX-QEMU: 284364 / 285427 (99.63%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
RISC-V Release-FX-VF2: 306401 / 308068 (99.46%)
report.xml, report.md, failures.xml, testclr_details.tar.zst
@dotnet/jit-contrib could you review?
This is triggering errors in outerloop runs because of the test name:
Codegen-level optimizations for comparisons; no inter-node analysis yet.
Part of #84834, cc @dotnet/samsung
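As a reading aid for the example diffs above: a minimal C sketch of why the shorter sequence (sext.w, subw, sltiu) gives the same result as the longer one (slli/srli zero-extension of both operands, xor, sltiu) for a 32-bit equality test. This is not JIT code. All helper names (sext_w, subw, sltiu, eq_old, eq_new, check_samples) are hypothetical models of the instructions shown in the diffs, with 64-bit registers represented as uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Model of `sext.w rd, rs`: sign-extend the low 32 bits to 64 bits. */
static uint64_t sext_w(uint64_t x) {
    uint64_t lo = x & UINT64_C(0xFFFFFFFF);
    return (lo ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
}

/* Model of `subw rd, rs1, rs2`: 32-bit subtract, result sign-extended. */
static uint64_t subw(uint64_t a, uint64_t b) {
    return sext_w(a - b);
}

/* Model of `sltiu rd, rs, imm`: rd = (rs <u imm). With imm = 1 this is rs == 0. */
static uint64_t sltiu(uint64_t a, uint64_t imm) {
    return a < imm ? 1 : 0;
}

/* Old sequence: zero-extend both operands, xor, then test for zero. */
static uint64_t eq_old(uint64_t a, uint64_t b) {
    uint64_t za = (sext_w(a) << 32) >> 32; /* slliw 0; slli 32; srli 32 */
    uint64_t zb = (b << 32) >> 32;         /* slli 32; srli 32 */
    return sltiu(za ^ zb, 1);
}

/* New sequence: sext.w, subw, then test for zero. */
static uint64_t eq_new(uint64_t a, uint64_t b) {
    return sltiu(subw(sext_w(a), b), 1);
}

/* Compare both models on a few values, including garbage in the upper 32 bits. */
static void check_samples(void) {
    const uint64_t v[] = {
        0, 1, 5, UINT64_C(0x7FFFFFFF), UINT64_C(0x80000000),
        UINT64_C(0xFFFFFFFF), UINT64_C(0xFFFFFFFF80000000),
        UINT64_C(0x1234567800000005)
    };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(eq_old(v[i], v[j]) == eq_new(v[i], v[j]));
}
```

Calling check_samples() from a main function exercises both models. The point of the sketch is that subw only looks at the low 32 bits of its inputs, so the difference is zero exactly when the two 32-bit values are equal, and the explicit zero-extension of each operand is not needed.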