Conversation

@asb asb commented Jul 16, 2025

The benefit here is purely in reducing unnecessary diffs between RV32 and RV64, as RVC does have a compressed form of SUBW (so SUB isn't more compressible). This affects ~57.2k instructions in an rva22u64 build of llvm-test-suite with SPEC CPU 2017 included.


Note this PR also includes a trivial commit that adds a debug print and a stat counter to stripWSuffixes. I don't think this is worth a separate PR, but speak up if you have any comments on it.
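
As an aside on why this is sound (a minimal standalone sketch, not part of the patch): SUBW and SUB always agree on the low 32 bits of their result, so once hasAllWUsers proves that every user of the result reads only those bits, the W suffix can be dropped without changing behaviour.

  #include <cassert>
  #include <cstdint>

  // Model the two RV64 instructions on 64-bit register values: SUBW
  // subtracts the low 32 bits and sign-extends the 32-bit result, while
  // SUB performs a full 64-bit subtraction.
  uint64_t subw(uint64_t rs1, uint64_t rs2) {
    return (uint64_t)(int64_t)(int32_t)(uint32_t)(rs1 - rs2);
  }
  uint64_t sub(uint64_t rs1, uint64_t rs2) { return rs1 - rs2; }

  int main() {
    uint64_t x = 0x123456789abcdef0, y = 0x0fedcba987654321;
    // Bits 63:32 of the two results can differ...
    assert(subw(x, y) != sub(x, y));
    // ...but bits 31:0 never do, so a user that reads only the low 32
    // bits (e.g. srliw in the tests below) cannot tell the two apart.
    assert((uint32_t)subw(x, y) == (uint32_t)sub(x, y));
    return 0;
  }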

asb added 2 commits July 16, 2025 07:46
…trs extend debug printing

RISCVOptWInstrs had a NumTransformedToWInstrs statistic, but none for the
W=>Non-W transform done by stripWSuffixes, and it didn't print debug
output for the transformation. This patch addresses both issues.
The benefit here is purely in reducing unnecessary diffs between RV32 and
RV64, as RVC does have a compressed form of SUBW (so SUB isn't more
compressible). This affects ~57.2k instructions in an rva22u64 build of
llvm-test-suite with SPEC CPU 2017 included.
llvmbot commented Jul 16, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Alex Bradbury (asb)

Changes

The benefit here is purely in reducing unnecessary diffs between RV32 and RV64, as RVC does have a compressed form of SUBW (so SUB isn't more compressible). This affects ~57.2k instructions in an rva22u64 build of llvm-test-suite with SPEC CPU 2017 included.


Note this PR also includes a trivial commit that adds a debug print and a stat counter to stripWSuffixes. I don't think this is worth a separate PR, but speak up if you have any comments on it.


Patch is 165.97 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/149071.diff

53 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp (+8)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll (+5-5)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rotl-rotr.ll (+58-58)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb-zbkb.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll (+13-13)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/shifts.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/wide-scalar-shift-by-byte-multiple-legalization.ll (+66-66)
  • (modified) llvm/test/CodeGen/RISCV/abds-neg.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/abds.ll (+31-63)
  • (modified) llvm/test/CodeGen/RISCV/addimm-mulimm.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/aext-to-sext.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/atomicrmw-cond-sub-clamp.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/atomicrmw-uinc-udec-wrap.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/ctlz-cttz-ctpop.ll (+20-20)
  • (modified) llvm/test/CodeGen/RISCV/ctz_zero_return_test.ll (+48-48)
  • (modified) llvm/test/CodeGen/RISCV/div-by-constant.ll (+10-10)
  • (modified) llvm/test/CodeGen/RISCV/iabs.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/intrinsic-cttz-elts-vscale.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/intrinsic-cttz-elts.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/machine-combiner.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/mul.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/neg-abs.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/overflow-intrinsics.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/pr145360.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rotl-rotr.ll (+48-48)
  • (modified) llvm/test/CodeGen/RISCV/rv64i-demanded-bits.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/rv64i-exhaustive-w-insts.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/rv64i-w-insts-legalization.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/rv64xtheadbb.ll (+24-24)
  • (modified) llvm/test/CodeGen/RISCV/rv64zba.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/rv64zbb-zbkb.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rv64zbb.ll (+30-30)
  • (modified) llvm/test/CodeGen/RISCV/rvv/expand-no-v.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fpclamptosat_vec.ll (+40-40)
  • (modified) llvm/test/CodeGen/RISCV/rvv/known-never-zero.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/sink-splat-operands.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vandn-sdnode.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vec3-setcc-crash.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vp-vector-interleaved-access.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vrol-sdnode.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vror-sdnode.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vscale-power-of-two.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/select.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/sextw-removal.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/shifts.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/shl-cttz.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/srem-vector-lkk.ll (+26-26)
  • (modified) llvm/test/CodeGen/RISCV/typepromotion-overflow.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/urem-lkk.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/urem-seteq-illegal-types.ll (+22-22)
  • (modified) llvm/test/CodeGen/RISCV/urem-vector-lkk.ll (+16-16)
diff --git a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
index 28d64031f8bcb..09a31fb2306de 100644
--- a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+++ b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
@@ -48,6 +48,8 @@ using namespace llvm;
 STATISTIC(NumRemovedSExtW, "Number of removed sign-extensions");
 STATISTIC(NumTransformedToWInstrs,
           "Number of instructions transformed to W-ops");
+STATISTIC(NumTransformedToNonWInstrs,
+          "Number of instructions transformed to non-W-ops");
 
 static cl::opt<bool> DisableSExtWRemoval("riscv-disable-sextw-removal",
                                          cl::desc("Disable removal of sext.w"),
@@ -729,6 +731,7 @@ bool RISCVOptWInstrs::stripWSuffixes(MachineFunction &MF,
   for (MachineBasicBlock &MBB : MF) {
     for (MachineInstr &MI : MBB) {
       unsigned Opc;
+      // clang-format off
       switch (MI.getOpcode()) {
       default:
         continue;
@@ -736,10 +739,15 @@ bool RISCVOptWInstrs::stripWSuffixes(MachineFunction &MF,
       case RISCV::ADDIW: Opc = RISCV::ADDI; break;
       case RISCV::MULW:  Opc = RISCV::MUL;  break;
       case RISCV::SLLIW: Opc = RISCV::SLLI; break;
+      case RISCV::SUBW:  Opc = RISCV::SUB;  break;
       }
+      // clang-format on
 
       if (hasAllWUsers(MI, ST, MRI)) {
+        LLVM_DEBUG(dbgs() << "Replacing " << MI);
         MI.setDesc(TII.get(Opc));
+        LLVM_DEBUG(dbgs() << "     with " << MI);
+        ++NumTransformedToNonWInstrs;
         MadeChange = true;
       }
     }
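
With the debug output and counter in place, the effect of stripWSuffixes can be inspected from an assertions-enabled build. Assuming the pass's DEBUG_TYPE is riscv-opt-w-instrs (and using a made-up count below), an invocation along these lines:

  llc -mtriple=riscv64 -O2 -debug-only=riscv-opt-w-instrs -stats foo.ll -o /dev/null

would print a Replacing/with pair for each rewritten instruction, and the -stats summary would include a line such as:

  123 riscv-opt-w-instrs - Number of instructions transformed to non-W-ops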
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll b/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
index 4b999b892ed35..2a93585ea6876 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
@@ -66,7 +66,7 @@ define i32 @udiv_constant_add(i32 %a) nounwind {
 ; RV64IM-NEXT:    srli a2, a2, 32
 ; RV64IM-NEXT:    mul a1, a2, a1
 ; RV64IM-NEXT:    srli a1, a1, 32
-; RV64IM-NEXT:    subw a0, a0, a1
+; RV64IM-NEXT:    sub a0, a0, a1
 ; RV64IM-NEXT:    srliw a0, a0, 1
 ; RV64IM-NEXT:    add a0, a0, a1
 ; RV64IM-NEXT:    srliw a0, a0, 2
@@ -79,7 +79,7 @@ define i32 @udiv_constant_add(i32 %a) nounwind {
 ; RV64IMZB-NEXT:    zext.w a2, a0
 ; RV64IMZB-NEXT:    mul a1, a2, a1
 ; RV64IMZB-NEXT:    srli a1, a1, 32
-; RV64IMZB-NEXT:    subw a0, a0, a1
+; RV64IMZB-NEXT:    sub a0, a0, a1
 ; RV64IMZB-NEXT:    srliw a0, a0, 1
 ; RV64IMZB-NEXT:    add a0, a0, a1
 ; RV64IMZB-NEXT:    srliw a0, a0, 2
@@ -250,7 +250,7 @@ define i8 @udiv8_constant_add(i8 %a) nounwind {
 ; RV64-NEXT:    zext.b a2, a0
 ; RV64-NEXT:    mul a1, a2, a1
 ; RV64-NEXT:    srli a1, a1, 8
-; RV64-NEXT:    subw a0, a0, a1
+; RV64-NEXT:    sub a0, a0, a1
 ; RV64-NEXT:    zext.b a0, a0
 ; RV64-NEXT:    srli a0, a0, 1
 ; RV64-NEXT:    add a0, a0, a1
@@ -816,7 +816,7 @@ define i8 @sdiv8_constant_sub_srai(i8 %a) nounwind {
 ; RV64IM-NEXT:    mul a1, a2, a1
 ; RV64IM-NEXT:    slli a1, a1, 48
 ; RV64IM-NEXT:    srai a1, a1, 56
-; RV64IM-NEXT:    subw a1, a1, a0
+; RV64IM-NEXT:    sub a1, a1, a0
 ; RV64IM-NEXT:    slli a1, a1, 56
 ; RV64IM-NEXT:    srai a0, a1, 58
 ; RV64IM-NEXT:    zext.b a1, a0
@@ -1071,7 +1071,7 @@ define i16 @sdiv16_constant_sub_srai(i16 %a) nounwind {
 ; RV64IM-NEXT:    srai a2, a2, 48
 ; RV64IM-NEXT:    mul a1, a2, a1
 ; RV64IM-NEXT:    sraiw a1, a1, 16
-; RV64IM-NEXT:    subw a1, a1, a0
+; RV64IM-NEXT:    sub a1, a1, a0
 ; RV64IM-NEXT:    slli a1, a1, 48
 ; RV64IM-NEXT:    srai a0, a1, 51
 ; RV64IM-NEXT:    slli a1, a0, 48
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rotl-rotr.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rotl-rotr.ll
index 8a786fc9993d2..46d1661983c6a 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rotl-rotr.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rotl-rotr.ll
@@ -29,7 +29,7 @@ define i32 @rotl_32(i32 %x, i32 %y) nounwind {
 ;
 ; RV64I-LABEL: rotl_32:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    neg a2, a1
 ; RV64I-NEXT:    sllw a1, a0, a1
 ; RV64I-NEXT:    srlw a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -55,7 +55,7 @@ define i32 @rotl_32(i32 %x, i32 %y) nounwind {
 ;
 ; RV64XTHEADBB-LABEL: rotl_32:
 ; RV64XTHEADBB:       # %bb.0:
-; RV64XTHEADBB-NEXT:    negw a2, a1
+; RV64XTHEADBB-NEXT:    neg a2, a1
 ; RV64XTHEADBB-NEXT:    sllw a1, a0, a1
 ; RV64XTHEADBB-NEXT:    srlw a0, a0, a2
 ; RV64XTHEADBB-NEXT:    or a0, a1, a0
@@ -78,7 +78,7 @@ define i32 @rotr_32(i32 %x, i32 %y) nounwind {
 ;
 ; RV64I-LABEL: rotr_32:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    neg a2, a1
 ; RV64I-NEXT:    srlw a1, a0, a1
 ; RV64I-NEXT:    sllw a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -104,7 +104,7 @@ define i32 @rotr_32(i32 %x, i32 %y) nounwind {
 ;
 ; RV64XTHEADBB-LABEL: rotr_32:
 ; RV64XTHEADBB:       # %bb.0:
-; RV64XTHEADBB-NEXT:    negw a2, a1
+; RV64XTHEADBB-NEXT:    neg a2, a1
 ; RV64XTHEADBB-NEXT:    srlw a1, a0, a1
 ; RV64XTHEADBB-NEXT:    sllw a0, a0, a2
 ; RV64XTHEADBB-NEXT:    or a0, a1, a0
@@ -167,7 +167,7 @@ define i64 @rotl_64(i64 %x, i64 %y) nounwind {
 ;
 ; RV64I-LABEL: rotl_64:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    neg a2, a1
 ; RV64I-NEXT:    sll a1, a0, a1
 ; RV64I-NEXT:    srl a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -276,7 +276,7 @@ define i64 @rotl_64(i64 %x, i64 %y) nounwind {
 ;
 ; RV64XTHEADBB-LABEL: rotl_64:
 ; RV64XTHEADBB:       # %bb.0:
-; RV64XTHEADBB-NEXT:    negw a2, a1
+; RV64XTHEADBB-NEXT:    neg a2, a1
 ; RV64XTHEADBB-NEXT:    sll a1, a0, a1
 ; RV64XTHEADBB-NEXT:    srl a0, a0, a2
 ; RV64XTHEADBB-NEXT:    or a0, a1, a0
@@ -340,7 +340,7 @@ define i64 @rotr_64(i64 %x, i64 %y) nounwind {
 ;
 ; RV64I-LABEL: rotr_64:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    neg a2, a1
 ; RV64I-NEXT:    srl a1, a0, a1
 ; RV64I-NEXT:    sll a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -451,7 +451,7 @@ define i64 @rotr_64(i64 %x, i64 %y) nounwind {
 ;
 ; RV64XTHEADBB-LABEL: rotr_64:
 ; RV64XTHEADBB:       # %bb.0:
-; RV64XTHEADBB-NEXT:    negw a2, a1
+; RV64XTHEADBB-NEXT:    neg a2, a1
 ; RV64XTHEADBB-NEXT:    srl a1, a0, a1
 ; RV64XTHEADBB-NEXT:    sll a0, a0, a2
 ; RV64XTHEADBB-NEXT:    or a0, a1, a0
@@ -474,7 +474,7 @@ define i32 @rotl_32_mask(i32 %x, i32 %y) nounwind {
 ;
 ; RV64I-LABEL: rotl_32_mask:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    neg a2, a1
 ; RV64I-NEXT:    sllw a1, a0, a1
 ; RV64I-NEXT:    srlw a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -490,7 +490,7 @@ define i32 @rotl_32_mask(i32 %x, i32 %y) nounwind {
 ;
 ; RV64ZBB-LABEL: rotl_32_mask:
 ; RV64ZBB:       # %bb.0:
-; RV64ZBB-NEXT:    negw a2, a1
+; RV64ZBB-NEXT:    neg a2, a1
 ; RV64ZBB-NEXT:    sllw a1, a0, a1
 ; RV64ZBB-NEXT:    srlw a0, a0, a2
 ; RV64ZBB-NEXT:    or a0, a1, a0
@@ -506,7 +506,7 @@ define i32 @rotl_32_mask(i32 %x, i32 %y) nounwind {
 ;
 ; RV64XTHEADBB-LABEL: rotl_32_mask:
 ; RV64XTHEADBB:       # %bb.0:
-; RV64XTHEADBB-NEXT:    negw a2, a1
+; RV64XTHEADBB-NEXT:    neg a2, a1
 ; RV64XTHEADBB-NEXT:    sllw a1, a0, a1
 ; RV64XTHEADBB-NEXT:    srlw a0, a0, a2
 ; RV64XTHEADBB-NEXT:    or a0, a1, a0
@@ -531,7 +531,7 @@ define i32 @rotl_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
 ; RV64I-LABEL: rotl_32_mask_and_63_and_31:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    sllw a2, a0, a1
-; RV64I-NEXT:    negw a1, a1
+; RV64I-NEXT:    neg a1, a1
 ; RV64I-NEXT:    srlw a0, a0, a1
 ; RV64I-NEXT:    or a0, a2, a0
 ; RV64I-NEXT:    ret
@@ -547,7 +547,7 @@ define i32 @rotl_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
 ; RV64ZBB-LABEL: rotl_32_mask_and_63_and_31:
 ; RV64ZBB:       # %bb.0:
 ; RV64ZBB-NEXT:    sllw a2, a0, a1
-; RV64ZBB-NEXT:    negw a1, a1
+; RV64ZBB-NEXT:    neg a1, a1
 ; RV64ZBB-NEXT:    srlw a0, a0, a1
 ; RV64ZBB-NEXT:    or a0, a2, a0
 ; RV64ZBB-NEXT:    ret
@@ -563,7 +563,7 @@ define i32 @rotl_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
 ; RV64XTHEADBB-LABEL: rotl_32_mask_and_63_and_31:
 ; RV64XTHEADBB:       # %bb.0:
 ; RV64XTHEADBB-NEXT:    sllw a2, a0, a1
-; RV64XTHEADBB-NEXT:    negw a1, a1
+; RV64XTHEADBB-NEXT:    neg a1, a1
 ; RV64XTHEADBB-NEXT:    srlw a0, a0, a1
 ; RV64XTHEADBB-NEXT:    or a0, a2, a0
 ; RV64XTHEADBB-NEXT:    ret
@@ -632,7 +632,7 @@ define i32 @rotr_32_mask(i32 %x, i32 %y) nounwind {
 ;
 ; RV64I-LABEL: rotr_32_mask:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    neg a2, a1
 ; RV64I-NEXT:    srlw a1, a0, a1
 ; RV64I-NEXT:    sllw a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -648,7 +648,7 @@ define i32 @rotr_32_mask(i32 %x, i32 %y) nounwind {
 ;
 ; RV64ZBB-LABEL: rotr_32_mask:
 ; RV64ZBB:       # %bb.0:
-; RV64ZBB-NEXT:    negw a2, a1
+; RV64ZBB-NEXT:    neg a2, a1
 ; RV64ZBB-NEXT:    srlw a1, a0, a1
 ; RV64ZBB-NEXT:    sllw a0, a0, a2
 ; RV64ZBB-NEXT:    or a0, a1, a0
@@ -664,7 +664,7 @@ define i32 @rotr_32_mask(i32 %x, i32 %y) nounwind {
 ;
 ; RV64XTHEADBB-LABEL: rotr_32_mask:
 ; RV64XTHEADBB:       # %bb.0:
-; RV64XTHEADBB-NEXT:    negw a2, a1
+; RV64XTHEADBB-NEXT:    neg a2, a1
 ; RV64XTHEADBB-NEXT:    srlw a1, a0, a1
 ; RV64XTHEADBB-NEXT:    sllw a0, a0, a2
 ; RV64XTHEADBB-NEXT:    or a0, a1, a0
@@ -689,7 +689,7 @@ define i32 @rotr_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
 ; RV64I-LABEL: rotr_32_mask_and_63_and_31:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    srlw a2, a0, a1
-; RV64I-NEXT:    negw a1, a1
+; RV64I-NEXT:    neg a1, a1
 ; RV64I-NEXT:    sllw a0, a0, a1
 ; RV64I-NEXT:    or a0, a2, a0
 ; RV64I-NEXT:    ret
@@ -705,7 +705,7 @@ define i32 @rotr_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
 ; RV64ZBB-LABEL: rotr_32_mask_and_63_and_31:
 ; RV64ZBB:       # %bb.0:
 ; RV64ZBB-NEXT:    srlw a2, a0, a1
-; RV64ZBB-NEXT:    negw a1, a1
+; RV64ZBB-NEXT:    neg a1, a1
 ; RV64ZBB-NEXT:    sllw a0, a0, a1
 ; RV64ZBB-NEXT:    or a0, a2, a0
 ; RV64ZBB-NEXT:    ret
@@ -721,7 +721,7 @@ define i32 @rotr_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
 ; RV64XTHEADBB-LABEL: rotr_32_mask_and_63_and_31:
 ; RV64XTHEADBB:       # %bb.0:
 ; RV64XTHEADBB-NEXT:    srlw a2, a0, a1
-; RV64XTHEADBB-NEXT:    negw a1, a1
+; RV64XTHEADBB-NEXT:    neg a1, a1
 ; RV64XTHEADBB-NEXT:    sllw a0, a0, a1
 ; RV64XTHEADBB-NEXT:    or a0, a2, a0
 ; RV64XTHEADBB-NEXT:    ret
@@ -829,7 +829,7 @@ define i64 @rotl_64_mask(i64 %x, i64 %y) nounwind {
 ;
 ; RV64I-LABEL: rotl_64_mask:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    neg a2, a1
 ; RV64I-NEXT:    sll a1, a0, a1
 ; RV64I-NEXT:    srl a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -884,7 +884,7 @@ define i64 @rotl_64_mask(i64 %x, i64 %y) nounwind {
 ;
 ; RV64ZBB-LABEL: rotl_64_mask:
 ; RV64ZBB:       # %bb.0:
-; RV64ZBB-NEXT:    negw a2, a1
+; RV64ZBB-NEXT:    neg a2, a1
 ; RV64ZBB-NEXT:    sll a1, a0, a1
 ; RV64ZBB-NEXT:    srl a0, a0, a2
 ; RV64ZBB-NEXT:    or a0, a1, a0
@@ -939,7 +939,7 @@ define i64 @rotl_64_mask(i64 %x, i64 %y) nounwind {
 ;
 ; RV64XTHEADBB-LABEL: rotl_64_mask:
 ; RV64XTHEADBB:       # %bb.0:
-; RV64XTHEADBB-NEXT:    negw a2, a1
+; RV64XTHEADBB-NEXT:    neg a2, a1
 ; RV64XTHEADBB-NEXT:    sll a1, a0, a1
 ; RV64XTHEADBB-NEXT:    srl a0, a0, a2
 ; RV64XTHEADBB-NEXT:    or a0, a1, a0
@@ -1005,7 +1005,7 @@ define i64 @rotl_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
 ; RV64I-LABEL: rotl_64_mask_and_127_and_63:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    sll a2, a0, a1
-; RV64I-NEXT:    negw a1, a1
+; RV64I-NEXT:    neg a1, a1
 ; RV64I-NEXT:    srl a0, a0, a1
 ; RV64I-NEXT:    or a0, a2, a0
 ; RV64I-NEXT:    ret
@@ -1062,7 +1062,7 @@ define i64 @rotl_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
 ; RV64ZBB-LABEL: rotl_64_mask_and_127_and_63:
 ; RV64ZBB:       # %bb.0:
 ; RV64ZBB-NEXT:    sll a2, a0, a1
-; RV64ZBB-NEXT:    negw a1, a1
+; RV64ZBB-NEXT:    neg a1, a1
 ; RV64ZBB-NEXT:    srl a0, a0, a1
 ; RV64ZBB-NEXT:    or a0, a2, a0
 ; RV64ZBB-NEXT:    ret
@@ -1119,7 +1119,7 @@ define i64 @rotl_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
 ; RV64XTHEADBB-LABEL: rotl_64_mask_and_127_and_63:
 ; RV64XTHEADBB:       # %bb.0:
 ; RV64XTHEADBB-NEXT:    sll a2, a0, a1
-; RV64XTHEADBB-NEXT:    negw a1, a1
+; RV64XTHEADBB-NEXT:    neg a1, a1
 ; RV64XTHEADBB-NEXT:    srl a0, a0, a1
 ; RV64XTHEADBB-NEXT:    or a0, a2, a0
 ; RV64XTHEADBB-NEXT:    ret
@@ -1277,7 +1277,7 @@ define i64 @rotr_64_mask(i64 %x, i64 %y) nounwind {
 ;
 ; RV64I-LABEL: rotr_64_mask:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    negw a2, a1
+; RV64I-NEXT:    neg a2, a1
 ; RV64I-NEXT:    srl a1, a0, a1
 ; RV64I-NEXT:    sll a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -1331,7 +1331,7 @@ define i64 @rotr_64_mask(i64 %x, i64 %y) nounwind {
 ;
 ; RV64ZBB-LABEL: rotr_64_mask:
 ; RV64ZBB:       # %bb.0:
-; RV64ZBB-NEXT:    negw a2, a1
+; RV64ZBB-NEXT:    neg a2, a1
 ; RV64ZBB-NEXT:    srl a1, a0, a1
 ; RV64ZBB-NEXT:    sll a0, a0, a2
 ; RV64ZBB-NEXT:    or a0, a1, a0
@@ -1385,7 +1385,7 @@ define i64 @rotr_64_mask(i64 %x, i64 %y) nounwind {
 ;
 ; RV64XTHEADBB-LABEL: rotr_64_mask:
 ; RV64XTHEADBB:       # %bb.0:
-; RV64XTHEADBB-NEXT:    negw a2, a1
+; RV64XTHEADBB-NEXT:    neg a2, a1
 ; RV64XTHEADBB-NEXT:    srl a1, a0, a1
 ; RV64XTHEADBB-NEXT:    sll a0, a0, a2
 ; RV64XTHEADBB-NEXT:    or a0, a1, a0
@@ -1451,7 +1451,7 @@ define i64 @rotr_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
 ; RV64I-LABEL: rotr_64_mask_and_127_and_63:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    srl a2, a0, a1
-; RV64I-NEXT:    negw a1, a1
+; RV64I-NEXT:    neg a1, a1
 ; RV64I-NEXT:    sll a0, a0, a1
 ; RV64I-NEXT:    or a0, a2, a0
 ; RV64I-NEXT:    ret
@@ -1508,7 +1508,7 @@ define i64 @rotr_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
 ; RV64ZBB-LABEL: rotr_64_mask_and_127_and_63:
 ; RV64ZBB:       # %bb.0:
 ; RV64ZBB-NEXT:    srl a2, a0, a1
-; RV64ZBB-NEXT:    negw a1, a1
+; RV64ZBB-NEXT:    neg a1, a1
 ; RV64ZBB-NEXT:    sll a0, a0, a1
 ; RV64ZBB-NEXT:    or a0, a2, a0
 ; RV64ZBB-NEXT:    ret
@@ -1565,7 +1565,7 @@ define i64 @rotr_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
 ; RV64XTHEADBB-LABEL: rotr_64_mask_and_127_and_63:
 ; RV64XTHEADBB:       # %bb.0:
 ; RV64XTHEADBB-NEXT:    srl a2, a0, a1
-; RV64XTHEADBB-NEXT:    negw a1, a1
+; RV64XTHEADBB-NEXT:    neg a1, a1
 ; RV64XTHEADBB-NEXT:    sll a0, a0, a1
 ; RV64XTHEADBB-NEXT:    or a0, a2, a0
 ; RV64XTHEADBB-NEXT:    ret
@@ -1701,7 +1701,7 @@ define signext i32 @rotl_32_mask_shared(i32 signext %a, i32 signext %b, i32 sign
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    andi a3, a2, 31
 ; RV64I-NEXT:    sllw a4, a0, a2
-; RV64I-NEXT:    negw a3, a3
+; RV64I-NEXT:    neg a3, a3
 ; RV64I-NEXT:    srlw a0, a0, a3
 ; RV64I-NEXT:    or a0, a4, a0
 ; RV64I-NEXT:    sllw a1, a1, a2
@@ -1737,7 +1737,7 @@ define signext i32 @rotl_32_mask_shared(i32 signext %a, i32 signext %b, i32 sign
 ; RV64XTHEADBB:       # %bb.0:
 ; RV64XTHEADBB-NEXT:    andi a3, a2, 31
 ; RV64XTHEADBB-NEXT:    sllw a4, a0, a2
-; RV64XTHEADBB-NEXT:    negw a3, a3
+; RV64XTHEADBB-NEXT:    neg a3, a3
 ; RV64XTHEADBB-NEXT:    srlw a0, a0, a3
 ; RV64XTHEADBB-NEXT:    or a0, a4, a0
 ; RV64XTHEADBB-NEXT:    sllw a1, a1, a2
@@ -1822,7 +1822,7 @@ define signext i64 @rotl_64_mask_shared(i64 signext %a, i64 signext %b, i64 sign
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    andi a3, a2, 63
 ; RV64I-NEXT:    sll a4, a0, a2
-; RV64I-NEXT:    negw a3, a3
+; RV64I-NEXT:    neg a3, a3
 ; RV64I-NEXT:    srl a0, a0, a3
 ; RV64I-NEXT:    or a0, a4, a0
 ; RV64I-NEXT:    sll a1, a1, a2
@@ -1972,7 +1972,7 @@ define signext i64 @rotl_64_mask_shared(i64 signext %a, i64 signext %b, i64 sign
 ; RV64XTHEADBB:       # %bb.0:
 ; RV64XTHEADBB-NEXT:    andi a3, a2, 63
 ; RV64XTHEADBB-NEXT:    sll a4, a0, a2
-; RV64XTHEADBB-NEXT:    negw a3, a3
+; RV64XTHEADBB-NEXT:    neg a3, a3
 ; RV64XTHEADBB-NEXT:    srl a0, a0, a3
 ; RV64XTHEADBB-NEXT:    or a0, a4, a0
 ; RV64XTHEADBB-NEXT:    sll a1, a1, a2
@@ -2002,7 +2002,7 @@ define signext i32 @rotr_32_mask_shared(i32 signext %a, i32 signext %b, i32 sign
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    andi a3, a2, 31
 ; RV64I-NEXT:    srlw a4, a0, a2
-; RV64I-NEXT:    negw a3, a3
+; RV64I-NEXT:    neg a3, a3
 ; RV64I-NEXT:    sllw a0, a0, a3
 ; RV64I-NEXT:    or a0, a4, a0
 ; RV64I-NEXT:    sllw a1, a1, a2
@@ -2038,7 +2038,7 @@ define signext i32 @rotr_32_mask_shared(i32 signext %a, i32 signext %b, i32 sign
 ; RV64XTHEADBB:       # %bb.0:
 ; RV64XTHEADBB-NEXT:    andi a3, a2, 31
 ; RV64XTHEADBB-NEXT:    srlw a4, a0, a2
-; RV64XTHEADBB-NEXT:    negw a3, a3
+; RV64XTHEADBB-NEXT:    neg a3, a3
 ; RV64XTHEADBB-NEXT:    sllw a0, a0, a3
 ; RV64XTHEADBB-NEXT:    or a0, a4, a0
 ; RV64XTHEADBB-NEXT:    sllw a1, a1, a2
@@ -2125,7 +2125,7 @@ define signext i64 @rotr_64_mask_shared(i64 signext %a, i64 signext %b, i64 sign
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    andi a3, a2, 63
 ; RV64I-NEXT:    srl a4, a0, a2
-; RV64I-NEXT:    negw a3, a3
+; RV64I-NEXT:    neg a3, a3
 ; RV64I-NEXT:    sll a0, a0, a3
 ; RV64I-NEXT:    or a0, a4, a0
 ; RV64I-NEXT:    sll a1, a1, a2
@@ -2279,7 +2279,7 @@ define signext i64 @rotr_64_mask_shared(i64 signext %a, i64 signext %b, i64 sign
 ; RV64XTHEADBB:       # %bb.0:
 ; RV64XTHEADBB-NEXT:    andi a3, a2, 63
 ; RV64XTHEADBB-NEXT:    srl a4, a0, a2
-; RV64XTHEADBB-NEXT:    negw a3, a3
+; RV64XTHEADBB-NEXT:    neg a3, a3
 ; RV64XTHEADBB-NEXT:    sll a0, a0, a3
 ; RV64XTHEADBB-NEXT:    or a0, a4, a0
 ; RV64XTHEADBB-NEXT:    sll a1, a1, a2
@@ -2312,8 +2312,8 @@ define signext i32 @rotl_32_mask_multiple(i32 signext %a, i32 signext %b, i32 si
 ; RV64I-NEXT:    andi a3, a2, 31
 ; RV64I-NEXT:    sllw a4, a0, a2
 ; RV64I-NEXT:    sllw a2, a1, a2
-; RV64I-NEXT:    negw a5, a3
-; RV64I-NEXT:    negw a3, a3
+; RV64I-NEXT:    neg a5, a3
+; RV64I-NEXT:    neg a3, a3
 ; RV64I-NEXT:    srlw a0, a0, a5
 ; RV64I-NEXT:    srlw a1, a1, a3
 ; RV64I-NEXT:    or a0, a4, a0
@@ -2353,8 +2353,8 @@ define signext i32 @rotl_32_mask_multiple(i32 signext %a, i32 signext %b, i32 si
 ; RV64XTHEADBB-NEXT:    andi a3, a2, 31
 ; RV64XTHEADBB-NEXT:    sllw a4, a0, a2
 ; RV64XTHEADBB-NEXT:    sllw a2, a1, a2
-; RV64XTHEADBB-NEXT:    negw a5, a3
-; RV64XTHEADBB-NEXT:    negw a3, a3
+; RV64XTHEADBB-NEXT:    neg a5, a3
+; RV64XTHEADBB-NEXT:    neg a3, a3
 ; RV64XTHEADBB-NEXT:    srlw a0, a0, a5
 ; RV64XTHEADBB-NEXT:    srlw a1, a1, a3
 ; RV64XTHEADBB-NEXT:    or a0, a4, a0
@@ -2464,7 +2464,7 @@ define i64 @rotl_64_mask_multiple(i64 %a, i64 %b, i64 %amt) nounwind {
 ; RV64I-NEXT:    andi a3, a2, 63
 ; RV64I-NEXT:    sll a4, a0, a2
 ; RV64I-NEXT:    sll a2, a1, a2
-; RV64I-NEXT:    negw a3, a3
+; RV64I-NEXT:    neg a3, a3
 ; RV64I-NEXT:    srl a0, a0, a3
 ; RV64I-NEXT:    srl a1, a1, a3
 ; RV64I-NEXT:    or a0, a4, a0
@@ -2664,7 +2664,7 @@ define i64 @rotl_64_mask_multiple(i64 %a, i64 %b, i64 %amt) nounwind {
 ; RV64XTHEADBB-NEXT:    andi a3, a2, 63
 ; RV64XTHEADBB-NEXT:    sll a4, a0, a2
 ; RV64XTHEADBB-NEXT:    sll a2, a1, a2
-; RV64XTHEADBB-NEXT:    negw a3, a3
+; RV64XTHEADBB-NEXT:    neg a3, a3
 ; RV64XTHEADBB-NEXT:    srl a0, a0, a3
 ; RV64XTHEADBB-NEXT:    srl a1, a1, a3
 ; RV64XTHEADBB-NEXT:    or a0, a4, a0
@@ -2697,8 +2697,8 @@ define signext i32 @rotr_32_mask_multiple(i32 signext %a, i32 signext %b, i32 si
 ; RV64I-NEXT:    andi a3, a2, 31
 ; RV64I-NEXT:    srlw a4, a0, a2
 ; RV64I-NEXT:    srlw a2, a1, a2
-; RV64I-NEXT:    negw a5, a3
-; RV64I-NEXT:    negw a3, a3
+; RV64I-NEXT:    neg a5, a3
+; RV64I-NEXT:    neg a3, a3
 ; RV64I-NEXT:    sllw a0, a0, a5
 ; RV64I-NEXT:    sllw a1, a1, a3
 ; RV64I-NEXT:    or a0, a4, a0
@@ -2738,8 +2738,8 @@ define signext i32 @rotr_32_mask_multiple(i32 signext %a, i32 signext %b, i32 si
 ; RV64XTHEADBB-NEXT:    andi a3, a2, 31
 ; RV64XTHEADBB-NEXT:    srlw a4, a0, a2
 ; RV64XTHEADBB-NEXT:    srlw a2, a1, a2
-; RV64XTHEADBB-NEXT:    negw a5, a3
-; RV64XTHEADBB-NEXT:    negw a3, a3
+; RV64XTHEADBB-NEXT:    neg a5, a3
+; RV64XTHEADBB-NEXT:    neg a3, a3
 ; RV64XTHEADBB-NEXT:    sllw a0, a0, a5
 ; RV64XTHEADBB-NEXT:    sllw a1, a1, a3
 ; RV64XTHEADBB-NEXT:    or a0, a4, a0
@@ -2850,7 +2850,7 @@ define i64 @rotr_64_mask_multiple(i64 %a, i64 %b, i64 %amt) nounwind {
 ; RV64I-NEXT:    andi a3, a2, 63
 ; RV64I-NEXT:    srl a4, a0, a2
 ; RV64I-NEXT:    srl a2, a1, a2
-; RV64I-NEXT:    negw a3, a3
+; RV64I-NEXT:    neg a3, a3
 ; RV64I-NEXT:    sll a0, a0, a3
 ; RV64I-NEXT:    sll a1, a1, a3
 ; RV64I-NEXT:    or a0, a4, a0
@@ -3052,7 +3052,7 @@ define i64 @rotr_64_mask_multiple(i64 %a, i64 %b, i64 %amt) nounwind {
 ; RV64XTHEADBB-NEXT:    andi a3, a2, 63
 ; RV64XTHEADBB-NEXT:    srl a4, a0, a2
 ; RV64XTHEADBB-NEXT:    srl ...
[truncated]

llvmbot commented Jul 16, 2025

@llvm/pr-subscribers-llvm-globalisel


@topperc topperc left a comment

LGTM

asb added a commit that referenced this pull request Jul 20, 2025
…trs extend debug printing

RISCVOptWInstrs had a NumTransformedToWInstrs statistic, but none for the
W=>Non-W transform done by stripWSuffixes, and it didn't print debug
output for the transformation. This patch addresses both issues.

Reviewed as part of <#149071>,
but landing separately.
@asb asb merged commit c58225f into llvm:main Jul 20, 2025
7 of 9 checks passed
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 20, 2025
…ISCVOptWInstrs extend debug printing

RISCVOptWInstrs had a NumTransformedToWInstrs statistic, but none for the
W=>Non-W transform done by stripWSuffixes, and it didn't print debug
output for the transformation. This patch addresses both issues.

Reviewed as part of <llvm/llvm-project#149071>,
but landing separately.
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
…trs extend debug printing

RISCVOptWInstrs had a NumTransformedToWInstrs statistic, but none for the
W=>Non-W transform done by stripWSuffixes, and it didn't print debug
output for the transformation. This patch addresses both issues.

Reviewed as part of <llvm#149071>,
but landing separately.
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
The benefit here is purely in reducing unnecessary diffs between RV32 and
RV64, as RVC does have a compressed form of SUBW (so SUB isn't more
compressible). This affects ~57.2k instructions in an rva22u64 build of
llvm-test-suite with SPEC CPU 2017 included.