[RISCV] Add RISCV::SUBW to RISCVOptWInstrs::stripWSuffixes #149071
Conversation
…trs extend debug printing
RISCVOptWInstrs has a NumTransformedToWInstrs statistic, but didn't have one for the W=>Non-W transform done by stripWSuffixes. It also didn't do debug printing of the transformation. This patch addresses both issues.
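With an assertions-enabled build, the effect of this commit can be observed through LLVM's standard statistics and debug machinery. A minimal sketch (the -debug-only name riscv-opt-w-instrs is an assumption based on the pass's usual DEBUG_TYPE convention, which is not shown in this excerpt):

llc -mtriple=riscv64 -stats -debug-only=riscv-opt-w-instrs input.ll -o /dev/null

Here -stats would report the new NumTransformedToNonWInstrs counter alongside the existing ones, and -debug-only would show the Replacing/with trace added in the patch below.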
This is purely a benefit for reducing unnecessary diffs between RV32 and RV64: RVC has a compressed form of SUBW (c.subw), so SUB isn't any more compressible than SUBW. This affects ~57.2k instructions in an rva22u64 build of llvm-test-suite with SPEC CPU 2017 included.
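As an illustration drawn from the div-by-constant test updates below: when every user of a SUBW result reads only the low 32 bits, the W-suffixed instruction can be dropped without changing observable behaviour.

# Before: subw computes a 32-bit difference and sign-extends it to 64 bits.
subw  a0, a0, a1
srliw a0, a0, 1    # srliw reads only the low 32 bits of a0

# After stripWSuffixes: bits 32-63 are never observed, so sub is equivalent.
sub   a0, a0, a1
srliw a0, a0, 1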
@llvm/pr-subscribers-backend-risc-v

Author: Alex Bradbury (asb)

Changes

This is purely a benefit for reducing unnecessary diffs between RV32 and RV64: RVC has a compressed form of SUBW (c.subw), so SUB isn't any more compressible than SUBW. This affects ~57.2k instructions in an rva22u64 build of llvm-test-suite with SPEC CPU 2017 included.

Note this PR also includes a trivial commit that adds a debug print and a stat counter to stripWSuffixes. I don't think this is worth a separate PR, but speak up if you have any comments on it.

Patch is 165.97 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/149071.diff

53 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
index 28d64031f8bcb..09a31fb2306de 100644
--- a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+++ b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
@@ -48,6 +48,8 @@ using namespace llvm;
STATISTIC(NumRemovedSExtW, "Number of removed sign-extensions");
STATISTIC(NumTransformedToWInstrs,
"Number of instructions transformed to W-ops");
+STATISTIC(NumTransformedToNonWInstrs,
+ "Number of instructions transformed to non-W-ops");
static cl::opt<bool> DisableSExtWRemoval("riscv-disable-sextw-removal",
cl::desc("Disable removal of sext.w"),
@@ -729,6 +731,7 @@ bool RISCVOptWInstrs::stripWSuffixes(MachineFunction &MF,
for (MachineBasicBlock &MBB : MF) {
for (MachineInstr &MI : MBB) {
unsigned Opc;
+ // clang-format off
switch (MI.getOpcode()) {
default:
continue;
@@ -736,10 +739,15 @@ bool RISCVOptWInstrs::stripWSuffixes(MachineFunction &MF,
case RISCV::ADDIW: Opc = RISCV::ADDI; break;
case RISCV::MULW: Opc = RISCV::MUL; break;
case RISCV::SLLIW: Opc = RISCV::SLLI; break;
+ case RISCV::SUBW: Opc = RISCV::SUB; break;
}
+ // clang-format on
if (hasAllWUsers(MI, ST, MRI)) {
+ LLVM_DEBUG(dbgs() << "Replacing " << MI);
MI.setDesc(TII.get(Opc));
+ LLVM_DEBUG(dbgs() << " with " << MI);
+ ++NumTransformedToNonWInstrs;
MadeChange = true;
}
}
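To see why the hasAllWUsers guard matters, consider a hypothetical counter-example (not from this patch): if any user reads the upper bits, the sign-extension performed by SUBW becomes observable and the substitution would be wrong.

# Suppose a0 = 0x80000000 and a1 = 0:
subw a0, a0, a1   # a0 = 0xFFFFFFFF80000000 (sign-extended 32-bit difference)
srai a0, a0, 2    # 64-bit shift reads bits 32-63, so subw must NOT become sub
# With sub, a0 would instead hold 0x0000000080000000 before the shift.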
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll b/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
index 4b999b892ed35..2a93585ea6876 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/div-by-constant.ll
@@ -66,7 +66,7 @@ define i32 @udiv_constant_add(i32 %a) nounwind {
; RV64IM-NEXT: srli a2, a2, 32
; RV64IM-NEXT: mul a1, a2, a1
; RV64IM-NEXT: srli a1, a1, 32
-; RV64IM-NEXT: subw a0, a0, a1
+; RV64IM-NEXT: sub a0, a0, a1
; RV64IM-NEXT: srliw a0, a0, 1
; RV64IM-NEXT: add a0, a0, a1
; RV64IM-NEXT: srliw a0, a0, 2
@@ -79,7 +79,7 @@ define i32 @udiv_constant_add(i32 %a) nounwind {
; RV64IMZB-NEXT: zext.w a2, a0
; RV64IMZB-NEXT: mul a1, a2, a1
; RV64IMZB-NEXT: srli a1, a1, 32
-; RV64IMZB-NEXT: subw a0, a0, a1
+; RV64IMZB-NEXT: sub a0, a0, a1
; RV64IMZB-NEXT: srliw a0, a0, 1
; RV64IMZB-NEXT: add a0, a0, a1
; RV64IMZB-NEXT: srliw a0, a0, 2
@@ -250,7 +250,7 @@ define i8 @udiv8_constant_add(i8 %a) nounwind {
; RV64-NEXT: zext.b a2, a0
; RV64-NEXT: mul a1, a2, a1
; RV64-NEXT: srli a1, a1, 8
-; RV64-NEXT: subw a0, a0, a1
+; RV64-NEXT: sub a0, a0, a1
; RV64-NEXT: zext.b a0, a0
; RV64-NEXT: srli a0, a0, 1
; RV64-NEXT: add a0, a0, a1
@@ -816,7 +816,7 @@ define i8 @sdiv8_constant_sub_srai(i8 %a) nounwind {
; RV64IM-NEXT: mul a1, a2, a1
; RV64IM-NEXT: slli a1, a1, 48
; RV64IM-NEXT: srai a1, a1, 56
-; RV64IM-NEXT: subw a1, a1, a0
+; RV64IM-NEXT: sub a1, a1, a0
; RV64IM-NEXT: slli a1, a1, 56
; RV64IM-NEXT: srai a0, a1, 58
; RV64IM-NEXT: zext.b a1, a0
@@ -1071,7 +1071,7 @@ define i16 @sdiv16_constant_sub_srai(i16 %a) nounwind {
; RV64IM-NEXT: srai a2, a2, 48
; RV64IM-NEXT: mul a1, a2, a1
; RV64IM-NEXT: sraiw a1, a1, 16
-; RV64IM-NEXT: subw a1, a1, a0
+; RV64IM-NEXT: sub a1, a1, a0
; RV64IM-NEXT: slli a1, a1, 48
; RV64IM-NEXT: srai a0, a1, 51
; RV64IM-NEXT: slli a1, a0, 48
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rotl-rotr.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rotl-rotr.ll
index 8a786fc9993d2..46d1661983c6a 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rotl-rotr.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rotl-rotr.ll
@@ -29,7 +29,7 @@ define i32 @rotl_32(i32 %x, i32 %y) nounwind {
;
; RV64I-LABEL: rotl_32:
; RV64I: # %bb.0:
-; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: neg a2, a1
; RV64I-NEXT: sllw a1, a0, a1
; RV64I-NEXT: srlw a0, a0, a2
; RV64I-NEXT: or a0, a1, a0
@@ -55,7 +55,7 @@ define i32 @rotl_32(i32 %x, i32 %y) nounwind {
;
; RV64XTHEADBB-LABEL: rotl_32:
; RV64XTHEADBB: # %bb.0:
-; RV64XTHEADBB-NEXT: negw a2, a1
+; RV64XTHEADBB-NEXT: neg a2, a1
; RV64XTHEADBB-NEXT: sllw a1, a0, a1
; RV64XTHEADBB-NEXT: srlw a0, a0, a2
; RV64XTHEADBB-NEXT: or a0, a1, a0
@@ -78,7 +78,7 @@ define i32 @rotr_32(i32 %x, i32 %y) nounwind {
;
; RV64I-LABEL: rotr_32:
; RV64I: # %bb.0:
-; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: neg a2, a1
; RV64I-NEXT: srlw a1, a0, a1
; RV64I-NEXT: sllw a0, a0, a2
; RV64I-NEXT: or a0, a1, a0
@@ -104,7 +104,7 @@ define i32 @rotr_32(i32 %x, i32 %y) nounwind {
;
; RV64XTHEADBB-LABEL: rotr_32:
; RV64XTHEADBB: # %bb.0:
-; RV64XTHEADBB-NEXT: negw a2, a1
+; RV64XTHEADBB-NEXT: neg a2, a1
; RV64XTHEADBB-NEXT: srlw a1, a0, a1
; RV64XTHEADBB-NEXT: sllw a0, a0, a2
; RV64XTHEADBB-NEXT: or a0, a1, a0
@@ -167,7 +167,7 @@ define i64 @rotl_64(i64 %x, i64 %y) nounwind {
;
; RV64I-LABEL: rotl_64:
; RV64I: # %bb.0:
-; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: neg a2, a1
; RV64I-NEXT: sll a1, a0, a1
; RV64I-NEXT: srl a0, a0, a2
; RV64I-NEXT: or a0, a1, a0
@@ -276,7 +276,7 @@ define i64 @rotl_64(i64 %x, i64 %y) nounwind {
;
; RV64XTHEADBB-LABEL: rotl_64:
; RV64XTHEADBB: # %bb.0:
-; RV64XTHEADBB-NEXT: negw a2, a1
+; RV64XTHEADBB-NEXT: neg a2, a1
; RV64XTHEADBB-NEXT: sll a1, a0, a1
; RV64XTHEADBB-NEXT: srl a0, a0, a2
; RV64XTHEADBB-NEXT: or a0, a1, a0
@@ -340,7 +340,7 @@ define i64 @rotr_64(i64 %x, i64 %y) nounwind {
;
; RV64I-LABEL: rotr_64:
; RV64I: # %bb.0:
-; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: neg a2, a1
; RV64I-NEXT: srl a1, a0, a1
; RV64I-NEXT: sll a0, a0, a2
; RV64I-NEXT: or a0, a1, a0
@@ -451,7 +451,7 @@ define i64 @rotr_64(i64 %x, i64 %y) nounwind {
;
; RV64XTHEADBB-LABEL: rotr_64:
; RV64XTHEADBB: # %bb.0:
-; RV64XTHEADBB-NEXT: negw a2, a1
+; RV64XTHEADBB-NEXT: neg a2, a1
; RV64XTHEADBB-NEXT: srl a1, a0, a1
; RV64XTHEADBB-NEXT: sll a0, a0, a2
; RV64XTHEADBB-NEXT: or a0, a1, a0
@@ -474,7 +474,7 @@ define i32 @rotl_32_mask(i32 %x, i32 %y) nounwind {
;
; RV64I-LABEL: rotl_32_mask:
; RV64I: # %bb.0:
-; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: neg a2, a1
; RV64I-NEXT: sllw a1, a0, a1
; RV64I-NEXT: srlw a0, a0, a2
; RV64I-NEXT: or a0, a1, a0
@@ -490,7 +490,7 @@ define i32 @rotl_32_mask(i32 %x, i32 %y) nounwind {
;
; RV64ZBB-LABEL: rotl_32_mask:
; RV64ZBB: # %bb.0:
-; RV64ZBB-NEXT: negw a2, a1
+; RV64ZBB-NEXT: neg a2, a1
; RV64ZBB-NEXT: sllw a1, a0, a1
; RV64ZBB-NEXT: srlw a0, a0, a2
; RV64ZBB-NEXT: or a0, a1, a0
@@ -506,7 +506,7 @@ define i32 @rotl_32_mask(i32 %x, i32 %y) nounwind {
;
; RV64XTHEADBB-LABEL: rotl_32_mask:
; RV64XTHEADBB: # %bb.0:
-; RV64XTHEADBB-NEXT: negw a2, a1
+; RV64XTHEADBB-NEXT: neg a2, a1
; RV64XTHEADBB-NEXT: sllw a1, a0, a1
; RV64XTHEADBB-NEXT: srlw a0, a0, a2
; RV64XTHEADBB-NEXT: or a0, a1, a0
@@ -531,7 +531,7 @@ define i32 @rotl_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
; RV64I-LABEL: rotl_32_mask_and_63_and_31:
; RV64I: # %bb.0:
; RV64I-NEXT: sllw a2, a0, a1
-; RV64I-NEXT: negw a1, a1
+; RV64I-NEXT: neg a1, a1
; RV64I-NEXT: srlw a0, a0, a1
; RV64I-NEXT: or a0, a2, a0
; RV64I-NEXT: ret
@@ -547,7 +547,7 @@ define i32 @rotl_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
; RV64ZBB-LABEL: rotl_32_mask_and_63_and_31:
; RV64ZBB: # %bb.0:
; RV64ZBB-NEXT: sllw a2, a0, a1
-; RV64ZBB-NEXT: negw a1, a1
+; RV64ZBB-NEXT: neg a1, a1
; RV64ZBB-NEXT: srlw a0, a0, a1
; RV64ZBB-NEXT: or a0, a2, a0
; RV64ZBB-NEXT: ret
@@ -563,7 +563,7 @@ define i32 @rotl_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
; RV64XTHEADBB-LABEL: rotl_32_mask_and_63_and_31:
; RV64XTHEADBB: # %bb.0:
; RV64XTHEADBB-NEXT: sllw a2, a0, a1
-; RV64XTHEADBB-NEXT: negw a1, a1
+; RV64XTHEADBB-NEXT: neg a1, a1
; RV64XTHEADBB-NEXT: srlw a0, a0, a1
; RV64XTHEADBB-NEXT: or a0, a2, a0
; RV64XTHEADBB-NEXT: ret
@@ -632,7 +632,7 @@ define i32 @rotr_32_mask(i32 %x, i32 %y) nounwind {
;
; RV64I-LABEL: rotr_32_mask:
; RV64I: # %bb.0:
-; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: neg a2, a1
; RV64I-NEXT: srlw a1, a0, a1
; RV64I-NEXT: sllw a0, a0, a2
; RV64I-NEXT: or a0, a1, a0
@@ -648,7 +648,7 @@ define i32 @rotr_32_mask(i32 %x, i32 %y) nounwind {
;
; RV64ZBB-LABEL: rotr_32_mask:
; RV64ZBB: # %bb.0:
-; RV64ZBB-NEXT: negw a2, a1
+; RV64ZBB-NEXT: neg a2, a1
; RV64ZBB-NEXT: srlw a1, a0, a1
; RV64ZBB-NEXT: sllw a0, a0, a2
; RV64ZBB-NEXT: or a0, a1, a0
@@ -664,7 +664,7 @@ define i32 @rotr_32_mask(i32 %x, i32 %y) nounwind {
;
; RV64XTHEADBB-LABEL: rotr_32_mask:
; RV64XTHEADBB: # %bb.0:
-; RV64XTHEADBB-NEXT: negw a2, a1
+; RV64XTHEADBB-NEXT: neg a2, a1
; RV64XTHEADBB-NEXT: srlw a1, a0, a1
; RV64XTHEADBB-NEXT: sllw a0, a0, a2
; RV64XTHEADBB-NEXT: or a0, a1, a0
@@ -689,7 +689,7 @@ define i32 @rotr_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
; RV64I-LABEL: rotr_32_mask_and_63_and_31:
; RV64I: # %bb.0:
; RV64I-NEXT: srlw a2, a0, a1
-; RV64I-NEXT: negw a1, a1
+; RV64I-NEXT: neg a1, a1
; RV64I-NEXT: sllw a0, a0, a1
; RV64I-NEXT: or a0, a2, a0
; RV64I-NEXT: ret
@@ -705,7 +705,7 @@ define i32 @rotr_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
; RV64ZBB-LABEL: rotr_32_mask_and_63_and_31:
; RV64ZBB: # %bb.0:
; RV64ZBB-NEXT: srlw a2, a0, a1
-; RV64ZBB-NEXT: negw a1, a1
+; RV64ZBB-NEXT: neg a1, a1
; RV64ZBB-NEXT: sllw a0, a0, a1
; RV64ZBB-NEXT: or a0, a2, a0
; RV64ZBB-NEXT: ret
@@ -721,7 +721,7 @@ define i32 @rotr_32_mask_and_63_and_31(i32 %x, i32 %y) nounwind {
; RV64XTHEADBB-LABEL: rotr_32_mask_and_63_and_31:
; RV64XTHEADBB: # %bb.0:
; RV64XTHEADBB-NEXT: srlw a2, a0, a1
-; RV64XTHEADBB-NEXT: negw a1, a1
+; RV64XTHEADBB-NEXT: neg a1, a1
; RV64XTHEADBB-NEXT: sllw a0, a0, a1
; RV64XTHEADBB-NEXT: or a0, a2, a0
; RV64XTHEADBB-NEXT: ret
@@ -829,7 +829,7 @@ define i64 @rotl_64_mask(i64 %x, i64 %y) nounwind {
;
; RV64I-LABEL: rotl_64_mask:
; RV64I: # %bb.0:
-; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: neg a2, a1
; RV64I-NEXT: sll a1, a0, a1
; RV64I-NEXT: srl a0, a0, a2
; RV64I-NEXT: or a0, a1, a0
@@ -884,7 +884,7 @@ define i64 @rotl_64_mask(i64 %x, i64 %y) nounwind {
;
; RV64ZBB-LABEL: rotl_64_mask:
; RV64ZBB: # %bb.0:
-; RV64ZBB-NEXT: negw a2, a1
+; RV64ZBB-NEXT: neg a2, a1
; RV64ZBB-NEXT: sll a1, a0, a1
; RV64ZBB-NEXT: srl a0, a0, a2
; RV64ZBB-NEXT: or a0, a1, a0
@@ -939,7 +939,7 @@ define i64 @rotl_64_mask(i64 %x, i64 %y) nounwind {
;
; RV64XTHEADBB-LABEL: rotl_64_mask:
; RV64XTHEADBB: # %bb.0:
-; RV64XTHEADBB-NEXT: negw a2, a1
+; RV64XTHEADBB-NEXT: neg a2, a1
; RV64XTHEADBB-NEXT: sll a1, a0, a1
; RV64XTHEADBB-NEXT: srl a0, a0, a2
; RV64XTHEADBB-NEXT: or a0, a1, a0
@@ -1005,7 +1005,7 @@ define i64 @rotl_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
; RV64I-LABEL: rotl_64_mask_and_127_and_63:
; RV64I: # %bb.0:
; RV64I-NEXT: sll a2, a0, a1
-; RV64I-NEXT: negw a1, a1
+; RV64I-NEXT: neg a1, a1
; RV64I-NEXT: srl a0, a0, a1
; RV64I-NEXT: or a0, a2, a0
; RV64I-NEXT: ret
@@ -1062,7 +1062,7 @@ define i64 @rotl_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
; RV64ZBB-LABEL: rotl_64_mask_and_127_and_63:
; RV64ZBB: # %bb.0:
; RV64ZBB-NEXT: sll a2, a0, a1
-; RV64ZBB-NEXT: negw a1, a1
+; RV64ZBB-NEXT: neg a1, a1
; RV64ZBB-NEXT: srl a0, a0, a1
; RV64ZBB-NEXT: or a0, a2, a0
; RV64ZBB-NEXT: ret
@@ -1119,7 +1119,7 @@ define i64 @rotl_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
; RV64XTHEADBB-LABEL: rotl_64_mask_and_127_and_63:
; RV64XTHEADBB: # %bb.0:
; RV64XTHEADBB-NEXT: sll a2, a0, a1
-; RV64XTHEADBB-NEXT: negw a1, a1
+; RV64XTHEADBB-NEXT: neg a1, a1
; RV64XTHEADBB-NEXT: srl a0, a0, a1
; RV64XTHEADBB-NEXT: or a0, a2, a0
; RV64XTHEADBB-NEXT: ret
@@ -1277,7 +1277,7 @@ define i64 @rotr_64_mask(i64 %x, i64 %y) nounwind {
;
; RV64I-LABEL: rotr_64_mask:
; RV64I: # %bb.0:
-; RV64I-NEXT: negw a2, a1
+; RV64I-NEXT: neg a2, a1
; RV64I-NEXT: srl a1, a0, a1
; RV64I-NEXT: sll a0, a0, a2
; RV64I-NEXT: or a0, a1, a0
@@ -1331,7 +1331,7 @@ define i64 @rotr_64_mask(i64 %x, i64 %y) nounwind {
;
; RV64ZBB-LABEL: rotr_64_mask:
; RV64ZBB: # %bb.0:
-; RV64ZBB-NEXT: negw a2, a1
+; RV64ZBB-NEXT: neg a2, a1
; RV64ZBB-NEXT: srl a1, a0, a1
; RV64ZBB-NEXT: sll a0, a0, a2
; RV64ZBB-NEXT: or a0, a1, a0
@@ -1385,7 +1385,7 @@ define i64 @rotr_64_mask(i64 %x, i64 %y) nounwind {
;
; RV64XTHEADBB-LABEL: rotr_64_mask:
; RV64XTHEADBB: # %bb.0:
-; RV64XTHEADBB-NEXT: negw a2, a1
+; RV64XTHEADBB-NEXT: neg a2, a1
; RV64XTHEADBB-NEXT: srl a1, a0, a1
; RV64XTHEADBB-NEXT: sll a0, a0, a2
; RV64XTHEADBB-NEXT: or a0, a1, a0
@@ -1451,7 +1451,7 @@ define i64 @rotr_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
; RV64I-LABEL: rotr_64_mask_and_127_and_63:
; RV64I: # %bb.0:
; RV64I-NEXT: srl a2, a0, a1
-; RV64I-NEXT: negw a1, a1
+; RV64I-NEXT: neg a1, a1
; RV64I-NEXT: sll a0, a0, a1
; RV64I-NEXT: or a0, a2, a0
; RV64I-NEXT: ret
@@ -1508,7 +1508,7 @@ define i64 @rotr_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
; RV64ZBB-LABEL: rotr_64_mask_and_127_and_63:
; RV64ZBB: # %bb.0:
; RV64ZBB-NEXT: srl a2, a0, a1
-; RV64ZBB-NEXT: negw a1, a1
+; RV64ZBB-NEXT: neg a1, a1
; RV64ZBB-NEXT: sll a0, a0, a1
; RV64ZBB-NEXT: or a0, a2, a0
; RV64ZBB-NEXT: ret
@@ -1565,7 +1565,7 @@ define i64 @rotr_64_mask_and_127_and_63(i64 %x, i64 %y) nounwind {
; RV64XTHEADBB-LABEL: rotr_64_mask_and_127_and_63:
; RV64XTHEADBB: # %bb.0:
; RV64XTHEADBB-NEXT: srl a2, a0, a1
-; RV64XTHEADBB-NEXT: negw a1, a1
+; RV64XTHEADBB-NEXT: neg a1, a1
; RV64XTHEADBB-NEXT: sll a0, a0, a1
; RV64XTHEADBB-NEXT: or a0, a2, a0
; RV64XTHEADBB-NEXT: ret
@@ -1701,7 +1701,7 @@ define signext i32 @rotl_32_mask_shared(i32 signext %a, i32 signext %b, i32 sign
; RV64I: # %bb.0:
; RV64I-NEXT: andi a3, a2, 31
; RV64I-NEXT: sllw a4, a0, a2
-; RV64I-NEXT: negw a3, a3
+; RV64I-NEXT: neg a3, a3
; RV64I-NEXT: srlw a0, a0, a3
; RV64I-NEXT: or a0, a4, a0
; RV64I-NEXT: sllw a1, a1, a2
@@ -1737,7 +1737,7 @@ define signext i32 @rotl_32_mask_shared(i32 signext %a, i32 signext %b, i32 sign
; RV64XTHEADBB: # %bb.0:
; RV64XTHEADBB-NEXT: andi a3, a2, 31
; RV64XTHEADBB-NEXT: sllw a4, a0, a2
-; RV64XTHEADBB-NEXT: negw a3, a3
+; RV64XTHEADBB-NEXT: neg a3, a3
; RV64XTHEADBB-NEXT: srlw a0, a0, a3
; RV64XTHEADBB-NEXT: or a0, a4, a0
; RV64XTHEADBB-NEXT: sllw a1, a1, a2
@@ -1822,7 +1822,7 @@ define signext i64 @rotl_64_mask_shared(i64 signext %a, i64 signext %b, i64 sign
; RV64I: # %bb.0:
; RV64I-NEXT: andi a3, a2, 63
; RV64I-NEXT: sll a4, a0, a2
-; RV64I-NEXT: negw a3, a3
+; RV64I-NEXT: neg a3, a3
; RV64I-NEXT: srl a0, a0, a3
; RV64I-NEXT: or a0, a4, a0
; RV64I-NEXT: sll a1, a1, a2
@@ -1972,7 +1972,7 @@ define signext i64 @rotl_64_mask_shared(i64 signext %a, i64 signext %b, i64 sign
; RV64XTHEADBB: # %bb.0:
; RV64XTHEADBB-NEXT: andi a3, a2, 63
; RV64XTHEADBB-NEXT: sll a4, a0, a2
-; RV64XTHEADBB-NEXT: negw a3, a3
+; RV64XTHEADBB-NEXT: neg a3, a3
; RV64XTHEADBB-NEXT: srl a0, a0, a3
; RV64XTHEADBB-NEXT: or a0, a4, a0
; RV64XTHEADBB-NEXT: sll a1, a1, a2
@@ -2002,7 +2002,7 @@ define signext i32 @rotr_32_mask_shared(i32 signext %a, i32 signext %b, i32 sign
; RV64I: # %bb.0:
; RV64I-NEXT: andi a3, a2, 31
; RV64I-NEXT: srlw a4, a0, a2
-; RV64I-NEXT: negw a3, a3
+; RV64I-NEXT: neg a3, a3
; RV64I-NEXT: sllw a0, a0, a3
; RV64I-NEXT: or a0, a4, a0
; RV64I-NEXT: sllw a1, a1, a2
@@ -2038,7 +2038,7 @@ define signext i32 @rotr_32_mask_shared(i32 signext %a, i32 signext %b, i32 sign
; RV64XTHEADBB: # %bb.0:
; RV64XTHEADBB-NEXT: andi a3, a2, 31
; RV64XTHEADBB-NEXT: srlw a4, a0, a2
-; RV64XTHEADBB-NEXT: negw a3, a3
+; RV64XTHEADBB-NEXT: neg a3, a3
; RV64XTHEADBB-NEXT: sllw a0, a0, a3
; RV64XTHEADBB-NEXT: or a0, a4, a0
; RV64XTHEADBB-NEXT: sllw a1, a1, a2
@@ -2125,7 +2125,7 @@ define signext i64 @rotr_64_mask_shared(i64 signext %a, i64 signext %b, i64 sign
; RV64I: # %bb.0:
; RV64I-NEXT: andi a3, a2, 63
; RV64I-NEXT: srl a4, a0, a2
-; RV64I-NEXT: negw a3, a3
+; RV64I-NEXT: neg a3, a3
; RV64I-NEXT: sll a0, a0, a3
; RV64I-NEXT: or a0, a4, a0
; RV64I-NEXT: sll a1, a1, a2
@@ -2279,7 +2279,7 @@ define signext i64 @rotr_64_mask_shared(i64 signext %a, i64 signext %b, i64 sign
; RV64XTHEADBB: # %bb.0:
; RV64XTHEADBB-NEXT: andi a3, a2, 63
; RV64XTHEADBB-NEXT: srl a4, a0, a2
-; RV64XTHEADBB-NEXT: negw a3, a3
+; RV64XTHEADBB-NEXT: neg a3, a3
; RV64XTHEADBB-NEXT: sll a0, a0, a3
; RV64XTHEADBB-NEXT: or a0, a4, a0
; RV64XTHEADBB-NEXT: sll a1, a1, a2
@@ -2312,8 +2312,8 @@ define signext i32 @rotl_32_mask_multiple(i32 signext %a, i32 signext %b, i32 si
; RV64I-NEXT: andi a3, a2, 31
; RV64I-NEXT: sllw a4, a0, a2
; RV64I-NEXT: sllw a2, a1, a2
-; RV64I-NEXT: negw a5, a3
-; RV64I-NEXT: negw a3, a3
+; RV64I-NEXT: neg a5, a3
+; RV64I-NEXT: neg a3, a3
; RV64I-NEXT: srlw a0, a0, a5
; RV64I-NEXT: srlw a1, a1, a3
; RV64I-NEXT: or a0, a4, a0
@@ -2353,8 +2353,8 @@ define signext i32 @rotl_32_mask_multiple(i32 signext %a, i32 signext %b, i32 si
; RV64XTHEADBB-NEXT: andi a3, a2, 31
; RV64XTHEADBB-NEXT: sllw a4, a0, a2
; RV64XTHEADBB-NEXT: sllw a2, a1, a2
-; RV64XTHEADBB-NEXT: negw a5, a3
-; RV64XTHEADBB-NEXT: negw a3, a3
+; RV64XTHEADBB-NEXT: neg a5, a3
+; RV64XTHEADBB-NEXT: neg a3, a3
; RV64XTHEADBB-NEXT: srlw a0, a0, a5
; RV64XTHEADBB-NEXT: srlw a1, a1, a3
; RV64XTHEADBB-NEXT: or a0, a4, a0
@@ -2464,7 +2464,7 @@ define i64 @rotl_64_mask_multiple(i64 %a, i64 %b, i64 %amt) nounwind {
; RV64I-NEXT: andi a3, a2, 63
; RV64I-NEXT: sll a4, a0, a2
; RV64I-NEXT: sll a2, a1, a2
-; RV64I-NEXT: negw a3, a3
+; RV64I-NEXT: neg a3, a3
; RV64I-NEXT: srl a0, a0, a3
; RV64I-NEXT: srl a1, a1, a3
; RV64I-NEXT: or a0, a4, a0
@@ -2664,7 +2664,7 @@ define i64 @rotl_64_mask_multiple(i64 %a, i64 %b, i64 %amt) nounwind {
; RV64XTHEADBB-NEXT: andi a3, a2, 63
; RV64XTHEADBB-NEXT: sll a4, a0, a2
; RV64XTHEADBB-NEXT: sll a2, a1, a2
-; RV64XTHEADBB-NEXT: negw a3, a3
+; RV64XTHEADBB-NEXT: neg a3, a3
; RV64XTHEADBB-NEXT: srl a0, a0, a3
; RV64XTHEADBB-NEXT: srl a1, a1, a3
; RV64XTHEADBB-NEXT: or a0, a4, a0
@@ -2697,8 +2697,8 @@ define signext i32 @rotr_32_mask_multiple(i32 signext %a, i32 signext %b, i32 si
; RV64I-NEXT: andi a3, a2, 31
; RV64I-NEXT: srlw a4, a0, a2
; RV64I-NEXT: srlw a2, a1, a2
-; RV64I-NEXT: negw a5, a3
-; RV64I-NEXT: negw a3, a3
+; RV64I-NEXT: neg a5, a3
+; RV64I-NEXT: neg a3, a3
; RV64I-NEXT: sllw a0, a0, a5
; RV64I-NEXT: sllw a1, a1, a3
; RV64I-NEXT: or a0, a4, a0
@@ -2738,8 +2738,8 @@ define signext i32 @rotr_32_mask_multiple(i32 signext %a, i32 signext %b, i32 si
; RV64XTHEADBB-NEXT: andi a3, a2, 31
; RV64XTHEADBB-NEXT: srlw a4, a0, a2
; RV64XTHEADBB-NEXT: srlw a2, a1, a2
-; RV64XTHEADBB-NEXT: negw a5, a3
-; RV64XTHEADBB-NEXT: negw a3, a3
+; RV64XTHEADBB-NEXT: neg a5, a3
+; RV64XTHEADBB-NEXT: neg a3, a3
; RV64XTHEADBB-NEXT: sllw a0, a0, a5
; RV64XTHEADBB-NEXT: sllw a1, a1, a3
; RV64XTHEADBB-NEXT: or a0, a4, a0
@@ -2850,7 +2850,7 @@ define i64 @rotr_64_mask_multiple(i64 %a, i64 %b, i64 %amt) nounwind {
; RV64I-NEXT: andi a3, a2, 63
; RV64I-NEXT: srl a4, a0, a2
; RV64I-NEXT: srl a2, a1, a2
-; RV64I-NEXT: negw a3, a3
+; RV64I-NEXT: neg a3, a3
; RV64I-NEXT: sll a0, a0, a3
; RV64I-NEXT: sll a1, a1, a3
; RV64I-NEXT: or a0, a4, a0
@@ -3052,7 +3052,7 @@ define i64 @rotr_64_mask_multiple(i64 %a, i64 %b, i64 %amt) nounwind {
; RV64XTHEADBB-NEXT: andi a3, a2, 63
; RV64XTHEADBB-NEXT: srl a4, a0, a2
; RV64XTHEADBB-NEXT: srl ...
[truncated]
LGTM
…trs extend debug printing
RISCVOptWInstrs has a NumTransformedToWInstrs statistic, but didn't have one for the W=>Non-W transform done by stripWSuffixes. It also didn't do debug printing of the transformation. This patch addresses both issues. Reviewed as part of <#149071>, but landing separately.