[RISCV] Implement CodeGen Support for XCValu Extension in CV32E40P #78138
Conversation
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-backend-risc-v

Author: None (realqhc)

Changes

… in CV32E40P

Implement XCValu intrinsics and CodeGen for CV32E40P according to the specification.

This commit is part of a patch-set to upstream the vendor specific extensions of CV32E40P that need LLVM intrinsics to implement Clang builtins.

Contributors: @CharKeaney, @ChunyuLiao, @jeremybennett, @lewis-revill, @NandniJamnadas, @PaoloS02, @serkm, @simonpcook, @xingmingjie.

Patch is 32.23 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78138.diff

5 Files Affected:
diff --git a/llvm/include/llvm/IR/IntrinsicsRISCVXCV.td b/llvm/include/llvm/IR/IntrinsicsRISCVXCV.td
index f1590ad66e362b..f0c6faadf6aabf 100644
--- a/llvm/include/llvm/IR/IntrinsicsRISCVXCV.td
+++ b/llvm/include/llvm/IR/IntrinsicsRISCVXCV.td
@@ -18,6 +18,18 @@ class ScalarCoreVBitManipGprIntrinsic
: DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty],
[IntrNoMem, IntrSpeculatable]>;
+class ScalarCoreVAluGprIntrinsic
+ : DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty],
+ [IntrNoMem, IntrSpeculatable]>;
+
+class ScalarCoreVAluGprGprIntrinsic
+ : DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty],
+ [IntrNoMem, IntrSpeculatable]>;
+
+class ScalarCoreVAluGprGprGprIntrinsic
+ : DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty, llvm_i32_ty],
+ [IntrNoMem, IntrSpeculatable]>;
+
let TargetPrefix = "riscv" in {
def int_riscv_cv_bitmanip_extract : ScalarCoreVBitManipGprGprIntrinsic;
def int_riscv_cv_bitmanip_extractu : ScalarCoreVBitManipGprGprIntrinsic;
@@ -34,4 +46,20 @@ let TargetPrefix = "riscv" in {
: DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty, llvm_i32_ty],
[IntrNoMem, IntrWillReturn, IntrSpeculatable,
ImmArg<ArgIndex<1>>, ImmArg<ArgIndex<2>>]>;
+
+ def int_riscv_cv_alu_exths : ScalarCoreVAluGprIntrinsic;
+ def int_riscv_cv_alu_exthz : ScalarCoreVAluGprIntrinsic;
+ def int_riscv_cv_alu_extbs : ScalarCoreVAluGprIntrinsic;
+ def int_riscv_cv_alu_extbz : ScalarCoreVAluGprIntrinsic;
+
+ def int_riscv_cv_alu_clip : ScalarCoreVAluGprGprIntrinsic;
+ def int_riscv_cv_alu_clipu : ScalarCoreVAluGprGprIntrinsic;
+ def int_riscv_cv_alu_addn : ScalarCoreVAluGprGprGprIntrinsic;
+ def int_riscv_cv_alu_addun : ScalarCoreVAluGprGprGprIntrinsic;
+ def int_riscv_cv_alu_addrn : ScalarCoreVAluGprGprGprIntrinsic;
+ def int_riscv_cv_alu_addurn : ScalarCoreVAluGprGprGprIntrinsic;
+ def int_riscv_cv_alu_subn : ScalarCoreVAluGprGprGprIntrinsic;
+ def int_riscv_cv_alu_subun : ScalarCoreVAluGprGprGprIntrinsic;
+ def int_riscv_cv_alu_subrn : ScalarCoreVAluGprGprGprIntrinsic;
+ def int_riscv_cv_alu_suburn : ScalarCoreVAluGprGprGprIntrinsic;
} // TargetPrefix = "riscv"
diff --git a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
index ed2b1ceb7d6f0d..aaac5ce834dd4b 100644
--- a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
@@ -53,6 +53,8 @@ class RISCVExpandPseudo : public MachineFunctionPass {
MachineBasicBlock::iterator MBBI);
bool expandRV32ZdinxLoad(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI);
+ bool expandCoreVClip(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
+ bool expandCoreVAddSub(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI);
#ifndef NDEBUG
unsigned getInstSizeInBytes(const MachineFunction &MF) const {
unsigned Size = 0;
@@ -161,6 +163,16 @@ bool RISCVExpandPseudo::expandMI(MachineBasicBlock &MBB,
case RISCV::PseudoVMSET_M_B64:
// vmset.m vd => vmxnor.mm vd, vd, vd
return expandVMSET_VMCLR(MBB, MBBI, RISCV::VMXNOR_MM);
+ case RISCV::CV_CLIP_PSEUDO:
+ case RISCV::CV_CLIPU_PSEUDO:return expandCoreVClip(MBB, MBBI);
+ case RISCV::CV_ADDN_PSEUDO:
+ case RISCV::CV_ADDUN_PSEUDO:
+ case RISCV::CV_ADDRN_PSEUDO:
+ case RISCV::CV_ADDURN_PSEUDO:
+ case RISCV::CV_SUBN_PSEUDO:
+ case RISCV::CV_SUBUN_PSEUDO:
+ case RISCV::CV_SUBRN_PSEUDO:
+ case RISCV::CV_SUBURN_PSEUDO:return expandCoreVAddSub(MBB, MBBI);
}
return false;
@@ -547,6 +559,77 @@ bool RISCVPreRAExpandPseudo::expandLoadTLSGDAddress(
RISCV::ADDI);
}
+bool RISCVExpandPseudo::expandCoreVClip(llvm::MachineBasicBlock &MBB,
+ MachineBasicBlock::iterator MBBI) {
+ DebugLoc DL = MBBI->getDebugLoc();
+ Register DstReg = MBBI->getOperand(0).getReg();
+ Register I = MBBI->getOperand(1).getReg();
+ uint64_t J = MBBI->getOperand(2).getImm();
+
+ unsigned Opcode = MBBI->getOpcode() == RISCV::CV_CLIPU_PSEUDO ?
+ RISCV::CV_CLIPU : RISCV::CV_CLIP;
+ const MCInstrDesc &Desc = TII->get(Opcode);
+ BuildMI(MBB, MBBI, DL, Desc, DstReg)
+ .addReg(I)
+ .addImm(Log2_32_Ceil(J + 1) + 1);
+ MBBI->eraseFromParent();
+ return true;
+}
+
+bool RISCVExpandPseudo::expandCoreVAddSub(llvm::MachineBasicBlock &MBB,
+ MachineBasicBlock::iterator MBBI) {
+ auto *MRI = &MBB.getParent()->getRegInfo();
+ DebugLoc DL = MBBI->getDebugLoc();
+ Register DstReg = MBBI->getOperand(0).getReg();
+ Register X = MBBI->getOperand(1).getReg();
+ Register Y = MBBI->getOperand(2).getReg();
+ uint8_t Shift = MBBI->getOperand(3).getImm();
+
+ bool IsImm = 0 <= Shift && Shift <= 31;
+ unsigned Opcode;
+ switch (MBBI->getOpcode()) {
+ case RISCV::CV_ADDN_PSEUDO:
+ Opcode = IsImm ? RISCV::CV_ADDN : RISCV::CV_ADDNR;
+ break;
+ case RISCV::CV_ADDUN_PSEUDO:
+ Opcode = IsImm ? RISCV::CV_ADDUN : RISCV::CV_ADDUNR;
+ break;
+ case RISCV::CV_ADDRN_PSEUDO:
+ Opcode = IsImm ? RISCV::CV_ADDRN : RISCV::CV_ADDRNR;
+ break;
+ case RISCV::CV_ADDURN_PSEUDO:
+ Opcode = IsImm ? RISCV::CV_ADDURN : RISCV::CV_ADDURNR;
+ break;
+ case RISCV::CV_SUBN_PSEUDO:
+ Opcode = IsImm ? RISCV::CV_SUBN : RISCV::CV_SUBNR;
+ break;
+ case RISCV::CV_SUBUN_PSEUDO:
+ Opcode = IsImm ? RISCV::CV_SUBUN : RISCV::CV_SUBUNR;
+ break;
+ case RISCV::CV_SUBRN_PSEUDO:
+ Opcode = IsImm ? RISCV::CV_SUBRN : RISCV::CV_SUBRNR;
+ break;
+ case RISCV::CV_SUBURN_PSEUDO:
+ Opcode = IsImm ? RISCV::CV_SUBURN : RISCV::CV_SUBURNR;
+ break;
+ default:llvm_unreachable("unknown instruction");
+ }
+ const MCInstrDesc &Desc = TII->get(Opcode);
+ if (IsImm) {
+ BuildMI(MBB, MBBI, DL, Desc, DstReg).
+ addReg(X).
+ addReg(Y).
+ addImm(Shift);
+ } else {
+ MRI->replaceRegWith(DstReg, X);
+ BuildMI(MBB, MBBI, DL, Desc, DstReg).
+ addReg(Y).
+ addReg(DstReg);
+ }
+ MBBI->eraseFromParent();
+ return true;
+}
+
} // end of anonymous namespace
INITIALIZE_PASS(RISCVExpandPseudo, "riscv-expand-pseudo",
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index cb9ffabc41236e..0979b40af768ed 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -250,10 +250,12 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
if (RV64LegalI32 && Subtarget.is64Bit())
setOperationAction(ISD::SELECT_CC, MVT::i32, Expand);
- setCondCodeAction(ISD::SETLE, XLenVT, Expand);
+ if (!Subtarget.hasVendorXCValu())
+ setCondCodeAction(ISD::SETLE, XLenVT, Expand);
setCondCodeAction(ISD::SETGT, XLenVT, Custom);
setCondCodeAction(ISD::SETGE, XLenVT, Expand);
- setCondCodeAction(ISD::SETULE, XLenVT, Expand);
+ if (!Subtarget.hasVendorXCValu())
+ setCondCodeAction(ISD::SETULE, XLenVT, Expand);
setCondCodeAction(ISD::SETUGT, XLenVT, Custom);
setCondCodeAction(ISD::SETUGE, XLenVT, Expand);
@@ -1366,6 +1368,16 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
}
}
+ if (Subtarget.hasVendorXCValu()) {
+ setOperationAction(ISD::ABS, XLenVT, Legal);
+ setOperationAction(ISD::SMIN, XLenVT, Legal);
+ setOperationAction(ISD::UMIN, XLenVT, Legal);
+ setOperationAction(ISD::SMAX, XLenVT, Legal);
+ setOperationAction(ISD::UMAX, XLenVT, Legal);
+ setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i8, Legal);
+ setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16, Legal);
+ }
+
// Function alignments.
const Align FunctionAlignment(Subtarget.hasStdExtCOrZca() ? 2 : 4);
setMinFunctionAlignment(FunctionAlignment);
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td b/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td
index 924e91e15c348f..e0aeaf8c5c5f7c 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoXCV.td
@@ -198,7 +198,7 @@ let DecoderNamespace = "XCValu" in {
} // DecoderNamespace = "XCValu"
-let Predicates = [HasVendorXCValu],
+let Predicates = [HasVendorXCValu, IsRV32],
hasSideEffects = 0, mayLoad = 0, mayStore = 0 in {
// General ALU Operations
def CV_ABS : CVInstAluR<0b0101000, 0b011, "cv.abs">,
@@ -249,10 +249,10 @@ let Predicates = [HasVendorXCValu],
Sched<[]>;
def CV_SUBURN : CVInstAluRRI<0b11, 0b011, "cv.suburn">,
Sched<[]>;
-} // Predicates = [HasVendorXCValu],
+} // Predicates = [HasVendorXCValu, IsRV32],
// hasSideEffects = 0, mayLoad = 0, mayStore = 0
-let Predicates = [HasVendorXCValu],
+let Predicates = [HasVendorXCValu, IsRV32],
hasSideEffects = 0, mayLoad = 0, mayStore = 0,
Constraints = "$rd = $rd_wb" in {
def CV_ADDNR : CVInstAluRRNR<0b1000000, 0b011, "cv.addnr">,
@@ -272,7 +272,7 @@ let Predicates = [HasVendorXCValu],
def CV_SUBURNR : CVInstAluRRNR<0b1000111, 0b011, "cv.suburnr">,
Sched<[]>;
-} // Predicates = [HasVendorXCValu],
+} // Predicates = [HasVendorXCValu, IsRV32],
// hasSideEffects = 0, mayLoad = 0, mayStore = 0,
// Constraints = "$rd = $rd_wb"
@@ -662,6 +662,8 @@ let Predicates = [HasVendorXCVelw, IsRV32], hasSideEffects = 0,
def cv_tuimm2 : TImmLeaf<XLenVT, [{return isUInt<2>(Imm);}]>;
def cv_tuimm5 : TImmLeaf<XLenVT, [{return isUInt<5>(Imm);}]>;
def cv_uimm10 : ImmLeaf<XLenVT, [{return isUInt<10>(Imm);}]>;
+def cv_uimm32: Operand<XLenVT>,
+ ImmLeaf<XLenVT, [{return isPowerOf2_32(Imm + 1);}]>;
def CV_LO5: SDNodeXForm<imm, [{
return CurDAG->getTargetConstant(N->getZExtValue() & 0x1f, SDLoc(N),
@@ -673,6 +675,49 @@ def CV_HI5: SDNodeXForm<imm, [{
N->getValueType(0));
}]>;
+def between : PatFrags<(ops node:$lowerBound, node:$upperBound, node:$value),
+ [(smin (smax node:$value, node:$lowerBound), node:$upperBound),
+ (smax (smin node:$value, node:$upperBound), node:$lowerBound)]>;
+
+def betweenu : PatFrags<(ops node:$upperBound, node:$value),
+ [(smin (smax node:$value, 0), node:$upperBound),
+ (smax (smin node:$value, node:$upperBound), 0)]>;
+def powerOf2 : ImmLeaf<XLenVT, [{ return isPowerOf2_32(Imm); }]>;
+def powerOf2Minus1 : ImmLeaf<XLenVT, [{ return isPowerOf2_32(Imm+1); }]>;
+def negativePowerOf2 : ImmLeaf<XLenVT, [{ return isPowerOf2_32(-Imm); }]>;
+def roundBit : PatFrag<(ops node:$shiftAmount),
+ (srl (shl 1, node:$shiftAmount), (XLenVT 1))>;
+def trailing1sPlus1 : SDNodeXForm<imm, [{
+ return CurDAG->getTargetConstant(
+ llvm::countr_one(N->getZExtValue()) + 1,
+ SDLoc(N), N->getValueType(0));
+}]>;
+
+def shiftRound : PatFrag<(ops node:$value, node:$shiftAmount),
+ (sra (add node:$value, powerOf2), node:$shiftAmount), [{
+ if (auto powerOf2 = dyn_cast<ConstantSDNode>(N->getOperand(0)->getOperand(1)))
+ return (powerOf2->getZExtValue() << 1) == (1U << N->getConstantOperandVal(1));
+ return false;
+}]>;
+
+def ushiftRound : PatFrag<(ops node:$value, node:$shiftAmount),
+ (srl (add node:$value, powerOf2), node:$shiftAmount), [{
+ if (auto powerOf2 = dyn_cast<ConstantSDNode>(N->getOperand(0)->getOperand(1)))
+ return (powerOf2->getZExtValue() << 1) == (1U << N->getConstantOperandVal(1));
+ return false;
+}]>;
+
+def clip : PatFrag<(ops node:$upperBound, node:$value),
+ (between negativePowerOf2, node:$upperBound, node:$value), [{
+ // Checking lower & upper bound for the clip instruction
+ if (auto bound1 = dyn_cast<ConstantSDNode>(N->getOperand(0)->getOperand(1))) {
+ if (auto bound2 = dyn_cast<ConstantSDNode>(N->getOperand(1))) {
+ return (bound1->getSExtValue() == ~bound2->getSExtValue());
+ }
+ }
+ return false;
+}]>;
+
multiclass PatCoreVBitManip<Intrinsic intr> {
def : PatGprGpr<intr, !cast<RVInst>("CV_" # NAME # "R")>;
def : Pat<(intr GPR:$rs1, cv_uimm10:$imm),
@@ -704,3 +749,112 @@ let Predicates = [HasVendorXCVbitmanip, IsRV32] in {
(CV_BITREV GPR:$rs1, cv_tuimm2:$radix, cv_tuimm5:$pts)>;
def : Pat<(bitreverse (XLenVT GPR:$rs)), (CV_BITREV GPR:$rs, 0, 0)>;
}
+
+class PatCoreVAluGpr <string intr, string asm> :
+ PatGpr<!cast<Intrinsic>("int_riscv_cv_alu_" # intr),
+ !cast<RVInst>("CV_" # asm)>;
+class PatCoreVAluGprGpr <string intr, string asm> :
+ PatGprGpr<!cast<Intrinsic>("int_riscv_cv_alu_" # intr),
+ !cast<RVInst>("CV_" # asm)>;
+
+multiclass PatCoreVAluGprImm <Intrinsic intr> {
+ def "CV_" # NAME # "_PSEUDO" :
+ Pseudo<(outs GPR:$rd), (ins GPR:$rs, cv_uimm32:$imm), []>;
+ def : PatGprGpr<intr, !cast<RVInst>("CV_" # NAME # "R")>;
+ def : PatGprImm<intr, !cast<RVInst>("CV_" # NAME # "_PSEUDO"), cv_uimm32>;
+}
+
+multiclass PatCoreVAluGprGprImm <Intrinsic intr> {
+ def "CV_" # NAME # "_PSEUDO" :
+ Pseudo<(outs GPR:$rd), (ins GPR:$rs1, GPR:$rs2, uimm5:$imm), []>;
+ def : Pat<(intr GPR:$rs1, GPR:$rs2, GPR:$rs3),
+ (!cast<RVInst>("CV_" # NAME # "R") GPR:$rs1, GPR:$rs2, GPR:$rs3)>;
+ def : Pat<(intr GPR:$rs1, GPR:$rs2, uimm5:$imm),
+ (!cast<RVInst>("CV_" # NAME # "_PSEUDO") GPR:$rs1, GPR:$rs2,
+ uimm5:$imm)>;
+}
+
+let Predicates = [HasVendorXCValu, IsRV32], AddedComplexity = 1 in {
+ def : PatGpr<abs, CV_ABS>;
+ def : PatGprGpr<setle, CV_SLET>;
+ def : PatGprGpr<setule, CV_SLETU>;
+ def : PatGprGpr<smin, CV_MIN>;
+ def : PatGprGpr<umin, CV_MINU>;
+ def : PatGprGpr<smax, CV_MAX>;
+ def : PatGprGpr<umax, CV_MAXU>;
+
+ def : Pat<(sext_inreg (XLenVT GPR:$rs1), i16), (CV_EXTHS GPR:$rs1)>;
+ def : Pat<(sext_inreg (XLenVT GPR:$rs1), i8), (CV_EXTBS GPR:$rs1)>;
+
+ def : Pat<(and (XLenVT GPR:$rs1), 0xffff), (CV_EXTHZ GPR:$rs1)>;
+ def : Pat<(and (XLenVT GPR:$rs1), 0xff), (CV_EXTBZ GPR:$rs1)>;
+
+ def : Pat<(clip powerOf2Minus1:$upperBound, (XLenVT GPR:$rs1)),
+ (CV_CLIP GPR:$rs1, (trailing1sPlus1 imm:$upperBound))>;
+ def : Pat<(between (not GPR:$rs2), GPR:$rs2, (XLenVT GPR:$rs1)),
+ (CV_CLIPR GPR:$rs1, GPR:$rs2)>;
+ def : Pat<(betweenu powerOf2Minus1:$upperBound, (XLenVT GPR:$rs1)),
+ (CV_CLIPU GPR:$rs1, (trailing1sPlus1 imm:$upperBound))>;
+ def : Pat<(betweenu GPR:$rs2, (XLenVT GPR:$rs1)),
+ (CV_CLIPUR GPR:$rs1, GPR:$rs2)>;
+
+ def : Pat<(sra (add (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), uimm5:$imm5),
+ (CV_ADDN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+ def : Pat<(srl (add (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), uimm5:$imm5),
+ (CV_ADDUN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+ def : Pat<(shiftRound (add (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)),
+ uimm5:$imm5),
+ (CV_ADDRN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+ def : Pat<(ushiftRound (add (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)),
+ uimm5:$imm5),
+ (CV_ADDURN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+
+ def : Pat<(sra (sub (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), uimm5:$imm5),
+ (CV_SUBN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+ def : Pat<(srl (sub (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)), uimm5:$imm5),
+ (CV_SUBUN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+ def : Pat<(shiftRound (sub (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)),
+ uimm5:$imm5),
+ (CV_SUBRN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+ def : Pat<(ushiftRound (sub (XLenVT GPR:$rs1), (XLenVT GPR:$rs2)),
+ uimm5:$imm5),
+ (CV_SUBURN GPR:$rs1, GPR:$rs2, uimm5:$imm5)>;
+
+ def : Pat<(sra (add (XLenVT GPR:$rd), (XLenVT GPR:$rs1)), (XLenVT GPR:$rs2)),
+ (CV_ADDNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+ def : Pat<(srl (add (XLenVT GPR:$rd), (XLenVT GPR:$rs1)), (XLenVT GPR:$rs2)),
+ (CV_ADDUNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+ def : Pat<(sra (add (add (XLenVT GPR:$rd), (XLenVT GPR:$rs1)),
+ (roundBit (XLenVT GPR:$rs2))), (XLenVT GPR:$rs2)),
+ (CV_ADDRNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+ def : Pat<(srl (add (add (XLenVT GPR:$rd), (XLenVT GPR:$rs1)),
+ (roundBit (XLenVT GPR:$rs2))), (XLenVT GPR:$rs2)),
+ (CV_ADDURNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+
+ def : Pat<(sra (sub (XLenVT GPR:$rd), (XLenVT GPR:$rs1)), (XLenVT GPR:$rs2)),
+ (CV_SUBNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+ def : Pat<(srl (sub (XLenVT GPR:$rd), (XLenVT GPR:$rs1)), (XLenVT GPR:$rs2)),
+ (CV_SUBUNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+ def : Pat<(sra (add (sub (XLenVT GPR:$rd), (XLenVT GPR:$rs1)),
+ (roundBit (XLenVT GPR:$rs2))), (XLenVT GPR:$rs2)),
+ (CV_SUBRNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+ def : Pat<(srl (add (sub (XLenVT GPR:$rd), (XLenVT GPR:$rs1)),
+ (roundBit (XLenVT GPR:$rs2))), (XLenVT GPR:$rs2)),
+ (CV_SUBURNR GPR:$rd, GPR:$rs1, GPR:$rs2)>;
+
+ def : PatCoreVAluGpr<"exths", "EXTHS">;
+ def : PatCoreVAluGpr<"exthz", "EXTHZ">;
+ def : PatCoreVAluGpr<"extbs", "EXTBS">;
+ def : PatCoreVAluGpr<"extbz", "EXTBZ">;
+
+ defm CLIP : PatCoreVAluGprImm<int_riscv_cv_alu_clip>;
+ defm CLIPU : PatCoreVAluGprImm<int_riscv_cv_alu_clipu>;
+ defm ADDN : PatCoreVAluGprGprImm<int_riscv_cv_alu_addn>;
+ defm ADDUN : PatCoreVAluGprGprImm<int_riscv_cv_alu_addun>;
+ defm ADDRN : PatCoreVAluGprGprImm<int_riscv_cv_alu_addrn>;
+ defm ADDURN : PatCoreVAluGprGprImm<int_riscv_cv_alu_addurn>;
+ defm SUBN : PatCoreVAluGprGprImm<int_riscv_cv_alu_subn>;
+ defm SUBUN : PatCoreVAluGprGprImm<int_riscv_cv_alu_subun>;
+ defm SUBRN : PatCoreVAluGprGprImm<int_riscv_cv_alu_subrn>;
+ defm SUBURN : PatCoreVAluGprGprImm<int_riscv_cv_alu_suburn>;
+} // Predicates = [HasVendorXCValu, IsRV32]
diff --git a/llvm/test/CodeGen/RISCV/xcvalu.ll b/llvm/test/CodeGen/RISCV/xcvalu.ll
new file mode 100644
index 00000000000000..3b83b32a672c09
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/xcvalu.ll
@@ -0,0 +1,583 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -O0 -mtriple=riscv32 -mattr=+m -mattr=+xcvalu -verify-machineinstrs < %s \
+; RUN: | FileCheck %s
+
+declare i32 @llvm.abs.i32(i32, i1)
+declare i32 @llvm.smin.i32(i32, i32)
+declare i32 @llvm.smax.i32(i32, i32)
+declare i32 @llvm.umin.i32(i32, i32)
+declare i32 @llvm.umax.i32(i32, i32)
+
+define i32 @abs(i32 %a) {
+; CHECK-LABEL: abs:
+; CHECK: # %bb.0:
+; CHECK-NEXT: cv.abs a0, a0
+; CHECK-NEXT: ret
+ %1 = call i32 @llvm.abs.i32(i32 %a, i1 false)
+ ret i32 %1
+}
+
+define i1 @slet(i32 %a, i32 %b) {
+; CHECK-LABEL: slet:
+; CHECK: # %bb.0:
+; CHECK-NEXT: cv.slet a0, a0, a1
+; CHECK-NEXT: ret
+ %1 = icmp sle i32 %a, %b
+ ret i1 %1
+}
+
+define i1 @sletu(i32 %a, i32 %b) {
+; CHECK-LABEL: sletu:
+; CHECK: # %bb.0:
+; CHECK-NEXT: cv.sletu a0, a0, a1
+; CHECK-NEXT: ret
+ %1 = icmp ule i32 %a, %b
+ ret i1 %1
+}
+
+define i32 @smin(i32 %a, i32 %b) {
+; CHECK-LABEL: smin:
+; CHECK: # %bb.0:
+; CHECK-NEXT: cv.min a0, a0, a1
+; CHECK-NEXT: ret
+ %1 = call i32 @llvm.smin.i32(i32 %a, i32 %b)
+ ret i32 %1
+}
+
+define i32 @umin(i32 %a, i32 %b) {
+; CHECK-LABEL: umin:
+; CHECK: # %bb.0:
+; CHECK-NEXT: cv.minu a0, a0, a1
+; CHECK-NEXT: ret
+ %1 = call i32 @llvm.umin.i32(i32 %a, i32 %b)
+ ret i32 %1
+}
+
+define i32 @smax(i32 %a, i32 %b) {
+; CHECK-LABEL: smax:
+; CHECK: # %bb.0:
+; CHECK-NEXT: cv.max a0, a0, a1
+; CHECK-NEXT: ret
+ %1 = call i32 @llvm.smax.i32(i32 %a, i32 %b)
+ ret i32 %1
+}
+
+define i32 @umax(i32 %a, i32 %b) {
+; CHECK-LABEL: umax:
+; CHECK: # %bb.0:
+; CHECK-NEXT: cv.maxu a0, a0, a1
+; CHECK-NEXT: ret
+ %1 = call i32 @llvm.umax.i32(i32 %a, i32 %b)
+ ret i32 %1
+}
+
+define i32 @exths(i16 %a) {
+; CHECK-LABEL: exths:
+; CHECK: # %bb.0:
+; CHECK-NEXT: # kill: def $x11 killed $x10
+; CHECK-NEXT: cv.exths a0, a0
+; CHECK-NEXT: ret
+ %1 = sext i16 %a to i32
+ ret i32 %1
+}
+
+define i32 @exthz(i16 %a) {
+; CHECK-LABEL: exthz:
+; CHECK: # %bb.0:
+; CHECK-NEXT: # kill: def $x11 killed $x10
+; CHECK-NEXT: cv.exthz a0, a0
+; CHECK-NEXT: ret
+ %1 = zext i16 %a to i32
+ ret i32 %1
+}
+
+define i32 @extbs(i8 %a) {
+; CHECK-LABEL: ...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from 0c00753 to 92364bd.
Force-pushed from 92364bd to 437d6e0.
def : Pat<(clip powerOf2Minus1:$upperBound, (XLenVT GPR:$rs1)),
          (CV_CLIP GPR:$rs1, (trailing1sPlus1 imm:$upperBound))>;
def : Pat<(between (not GPR:$rs2), GPR:$rs2, (XLenVT GPR:$rs1)),
What if (not GPR:$rs2) is not less than GPR:$rs2? Then they aren't a lower and an upper bound.
Does the hardware implement the checks in this order?
  if rs1 <= -2^(Is2-1),       rD = -2^(Is2-1)
  else if rs1 >= 2^(Is2-1)-1, rD = 2^(Is2-1)-1
  else                        rD = rs1
If so, then I think we can only match the pattern where the smax is done before the smin.
Here is an alive2 proof showing that the order of smin/smax matters if you don't know that rs2 is positive: https://alive2.llvm.org/ce/z/dpHmEL
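A minimal sketch of the kind of counterexample such a proof captures (illustrative only, assuming an arbitrary and possibly negative bound %b): for %x = 0 and %b = -5, the two orderings below return -5 and 4 respectively, so they are only interchangeable when %b is known to be non-negative.

declare i32 @llvm.smax.i32(i32, i32)
declare i32 @llvm.smin.i32(i32, i32)

; Clamp %x to [~%b, %b], applying smax (lower bound) first.
define i32 @max_then_min(i32 %x, i32 %b) {
  %nb = xor i32 %b, -1
  %t  = call i32 @llvm.smax.i32(i32 %x, i32 %nb)
  %r  = call i32 @llvm.smin.i32(i32 %t, i32 %b)
  ret i32 %r
}

; Same bounds, but applying smin first. For %x = 0, %b = -5 this returns 4
; while the version above returns -5, so the two forms are not equivalent
; unless %b is known to be non-negative.
define i32 @min_then_max(i32 %x, i32 %b) {
  %nb = xor i32 %b, -1
  %t  = call i32 @llvm.smin.i32(i32 %x, i32 %b)
  %r  = call i32 @llvm.smax.i32(i32 %t, i32 %nb)
  ret i32 %r
}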
@topperc Good point. The hardware actually treats GPR:$rs2 as unsigned, so it makes sense to generate this instruction only if rs2 is unsigned (e.g.: if it is an ABS node). The fix should require the node there to be unsigned.
> The hardware actually treats GPR:$rs2 as unsigned, so it makes sense to generate this instruction only if rs2 is unsigned (e.g.: if it is an ABS node).

  cv.abs a0, a1
  cv.clipr a0, a0, a1

Is this what we should generate to address this issue? If so, I have proposed a new pattern that generates this.
By "unsigned" do you mean bit 31 must be zero?
That is a fair point. I raised this issue with the hw group and we should soon have an update that clarifies this. I think that having bit 31 zero (i.e. rs2 being non-negative) is a requirement for correctness of the operation. In that case we'd need a check on the operand value. I agree with separating this patch into two for now: intrinsics support and codegen.
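As an illustration (a hypothetical IR shape, not taken from the patch or its tests), one case where the bound operand is non-negative by construction is when it comes from an abs with the int-min-is-poison flag set; for such input the smin/smax ordering concern above goes away, and a cv.abs + cv.clipr expansion would be sound.

declare i32 @llvm.abs.i32(i32, i1)
declare i32 @llvm.smax.i32(i32, i32)
declare i32 @llvm.smin.i32(i32, i32)

; Hypothetical shape: the upper bound is abs(%b) (int-min-is-poison set, so
; the bound can be treated as non-negative) and the lower bound is its
; bitwise-not, matching the [~rs2, rs2] range of cv.clipr.
define i32 @clip_reg_bound(i32 %x, i32 %b) {
  %ub = call i32 @llvm.abs.i32(i32 %b, i1 true)
  %lb = xor i32 %ub, -1
  %t  = call i32 @llvm.smax.i32(i32 %x, i32 %lb)
  %r  = call i32 @llvm.smin.i32(i32 %t, i32 %ub)
  ret i32 %r
}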
Force-pushed from 437d6e0 to 74eb196.
; CHECK: # %bb.0:
; CHECK-NEXT: cv.exthz a0, a0
; CHECK-NEXT: ret
  %1 = call i32 @llvm.riscv.cv.alu.exthz(i32 %a)
Why do we need an intrinsic for exthz? Isn't this just AND with 0xffff?
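For comparison, a sketch of the intrinsic-free forms (the zext variant mirrors the @exthz test earlier in the diff): with the (and GPR, 0xffff) pattern this patch adds, both functions below are expected to select cv.exthz without going through the intrinsic.

define i32 @exthz_via_and(i32 %a) {
  ; Matches the new (and GPR:$rs1, 0xffff) -> CV_EXTHZ pattern directly.
  %1 = and i32 %a, 65535
  ret i32 %1
}

define i32 @exthz_via_zext(i16 %a) {
  ; Mirrors the @exthz test in the diff, which checks for cv.exthz.
  %1 = zext i16 %a to i32
  ret i32 %1
}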
Force-pushed from 8fcaa52 to ffddcd8.
Can we please split up this patch so I can approve the simple cases?
I have prepared the stripped-down version in this pull request: #85603
… in CV32E40P

Implement XCValu intrinsics for CV32E40P according to the specification.

This commit is part of a patch-set to upstream the vendor specific extensions of CV32E40P that need LLVM intrinsics to implement Clang builtins.

Contributors: @CharKeaney, @ChunyuLiao, @jeremybennett, @lewis-revill, @NandniJamnadas, @PaoloS02, @serkm, @simonpcook, @xingmingjie.
Force-pushed from ba337eb to 2a43d6c.
Implement XCValu intrinsics and CodeGen for CV32E40P according to the specification.
This commit is part of a patch-set to upstream the vendor specific extensions of CV32E40P that need LLVM intrinsics to implement Clang builtins.
Contributors: @CharKeaney, @ChunyuLiao, @jeremybennett, @lewis-revill, @NandniJamnadas, @PaoloS02, @serkm, @simonpcook, @xingmingjie.
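A rough sketch of the kind of source this CodeGen support targets (illustrative only; the function name and constants below are not taken from the patch or its tests): a signed clamp to a symmetric power-of-two range, written with smax/smin, is the shape the new clip pattern is intended to turn into a single cv.clip.

declare i32 @llvm.smax.i32(i32, i32)
declare i32 @llvm.smin.i32(i32, i32)

; Clamp %a to [-16, 15]; with +xcvalu this is expected to select cv.clip.
define i32 @clamp_example(i32 %a) {
  %lo = call i32 @llvm.smax.i32(i32 %a, i32 -16)
  %hi = call i32 @llvm.smin.i32(i32 %lo, i32 15)
  ret i32 %hi
}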