-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[RISCV] Inline Assembly Support for GPR Pairs ('R') #112983
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Is the plan to make i128 a legal type for RV64? |
I think that whichever type I choose for pairs, I will probably have to end up making legal? I'm still working through that. I know right now my testcases crash because the pair type is not legal, but I think I have the opportunity to break up the pair type into two registers in I remember I had an absolute nightmare with constraints for fp16/bf16 in the ARM target in 2022, probably again because of type legality, and I'm keen to avoid that this time. |
I'm concerned that if we make i128 legal, we will have to write custom splitting code for every operation. Using v2i64 probably has other problems conflicting with the V extension fixed vector support. The Arch64 MVT::i64x8 may be a way to work around this. We could have a special pair MVT only used by inline assembly. |
I sort-of expected that making
Ok, this sounds doable - so something like |
Turns out that MVTs really want a size (unless they're overloadable, which is a complex and under-documented property), so I'll be going with two different MVTs rather than trying to share one. |
fbb680b
to
0550970
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
✅ With the latest revision this PR passed the C/C++ code formatter. |
I gave up on using I'm still doing Selection entirely in C++, as I think we'd need to go to C++ anyway to emit the right REG_SEQUENCE and EXTRACT_SUBREG. I've not implemented anything in Instead of looking at combines, I'm going to try to get the tests to work with GlobalISel. |
This comment was marked as off-topic.
This comment was marked as off-topic.
040a7c4
to
2e939b6
Compare
This comment was marked as outdated.
This comment was marked as outdated.
2e939b6
to
483abe4
Compare
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-risc-v Author: Sam Elliott (lenary) ChangesThis patch adds support for getting even-odd general purpose register There are a few different pieces to this patch, each of which need their Target-Independent Changes:
RISC-V Changes:
Patch is 231.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/112983.diff 17 Files Affected:
diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp
index eaaba7642bd7b2..07bf002ed73928 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -108,6 +108,14 @@ bool RISCVTargetInfo::validateAsmConstraint(
return true;
}
return false;
+ case 'P':
+ // An even-odd register pair - GPR
+ if (Name[1] == 'r') {
+ Info.setAllowsRegister();
+ Name += 1;
+ return true;
+ }
+ return false;
case 'v':
// A vector register.
if (Name[1] == 'r' || Name[1] == 'd' || Name[1] == 'm') {
@@ -122,8 +130,9 @@ bool RISCVTargetInfo::validateAsmConstraint(
std::string RISCVTargetInfo::convertConstraint(const char *&Constraint) const {
std::string R;
switch (*Constraint) {
- // c* and v* are two-letter constraints on RISC-V.
+ // c*, P*, and v* are all two-letter constraints on RISC-V.
case 'c':
+ case 'P':
case 'v':
R = std::string("^") + std::string(Constraint, 2);
Constraint += 1;
diff --git a/clang/test/CodeGen/RISCV/riscv-inline-asm.c b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
index 75b91d3c497c50..eb6e42f3eb9529 100644
--- a/clang/test/CodeGen/RISCV/riscv-inline-asm.c
+++ b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
@@ -33,6 +33,19 @@ void test_cf(float f, double d) {
asm volatile("" : "=cf"(cd) : "cf"(d));
}
+#if __riscv_xlen == 32
+typedef long long double_xlen_t;
+#elif __riscv_xlen == 64
+typedef __int128_t double_xlen_t;
+#endif
+double_xlen_t test_Pr_wide_scalar(double_xlen_t p) {
+// CHECK-LABEL: define{{.*}} {{i128|i64}} @test_Pr_wide_scalar(
+// CHECK: call {{i128|i64}} asm sideeffect "", "=^Pr,^Pr"({{i128|i64}} %{{.*}})
+ double_xlen_t ret;
+ asm volatile("" : "=Pr"(ret) : "Pr"(p));
+ return ret;
+}
+
void test_I(void) {
// CHECK-LABEL: define{{.*}} void @test_I()
// CHECK: call void asm sideeffect "", "I"(i32 2047)
diff --git a/llvm/include/llvm/CodeGen/ValueTypes.td b/llvm/include/llvm/CodeGen/ValueTypes.td
index 493c0cfcab60ce..9c910c0085fce9 100644
--- a/llvm/include/llvm/CodeGen/ValueTypes.td
+++ b/llvm/include/llvm/CodeGen/ValueTypes.td
@@ -317,20 +317,23 @@ def riscv_nxv16i8x3 : VTVecTup<384, 3, i8, 220>; // RISCV vector tuple(min_num_
def riscv_nxv16i8x4 : VTVecTup<512, 4, i8, 221>; // RISCV vector tuple(min_num_elts=16, nf=4)
def riscv_nxv32i8x2 : VTVecTup<512, 2, i8, 222>; // RISCV vector tuple(min_num_elts=32, nf=2)
-def x86mmx : ValueType<64, 223>; // X86 MMX value
-def Glue : ValueType<0, 224>; // Pre-RA sched glue
-def isVoid : ValueType<0, 225>; // Produces no value
-def untyped : ValueType<8, 226> { // Produces an untyped value
+def riscv_i32_pair : ValueType<64, 223>; // RISCV pair of RV32 GPRs
+def riscv_i64_pair : ValueType<128, 224>; // RISCV pair of RV64 GPRs
+
+def x86mmx : ValueType<64, 225>; // X86 MMX value
+def Glue : ValueType<0, 226>; // Pre-RA sched glue
+def isVoid : ValueType<0, 227>; // Produces no value
+def untyped : ValueType<8, 228> { // Produces an untyped value
let LLVMName = "Untyped";
}
-def funcref : ValueType<0, 227>; // WebAssembly's funcref type
-def externref : ValueType<0, 228>; // WebAssembly's externref type
-def exnref : ValueType<0, 229>; // WebAssembly's exnref type
-def x86amx : ValueType<8192, 230>; // X86 AMX value
-def i64x8 : ValueType<512, 231>; // 8 Consecutive GPRs (AArch64)
+def funcref : ValueType<0, 229>; // WebAssembly's funcref type
+def externref : ValueType<0, 230>; // WebAssembly's externref type
+def exnref : ValueType<0, 231>; // WebAssembly's exnref type
+def x86amx : ValueType<8192, 232>; // X86 AMX value
+def i64x8 : ValueType<512, 233>; // 8 Consecutive GPRs (AArch64)
def aarch64svcount
- : ValueType<16, 232>; // AArch64 predicate-as-counter
-def spirvbuiltin : ValueType<0, 233>; // SPIR-V's builtin type
+ : ValueType<16, 234>; // AArch64 predicate-as-counter
+def spirvbuiltin : ValueType<0, 235>; // SPIR-V's builtin type
let isNormalValueType = false in {
def token : ValueType<0, 504>; // TokenTy
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 758b3a5fc526e7..053d8ba098d9e5 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -5730,7 +5730,8 @@ TargetLowering::ParseConstraints(const DataLayout &DL,
assert(!Call.getType()->isVoidTy() && "Bad inline asm!");
if (auto *STy = dyn_cast<StructType>(Call.getType())) {
OpInfo.ConstraintVT =
- getSimpleValueType(DL, STy->getElementType(ResNo));
+ getAsmOperandValueType(DL, STy->getElementType(ResNo))
+ .getSimpleVT();
} else {
assert(ResNo == 0 && "Asm only has one result!");
OpInfo.ConstraintVT =
diff --git a/llvm/lib/CodeGen/ValueTypes.cpp b/llvm/lib/CodeGen/ValueTypes.cpp
index e3c746b274dde1..7ce7102fe98a5f 100644
--- a/llvm/lib/CodeGen/ValueTypes.cpp
+++ b/llvm/lib/CodeGen/ValueTypes.cpp
@@ -177,6 +177,10 @@ std::string EVT::getEVTString() const {
if (isFloatingPoint())
return "f" + utostr(getSizeInBits());
llvm_unreachable("Invalid EVT!");
+ case MVT::riscv_i32_pair:
+ return "riscv_i32_pair";
+ case MVT::riscv_i64_pair:
+ return "riscv_i64_pair";
case MVT::bf16: return "bf16";
case MVT::ppcf128: return "ppcf128";
case MVT::isVoid: return "isVoid";
@@ -214,6 +218,8 @@ Type *EVT::getTypeForEVT(LLVMContext &Context) const {
assert(isExtended() && "Type is not extended!");
return LLVMTy;
case MVT::isVoid: return Type::getVoidTy(Context);
+ case MVT::riscv_i32_pair: return IntegerType::get(Context, 64);
+ case MVT::riscv_i64_pair: return IntegerType::get(Context, 128);
case MVT::x86mmx: return llvm::FixedVectorType::get(llvm::IntegerType::get(Context, 64), 1);
case MVT::aarch64svcount:
return TargetExtType::get(Context, "aarch64.svcount");
diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 4d46afb8c4ef97..1b23b36a59e0ec 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -481,6 +481,12 @@ struct RISCVOperand final : public MCParsedAsmOperand {
RISCVMCRegisterClasses[RISCV::GPRRegClassID].contains(Reg.RegNum);
}
+ bool isGPRPair() const {
+ return Kind == KindTy::Register &&
+ RISCVMCRegisterClasses[RISCV::GPRPairRegClassID].contains(
+ Reg.RegNum);
+ }
+
bool isGPRF16() const {
return Kind == KindTy::Register &&
RISCVMCRegisterClasses[RISCV::GPRF16RegClassID].contains(Reg.RegNum);
@@ -491,17 +497,17 @@ struct RISCVOperand final : public MCParsedAsmOperand {
RISCVMCRegisterClasses[RISCV::GPRF32RegClassID].contains(Reg.RegNum);
}
- bool isGPRAsFPR() const { return isGPR() && Reg.IsGPRAsFPR; }
- bool isGPRAsFPR16() const { return isGPRF16() && Reg.IsGPRAsFPR; }
- bool isGPRAsFPR32() const { return isGPRF32() && Reg.IsGPRAsFPR; }
- bool isGPRPairAsFPR() const { return isGPRPair() && Reg.IsGPRAsFPR; }
-
- bool isGPRPair() const {
+ bool isGPRF64Pair() const {
return Kind == KindTy::Register &&
- RISCVMCRegisterClasses[RISCV::GPRPairRegClassID].contains(
+ RISCVMCRegisterClasses[RISCV::GPRF64PairRegClassID].contains(
Reg.RegNum);
}
+ bool isGPRAsFPR() const { return isGPR() && Reg.IsGPRAsFPR; }
+ bool isGPRAsFPR16() const { return isGPRF16() && Reg.IsGPRAsFPR; }
+ bool isGPRAsFPR32() const { return isGPRF32() && Reg.IsGPRAsFPR; }
+ bool isGPRPairAsFPR64() const { return isGPRF64Pair() && Reg.IsGPRAsFPR; }
+
static bool evaluateConstantImm(const MCExpr *Expr, int64_t &Imm,
RISCVMCExpr::VariantKind &VK) {
if (auto *RE = dyn_cast<RISCVMCExpr>(Expr)) {
@@ -2399,7 +2405,7 @@ ParseStatus RISCVAsmParser::parseGPRPairAsFPR64(OperandVector &Operands) {
const MCRegisterInfo *RI = getContext().getRegisterInfo();
MCRegister Pair = RI->getMatchingSuperReg(
Reg, RISCV::sub_gpr_even,
- &RISCVMCRegisterClasses[RISCV::GPRPairRegClassID]);
+ &RISCVMCRegisterClasses[RISCV::GPRF64PairRegClassID]);
Operands.push_back(RISCVOperand::createReg(Pair, S, E, /*isGPRAsFPR=*/true));
return ParseStatus::Success;
}
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index dc3f8254cb4e00..1abb693eb47665 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -953,6 +953,35 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
ReplaceNode(Node, Res);
return;
}
+ case RISCVISD::BuildGPRPair: {
+ SDValue Ops[] = {
+ CurDAG->getTargetConstant(RISCV::GPRPairRegClassID, DL, MVT::i32),
+ Node->getOperand(0),
+ CurDAG->getTargetConstant(RISCV::sub_gpr_even, DL, MVT::i32),
+ Node->getOperand(1),
+ CurDAG->getTargetConstant(RISCV::sub_gpr_odd, DL, MVT::i32)};
+
+ SDNode *N = CurDAG->getMachineNode(TargetOpcode::REG_SEQUENCE, DL,
+ Subtarget->getXLenPairVT(), Ops);
+ ReplaceNode(Node, N);
+ return;
+ }
+ case RISCVISD::SplitGPRPair: {
+ if (!SDValue(Node, 0).use_empty()) {
+ SDValue Lo = CurDAG->getTargetExtractSubreg(RISCV::sub_gpr_even, DL, VT,
+ Node->getOperand(0));
+ ReplaceUses(SDValue(Node, 0), Lo);
+ }
+
+ if (!SDValue(Node, 1).use_empty()) {
+ SDValue Hi = CurDAG->getTargetExtractSubreg(RISCV::sub_gpr_odd, DL, VT,
+ Node->getOperand(0));
+ ReplaceUses(SDValue(Node, 1), Hi);
+ }
+
+ CurDAG->RemoveDeadNode(Node);
+ return;
+ }
case RISCVISD::BuildPairF64: {
if (!Subtarget->hasStdExtZdinx())
break;
@@ -960,7 +989,7 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
assert(!Subtarget->is64Bit() && "Unexpected subtarget");
SDValue Ops[] = {
- CurDAG->getTargetConstant(RISCV::GPRPairRegClassID, DL, MVT::i32),
+ CurDAG->getTargetConstant(RISCV::GPRF64PairRegClassID, DL, MVT::i32),
Node->getOperand(0),
CurDAG->getTargetConstant(RISCV::sub_gpr_even, DL, MVT::i32),
Node->getOperand(1),
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 69112d868bff82..a439cccb38f345 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -114,9 +114,11 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
}
MVT XLenVT = Subtarget.getXLenVT();
+ MVT XLenPairVT = Subtarget.getXLenPairVT();
// Set up the register classes.
addRegisterClass(XLenVT, &RISCV::GPRRegClass);
+ addRegisterClass(XLenPairVT, &RISCV::GPRPairRegClass);
if (Subtarget.hasStdExtZfhmin())
addRegisterClass(MVT::f16, &RISCV::FPR16RegClass);
@@ -134,7 +136,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
if (Subtarget.is64Bit())
addRegisterClass(MVT::f64, &RISCV::GPRRegClass);
else
- addRegisterClass(MVT::f64, &RISCV::GPRPairRegClass);
+ addRegisterClass(MVT::f64, &RISCV::GPRF64PairRegClass);
}
static const MVT::SimpleValueType BoolVecVTs[] = {
@@ -296,6 +298,11 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
setCondCodeAction(ISD::SETLE, XLenVT, Expand);
}
+ if (Subtarget.is64Bit())
+ setOperationAction(ISD::BITCAST, MVT::i128, Custom);
+ else
+ setOperationAction(ISD::BITCAST, MVT::i64, Custom);
+
setOperationAction({ISD::STACKSAVE, ISD::STACKRESTORE}, MVT::Other, Expand);
setOperationAction(ISD::VASTART, MVT::Other, Custom);
@@ -2224,6 +2231,17 @@ bool RISCVTargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
return Index == 0 || Index == ResElts;
}
+EVT RISCVTargetLowering::getAsmOperandValueType(const DataLayout &DL, Type *Ty,
+ bool AllowUnknown) const {
+ if (!Subtarget.is64Bit() && Ty->isIntegerTy(64))
+ return MVT::riscv_i32_pair;
+
+ if (Subtarget.is64Bit() && Ty->isIntegerTy(128))
+ return MVT::riscv_i64_pair;
+
+ return TargetLowering::getAsmOperandValueType(DL, Ty, AllowUnknown);
+}
+
MVT RISCVTargetLowering::getRegisterTypeForCallingConv(LLVMContext &Context,
CallingConv::ID CC,
EVT VT) const {
@@ -2238,6 +2256,17 @@ MVT RISCVTargetLowering::getRegisterTypeForCallingConv(LLVMContext &Context,
return PartVT;
}
+unsigned
+RISCVTargetLowering::getNumRegisters(LLVMContext &Context, EVT VT,
+ std::optional<MVT> RegisterVT) const {
+ // Pair inline assembly operand
+ if (VT == (Subtarget.is64Bit() ? MVT::i128 : MVT::i64) && RegisterVT &&
+ *RegisterVT == Subtarget.getXLenPairVT())
+ return 1;
+
+ return TargetLowering::getNumRegisters(Context, VT, RegisterVT);
+}
+
unsigned RISCVTargetLowering::getNumRegistersForCallingConv(LLVMContext &Context,
CallingConv::ID CC,
EVT VT) const {
@@ -2776,6 +2805,19 @@ RISCVTargetLowering::computeVLMAXBounds(MVT VecVT,
return std::make_pair(MinVLMAX, MaxVLMAX);
}
+bool RISCVTargetLowering::isLoadBitCastBeneficial(
+ EVT LoadVT, EVT BitcastVT, const SelectionDAG &DAG,
+ const MachineMemOperand &MMO) const {
+ // We want to leave `bitcasts` to/from MVT::riscv_i<xlen>_pair separate from
+ // loads/stores so they can be turned into BuildGPRPair/::SplitGPRPair nodes.
+ if (LoadVT == (Subtarget.is64Bit() ? MVT::i128 : MVT::i64) &&
+ BitcastVT == Subtarget.getXLenPairVT())
+ return false;
+
+ return TargetLoweringBase::isLoadBitCastBeneficial(LoadVT, BitcastVT, DAG,
+ MMO);
+}
+
// The state of RVV BUILD_VECTOR and VECTOR_SHUFFLE lowering is that very few
// of either is (currently) supported. This can get us into an infinite loop
// where we try to lower a BUILD_VECTOR as a VECTOR_SHUFFLE as a BUILD_VECTOR
@@ -6413,6 +6455,13 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
std::tie(Lo, Hi) = DAG.SplitScalar(Op0, DL, MVT::i32, MVT::i32);
return DAG.getNode(RISCVISD::BuildPairF64, DL, MVT::f64, Lo, Hi);
}
+ if (VT == Subtarget.getXLenPairVT() && Op0VT.isScalarInteger() &&
+ Op0VT.getSizeInBits() == 2 * Subtarget.getXLen()) {
+ SDValue Lo, Hi;
+ std::tie(Lo, Hi) = DAG.SplitScalar(Op0, DL, XLenVT, XLenVT);
+ return DAG.getNode(RISCVISD::BuildGPRPair, DL, Subtarget.getXLenPairVT(),
+ Lo, Hi);
+ }
// Consider other scalar<->scalar casts as legal if the types are legal.
// Otherwise expand them.
@@ -12886,6 +12935,14 @@ void RISCVTargetLowering::ReplaceNodeResults(SDNode *N,
SDValue RetReg = DAG.getNode(ISD::BUILD_PAIR, DL, MVT::i64,
NewReg.getValue(0), NewReg.getValue(1));
Results.push_back(RetReg);
+ } else if (VT.isInteger() &&
+ VT.getSizeInBits() == 2 * Subtarget.getXLen() &&
+ Op0VT == Subtarget.getXLenPairVT()) {
+ SDValue NewReg = DAG.getNode(RISCVISD::SplitGPRPair, DL,
+ DAG.getVTList(XLenVT, XLenVT), Op0);
+ SDValue RetReg = DAG.getNode(ISD::BUILD_PAIR, DL, VT, NewReg.getValue(0),
+ NewReg.getValue(1));
+ Results.push_back(RetReg);
} else if (!VT.isVector() && Op0VT.isFixedLengthVector() &&
isTypeLegal(Op0VT)) {
// Custom-legalize bitcasts from fixed-length vector types to illegal
@@ -20130,6 +20187,8 @@ const char *RISCVTargetLowering::getTargetNodeName(unsigned Opcode) const {
NODE_NAME_CASE(TAIL)
NODE_NAME_CASE(SELECT_CC)
NODE_NAME_CASE(BR_CC)
+ NODE_NAME_CASE(BuildGPRPair)
+ NODE_NAME_CASE(SplitGPRPair)
NODE_NAME_CASE(BuildPairF64)
NODE_NAME_CASE(SplitF64)
NODE_NAME_CASE(ADD_LO)
@@ -20408,6 +20467,8 @@ RISCVTargetLowering::getConstraintType(StringRef Constraint) const {
return C_RegisterClass;
if (Constraint == "cr" || Constraint == "cf")
return C_RegisterClass;
+ if (Constraint == "Pr")
+ return C_RegisterClass;
}
return TargetLowering::getConstraintType(Constraint);
}
@@ -20429,7 +20490,7 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
if (VT == MVT::f32 && Subtarget.hasStdExtZfinx())
return std::make_pair(0U, &RISCV::GPRF32NoX0RegClass);
if (VT == MVT::f64 && Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
- return std::make_pair(0U, &RISCV::GPRPairNoX0RegClass);
+ return std::make_pair(0U, &RISCV::GPRF64PairNoX0RegClass);
return std::make_pair(0U, &RISCV::GPRNoX0RegClass);
case 'f':
if (VT == MVT::f16) {
@@ -20446,7 +20507,7 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
if (Subtarget.hasStdExtD())
return std::make_pair(0U, &RISCV::FPR64RegClass);
if (Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
- return std::make_pair(0U, &RISCV::GPRPairNoX0RegClass);
+ return std::make_pair(0U, &RISCV::GPRF64PairNoX0RegClass);
if (Subtarget.hasStdExtZdinx() && Subtarget.is64Bit())
return std::make_pair(0U, &RISCV::GPRNoX0RegClass);
}
@@ -20488,7 +20549,7 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
if (VT == MVT::f32 && Subtarget.hasStdExtZfinx())
return std::make_pair(0U, &RISCV::GPRF32CRegClass);
if (VT == MVT::f64 && Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
- return std::make_pair(0U, &RISCV::GPRPairCRegClass);
+ return std::make_pair(0U, &RISCV::GPRF64PairCRegClass);
if (!VT.isVector())
return std::make_pair(0U, &RISCV::GPRCRegClass);
} else if (Constraint == "cf") {
@@ -20506,10 +20567,12 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
if (Subtarget.hasStdExtD())
return std::make_pair(0U, &RISCV::FPR64CRegClass);
if (Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
- return std::make_pair(0U, &RISCV::GPRPairCRegClass);
+ return std::make_pair(0U, &RISCV::GPRF64PairCRegClass);
if (Subtarget.hasStdExtZdinx() && Subtarget.is64Bit())
return std::make_pair(0U, &RISCV::GPRCRegClass);
}
+ } else if (Constraint == "Pr") {
+ return std::make_pair(0U, &RISCV::GPRPairNoX0RegClass);
}
// Clang will correctly decode the usage of register name aliases into their
@@ -20670,7 +20733,7 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
// Subtarget into account.
if (Res.second == &RISCV::GPRF16RegClass ||
Res.second == &RISCV::GPRF32RegClass ||
- Res.second == &RISCV::GPRPairRegClass)
+ Res.second == &RISCV::GPRF64PairRegClass)
return std::make_pair(Res.first, &RISCV::GPRRegClass);
return Res;
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index 0b07ad7d7a423f..deaefafc73535e 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -44,6 +44,18 @@ enum NodeType : unsigned {
SELECT_CC,
BR_CC,
+ /// Turn a pair of `i<xlen>`s into a `riscv_i<xlen>_pair`.
+ /// - Output: `riscv_i<xlen>_pair`
+ /// - Input 0: `i<xlen>` low-order bits, for even register.
+ /// - Input 1: `i<xlen>` high-order bits, for odd register.
+ BuildGPRPair,
+
+ /// Turn a `riscv_i<xlen>_pair` into a pair of `i<xlen>`s.
+ /// - Output 0: `i<xlen>` low-order bits, from even register.
+ /// - Output 1: `i<xlen>` high-order bits, from odd register.
+ /// - Input: `riscv_i<xlen>_pair`
+ SplitGPRPair,
+
/// Turns a pair of `i32`s into an `f64`. Needed for rv32d/ilp32.
/// - Output: `f64`.
/// - Input 0: low-order bits (31-0) (as `i32`), for even register.
@@ -544,11 +556,19 @@ class RISCVTargetLowering : public TargetLowering {
...
[truncated]
|
Today's changes address all the CI failures, by adding better checking before turning BITCAST into |
ping? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I just had a detailed look. Given that you have explained almost all the code detailedly, I think this PR looks great to me!
Just some overall comments:
- I personally like your proposal of adding new constraints, but we still need the agreement between community members.
- I saw all the comments above and I know reason why we choose to add new MVT types. My question is, maybe we can make it less target-specific? I don't think this is a RISC-V only problem.
- We should be able to use
Pr
for Zacas now? So maybe we should add some tests for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I just had a detailed look. Given that you have explained almost all the code detailedly, I think this PR looks great to me! Just some overall comments:
- I personally like your proposal of adding new constraints, but we still need the agreement between community members.
I think some of the agreement needs an implementation, maybe? I'll ping the c-api issue.
- I saw all the comments above and I know reason why we choose to add new MVT types. My question is, maybe we can make it less target-specific? I don't think this is a RISC-V only problem.
I noticed, after-the-fact, that SystemZ uses MVT::Untyped
(corrected) for pairs, I can prepare a fixup commit that does this (and removes the riscv_*_pair
MVTs), if we think that's a better target-independent approach?
- We should be able to use
Pr
for Zacas now? So maybe we should add some tests for it.
- The test functions which have an in-out
Pr
parameter (test_Pr_wide_scalar_inout
) are derived from the issue aboutamocas.q
in inline assembly on RV64. The C from that issue found some bugs in the implementation, which is why those tests exist. These don't reference any specific instructions just to make them have less complex dependencies - which is why the text of the inline assembly is just a comment to verify we're producing sensible output. - The Zacas pair instructions continue to use
GPRPair
in their definitions, rather than the newGPRF64Pair
register classes, which seemed the right way forwards. There are no patterns for the pair instructions at the moment, but I'm not sure I want to change the codegen of atomics, that's a whole other nightmare. Do you have specific tests in mind?
@@ -2238,6 +2256,17 @@ MVT RISCVTargetLowering::getRegisterTypeForCallingConv(LLVMContext &Context, | |||
return PartVT; | |||
} | |||
|
|||
unsigned | |||
RISCVTargetLowering::getNumRegisters(LLVMContext &Context, EVT VT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did we need this change but AArch64 didn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to prevent an assert that is hit in RegsForValue::AddInlineAsmOperands
without this code.
I think AArch64 is not hitting this because they didn't test very much - the amocas.q
example has a lot of hard cases (multiple returns, matching input/output operands) which hit a lot of the hard cases in SelectionDAGBuilder. Note I had to fix one place where AArch64 was definitely wrong - the changes to call getAsmOperandValueType
when there are multiple outputs from asm - AArch64 should have hit this case and didn't.
I've spent a few more hours this evening trying to trace down this assert, and I feel closer, but not quite there. The testcase to use to get where I've got to (with the assert) is test_Pr_wide_scalar_inout
.
SelectionDAGBuilder.cpp's getRegistersForValue
seems to be doing the right thing, always calling this with the VT
and RegisterVT
with the same values, as far as I can tell. It seems to create the OpInfo.AssignedRegs
in such a way that the later call to getNumRegisters
in RegsForValue::AddInlineAsmOperands
will get the same value as it did when it was created.
Where this seems to go wrong is the matching inputs code in SelectionDAGBuilder.cpp -
llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
Lines 10119 to 10139 in 8b55162
MachineFunction &MF = DAG.getMachineFunction(); | |
MachineRegisterInfo &MRI = MF.getRegInfo(); | |
const TargetRegisterInfo &TRI = *MF.getSubtarget().getRegisterInfo(); | |
auto *R = cast<RegisterSDNode>(AsmNodeOperands[CurOp+1]); | |
Register TiedReg = R->getReg(); | |
MVT RegVT = R->getSimpleValueType(0); | |
const TargetRegisterClass *RC = | |
TiedReg.isVirtual() ? MRI.getRegClass(TiedReg) | |
: RegVT != MVT::Untyped ? TLI.getRegClassFor(RegVT) | |
: TRI.getMinimalPhysRegClass(TiedReg); | |
for (unsigned i = 0, e = Flag.getNumOperandRegisters(); i != e; ++i) | |
Regs.push_back(MRI.createVirtualRegister(RC)); | |
RegsForValue MatchedRegs(Regs, RegVT, InOperandVal.getValueType()); | |
SDLoc dl = getCurSDLoc(); | |
// Use the produced MatchedRegs object to | |
MatchedRegs.getCopyToRegs(InOperandVal, DAG, dl, Chain, &Glue, &Call); | |
MatchedRegs.AddInlineAsmOperands(InlineAsm::Kind::RegUse, true, | |
OpInfo.getMatchedOperand(), dl, DAG, | |
AsmNodeOperands); |
This seems to have its own logic about creating the RegsForValue MatchedRegs(...)
based on the surrounding DAG Context, and seems to avoid the info available in OpInfo
which should have some relevancy, instead this code is directly pulling out similar (but not quite identical) information from the DAG. When getNumRegisters
is eventually called in MatchedRegs::AddInlineAsmOperands
, it uses MVT::i128
rather than MVT::riscv_i64_pair
, which means it thinks it needs two registers, not just one. I think this MVT::i128
is coming from InOperandVal.getValueType()
when MatchedRegs is created.
When this code is compared to getRegistersForValue
, the two really don't seem to line up, logically - I would expect them both to be doing similar things, but they're not - how the RegsForValue
object is created in getRegistersForValue
is reasonably different to this logic.
git-blame-ing this code, it seems to have been introduced for supporting i128 on SystemZ - presumably also for paired operands - and they do have a getNumRegisters
override, which returns 1
- https://reviews.llvm.org/D100788 (refactors have hit a good number of lines around the SelectionDAGBuilder.cpp visitInlineAsm code I believe to be at fault, but nothing that's not NFC since that change).
I don't feel good about the matching inputs code in SelectionDAGBuilder - I don't understand it, and I don't think my addition to getNumRegisters
should be necessary if SelectionDAGBuilder.cpp worked more closely to how getRegistersForValue
works. The extra confusing thing here is that getRegistersForValue
might have mutated OpInfo
when there's a matched input - but returned std::nullopt
, so maybe the logic for matched inputs needs to just match the tail end of getRegistersForValue
. I'm not entirely sure.
You're a lot more familiar with SelectionDAG than I am, do you have advice for what I should be doing here? It does seem like more target-independent code needs to be fixed to get rid of this. Maybe that can come later?
Gentle Ping. I'm looking for answers to two questions:
|
I guess so. I didn't know about the SystemZ change before. I do find it a little weird having 2 register classes with the same spill size and registers. I believe the tablegen generated code considers them subclasses of each other. X86 has some register classes like this too so I guess its not a big deal.
I think we can live with the override since SystemZ is doing it too. |
I'm working on this again and tripping over the commit that changes the branch relaxation tests, so I'm planning to just push that to |
I'll try it, and push a separate commit, so we can see what else is needed.
I'm not very happy about the duplication either, but it's because of a limitation where fairly early on, selectiondag uses the first legal type in a class, rather than the one you might want.
|
841e38a
to
126e4ee
Compare
The new approach using MVT::Untyped is uploaded. Inspect commit 126e4ee to see what changed to go from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cheers! It seems using untyped
saves some lines as well!
LGTM with nits.
ReplaceNode(Node, N); | ||
return; | ||
} | ||
case RISCVISD::SplitGPRPair: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we share this code with SplitF64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Happy to in a follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up doing this as I have to do a big rebase to sort out the comments/reorg of RISCVRegisterinfo.td
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM we can do the code sharing as a follow up if we want
I will rewrite the description of the PR momentarily, once I've rebased over the cleanup/comments to RISCVRegisterinfo.td |
This patch adds support for getting even-odd general purpose register pairs into and out of inline assembly using the `Pr` constraint as proposed in riscv-non-isa/riscv-c-api-doc#92 There are a few different pieces to this patch, each of which need their own explanation. - Renames the Register Class used for f64 values on rv32i_zdinx from `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly unmodified, as their primary value type is used for type inference over selection patterns. This rename affects quite a lot of files. - Adds new `GPRPair*` register classes which will be used for `Pr` constraints and for instructions that need an even-odd GPR pair. This new type is used for `amocas.d.*`(rv32) and `amocas.q.*`(rv64) in Zacas, instead of the `GPRF64Pair` class being used before. - Marks the new `GPRPair` class legal as for holding a `MVT::Untyped`. Two new RISCVISD node types are added for creating and destructing a pair - `BuildGPRPair` and `SplitGPRPair`, and are introduced when bitcasting to/from the pair type and `untyped`. - Adds functionality to `splitValueIntoRegisterParts` and `joinRegisterPartsIntoValue` to handle changing `i<2*xlen>` MVTs into `untyped` pairs. - Adds an override for `getNumRegisters` to ensure that `i<2*xlen>` values, when going to/from inline assembly, only allocate one (pair) register (they would otherwise allocate two). This is due to a bug in SelectionDAGBuilder.cpp which other backends also work around. - Ensures that Clang understands that `Pr` is a valid inline assembly constraint. - Adds Conditions to the GPRF64Pair-related changes to `LowerOperation` and `ReplaceNodeResults` which match when BITCAST for the relevant types should be handled in a custom manner. - This also allows `Pr` to be used for `f64` types on `rv32_zdinx` architectures, where doubles are stored in a GPR pair.
8d538f8
to
0b98a56
Compare
The commit I just pushed up should be the final version, and not fundamentally different from what was reviewed (but with various NFC patches split out). I think I do have a route towards merging GPRPair and GPRF64Pair, but I'll do that in a follow-up, as there's a mystery crash somewhere that I haven't tracked down yet. |
I'm not sure this requires re-review, so I plan to land this tomorrow (UK time) if I don't hear anything else. |
[RISCV] GPR Pairs for Inline Asm using `R` This patch adds support for getting even-odd general purpose register pairs into and out of inline assembly using the `R` constraint as proposed in riscv-non-isa/riscv-c-api-doc#92 There are a few different pieces to this patch, each of which need their own explanation. - Renames the Register Class used for f64 values on rv32i_zdinx from `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly unmodified, as their primary value type is used for type inference over selection patterns. This rename affects quite a lot of files. - Adds new `GPRPair*` register classes which will be used for `R` constraints and for instructions that need an even-odd GPR pair. This new type is used for `amocas.d.*`(rv32) and `amocas.q.*`(rv64) in Zacas, instead of the `GPRF64Pair` class being used before. - Marks the new `GPRPair` class legal as for holding a `MVT::Untyped`. Two new RISCVISD node types are added for creating and destructing a pair - `BuildGPRPair` and `SplitGPRPair`, and are introduced when bitcasting to/from the pair type and `untyped`. - Adds functionality to `splitValueIntoRegisterParts` and `joinRegisterPartsIntoValue` to handle changing `i<2*xlen>` MVTs into `untyped` pairs. - Adds an override for `getNumRegisters` to ensure that `i<2*xlen>` values, when going to/from inline assembly, only allocate one (pair) register (they would otherwise allocate two). This is due to a bug in SelectionDAGBuilder.cpp which other backends also work around. - Ensures that Clang understands that `R` is a valid inline assembly constraint. - Adds Conditions to the GPRF64Pair-related changes to `LowerOperation` and `ReplaceNodeResults` which match when BITCAST for the relevant types should be handled in a custom manner. - This also allows `R` to be used for `f64` types on `rv32_zdinx` architectures, where doubles are stored in a GPR pair.
I won't be merging this today. Instead, two changes/updates:
|
Kito and I have agreed |
This was a typo in llvm#112983 that didn't cause build failures but is still wrong.
This was a typo in #112983 that didn't cause build failures but is still wrong.
As suggested by Craig, this tries to merge the two sets of register classes created in #112983, GPRPair* and GPRF64Pair*. - I added some explicit annotations to `RISCVInstrInfoD.td` which fixed the type inference issues I was seeing from tablegen for select patterns. - I've had to make the behaviour of `splitValueIntoRegisterParts` and `joinRegisterPartsIntoValue` cover more cases, because you cannot bitcast to/from untyped (the bitcast would otherwise have been inserted automatically by TargetLowering code). - I apparently didn't need to change `getNumRegisters` again, which continues to tell me there's a bug in the code for tied inputs. I added some more test coverage of this case but it didn't seem to help find the asserts I was finding before - I think the difference is between the default behaviour for integers which doesn't apply to floats. - There's still a difference between BuildGPRPair and BuildPairF64 (and the same for SplitGPRPair and SplitF64). I'm not happy with this, I think it's quite confusing, as they're very similar, just differing in whether they give a `untyped` or a `f64`. I haven't really worked out how the DAGCombiner copes if one meets the other, I know we have some of this for the f64 variants already, but they're a lot more complex than the GPRPair variants anyway.
This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the
R
constraint asproposed in riscv-non-isa/riscv-c-api-doc#92
There are a few different pieces to this patch, each of which need their
own explanation.
Renames the Register Class used for f64 values on rv32i_zdinx from
GPRPair*
toGPRF64Pair*
. These register classes are kept broadlyunmodified, as their primary value type is used for type inference
over selection patterns. This rename affects quite a lot of files.
Adds new
GPRPair*
register classes which will be used forr
constraints and for instructions that need an even-odd GPR pair. This
new type is used for
amocas.d.*
(rv32) andamocas.q.*
(rv64) inZacas, instead of the
GPRF64Pair
class being used before.Marks the new
GPRPair
class legal as for holding aMVT::Untyped
.Two new RISCVISD node types are added for creating and destructing a
pair -
BuildGPRPair
andSplitGPRPair
, and are introduced whenbitcasting to/from the pair type and
untyped
.Adds functionality to
splitValueIntoRegisterParts
andjoinRegisterPartsIntoValue
to handle changingi<2*xlen>
MVTs intountyped
pairs.Adds an override for
getNumRegisters
to ensure thati<2*xlen>
values, when going to/from inline assembly, only allocate one (pair)
register (they would otherwise allocate two). This is due to a bug in
SelectionDAGBuilder.cpp which other backends also work around.
Ensures that Clang understands that
R
is a valid inline assemblyconstraint.
This also allows
R
to be used forf64
types onrv32_zdinx
architectures, where doubles are stored in a GPR pair.