Skip to content

[RISCV] Inline Assembly Support for GPR Pairs ('R') #112983

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 18, 2024

Conversation

lenary
Copy link
Member

@lenary lenary commented Oct 18, 2024

This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the R constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which need their
own explanation.

  • Renames the Register Class used for f64 values on rv32i_zdinx from
    GPRPair* to GPRF64Pair*. These register classes are kept broadly
    unmodified, as their primary value type is used for type inference
    over selection patterns. This rename affects quite a lot of files.

  • Adds new GPRPair* register classes which will be used for r
    constraints and for instructions that need an even-odd GPR pair. This
    new type is used for amocas.d.*(rv32) and amocas.q.*(rv64) in
    Zacas, instead of the GPRF64Pair class being used before.

  • Marks the new GPRPair class legal as for holding a MVT::Untyped.
    Two new RISCVISD node types are added for creating and destructing a
    pair - BuildGPRPair and SplitGPRPair, and are introduced when
    bitcasting to/from the pair type and untyped.

  • Adds functionality to splitValueIntoRegisterParts and
    joinRegisterPartsIntoValue to handle changing i<2*xlen> MVTs into
    untyped pairs.

  • Adds an override for getNumRegisters to ensure that i<2*xlen>
    values, when going to/from inline assembly, only allocate one (pair)
    register (they would otherwise allocate two). This is due to a bug in
    SelectionDAGBuilder.cpp which other backends also work around.

  • Ensures that Clang understands that R is a valid inline assembly
    constraint.

  • This also allows R to be used for f64 types on rv32_zdinx
    architectures, where doubles are stored in a GPR pair.

@topperc
Copy link
Collaborator

topperc commented Oct 18, 2024

Is the plan to make i128 a legal type for RV64?

@lenary
Copy link
Member Author

lenary commented Oct 18, 2024

Is the plan to make i128 a legal type for RV64?

I think that whichever type I choose for pairs, I will probably have to end up making legal? I'm still working through that. I know right now my testcases crash because the pair type is not legal, but I think I have the opportunity to break up the pair type into two registers in RISCVTargetLowering::splitValueIntoRegisterParts (or there are other hooks if i choose to represent it with v2ixlen). Having said that, I might just be pushing the problem deeper into ISel if I try that.

I remember I had an absolute nightmare with constraints for fp16/bf16 in the ARM target in 2022, probably again because of type legality, and I'm keen to avoid that this time.

@topperc
Copy link
Collaborator

topperc commented Oct 18, 2024

Is the plan to make i128 a legal type for RV64?

I think that whichever type I choose for pairs, I will probably have to end up making legal? I'm still working through that. I know right now my testcases crash because the pair type is not legal, but I think I have the opportunity to break up the pair type into two registers in RISCVTargetLowering::splitValueIntoRegisterParts (or there are other hooks if i choose to represent it with v2ixlen). Having said that, I might just be pushing the problem deeper into ISel if I try that.

I remember I had an absolute nightmare with constraints for fp16/bf16 in the ARM target in 2022, probably again because of type legality, and I'm keen to avoid that this time.

I'm concerned that if we make i128 legal, we will have to write custom splitting code for every operation. Using v2i64 probably has other problems conflicting with the V extension fixed vector support.

The Arch64 MVT::i64x8 may be a way to work around this. We could have a special pair MVT only used by inline assembly.

@lenary
Copy link
Member Author

lenary commented Oct 18, 2024

I'm concerned that if we make i128 legal, we will have to write custom splitting code for every operation. Using v2i64 probably has other problems conflicting with the V extension fixed vector support.

I sort-of expected that making i128 legal would be problematic, I didn't realise that v2i64 would also conflict with something we're currently using.

The Arch64 MVT::i64x8 may be a way to work around this. We could have a special pair MVT only used by inline assembly.

Ok, this sounds doable - so something like MVT::xlen_pair is the way I'll lean, rather than a different type for rv32/rv64.

@lenary
Copy link
Member Author

lenary commented Oct 21, 2024

The Arch64 MVT::i64x8 may be a way to work around this. We could have a special pair MVT only used by inline assembly.

Ok, this sounds doable - so something like MVT::xlen_pair is the way I'll lean, rather than a different type for rv32/rv64.

Turns out that MVTs really want a size (unless they're overloadable, which is a complex and under-documented property), so I'll be going with two different MVTs rather than trying to share one.

@lenary lenary force-pushed the pr/riscv-inline-asm-pairs branch from fbb680b to 0550970 Compare October 21, 2024 21:22
@lenary

This comment was marked as outdated.

@lenary

This comment was marked as outdated.

Copy link

github-actions bot commented Oct 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@lenary
Copy link
Member Author

lenary commented Oct 22, 2024

I gave up on using splitValueIntoRegisterParts and joinRegisterPartsIntoValue, and instead added a custom legalisation of BITCAST with the 2*xlen type, which matches what we do for BuildPairF64 and SplitF64. This ended up being much more "symmetrical", which was nice.

I'm still doing Selection entirely in C++, as I think we'd need to go to C++ anyway to emit the right REG_SEQUENCE and EXTRACT_SUBREG.

I've not implemented anything in RISCVTargetLowering::PerformDAGCombine for BuildPairGPR and SplitPairGPR - though I see there's something in there for SplitF64, and not all of it is FP-related. I don't know how much of a difference these combines will make when the generated instructions are not as expensive as the stack slot that BuildPairF64/SplitF64 sometimes needs.

Instead of looking at combines, I'm going to try to get the tests to work with GlobalISel.

@lenary

This comment was marked as off-topic.

@lenary lenary force-pushed the pr/riscv-inline-asm-pairs branch from 040a7c4 to 2e939b6 Compare October 22, 2024 19:48
@lenary

This comment was marked as outdated.

@lenary lenary force-pushed the pr/riscv-inline-asm-pairs branch from 2e939b6 to 483abe4 Compare October 23, 2024 18:05
@lenary lenary marked this pull request as ready for review October 23, 2024 18:07
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:RISC-V clang:frontend Language frontend issues, e.g. anything involving "Sema" llvm:SelectionDAG SelectionDAGISel as well labels Oct 23, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 23, 2024

@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-risc-v

Author: Sam Elliott (lenary)

Changes

This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the Pr constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which need their
own explanation.

Target-Independent Changes:

  • This adds two new Machine Value Types (MVTs), which represent pairs for
    each xlen. Two are needed because MVTs usually have a fixed length. This
    change unfortunately increases the size of SelectionDAG tables indexed
    by MVT by a small percentage.

  • When an inline assembly block returns multiple values, it returns them
    in a struct, rather than as a single value. This fixes TargetLowering
    so that getAsmOperandValueType is called on the types in that
    struct, so that targets have the opportunity to propose their own MVT
    for an inline assembly operand where this wouldn't match conventional
    arguments/return values. This matches what happens when a single value
    is returned.

RISC-V Changes:

  • Renames the Register Class used for f64 values on rv32i_zdinx from
    GPRPair* to GPRF64Pair*. These register classes are kept broadly
    unmodified, as their primary value type is used for type inference
    over selection patterns. This rename affects quite a lot of files. I
    reordered the definitions in RISCVRegisterInfo.td and added headings
    to make it easier to browse.

  • Adds new GPRPair* register classes which will be used for Pr
    constraints and for instructions that need an even-odd GPR pair. This
    new type is used for amocas.d.*(rv32) and amocas.q.*(rv64) in
    Zacas, instead of the GPRF64Pair class being used before.

  • Marks the new GPRPair class legal as for holding a
    MVT::riscv_i&lt;xlen&gt;_pair. Two new RISCVISD node types are added for
    creating and destructing a pair - BuildGPRPair and SplitGPRPair,
    and are introduced when bitcasting to/from the pair type and the
    i&lt;2*xlen&gt; type.

  • This adds an override for getNumRegisters to ensure that i&lt;2*xlen&gt;
    values, when going to/from inline assembly, only allocate one (pair)
    register (they would otherwise allocate two).

  • Ensures that the DAGCombiner doesn't merge the bitcast between
    i&lt;2*xlen&gt; types and the pair type into a load/store, as we want to
    legalise these 2*xlen-wide loads/stores as before - by splitting them
    into two xlen-wide loads/stores, which will happen with i&lt;2*xlen&gt;
    types.

  • Ensures that Clang understands that Pr is a valid inline assembly
    constraint.


Patch is 231.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/112983.diff

17 Files Affected:

  • (modified) clang/lib/Basic/Targets/RISCV.cpp (+10-1)
  • (modified) clang/test/CodeGen/RISCV/riscv-inline-asm.c (+13)
  • (modified) llvm/include/llvm/CodeGen/ValueTypes.td (+14-11)
  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+2-1)
  • (modified) llvm/lib/CodeGen/ValueTypes.cpp (+6)
  • (modified) llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp (+14-8)
  • (modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (+30-1)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+69-6)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.h (+24)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoD.td (+6-6)
  • (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.td (+117-78)
  • (modified) llvm/lib/Target/RISCV/RISCVSubtarget.h (+4)
  • (added) llvm/test/CodeGen/RISCV/branch-relaxation-rv32.ll (+1010)
  • (added) llvm/test/CodeGen/RISCV/branch-relaxation-rv64.ll (+1013)
  • (removed) llvm/test/CodeGen/RISCV/branch-relaxation.ll (-3226)
  • (added) llvm/test/CodeGen/RISCV/rv32-inline-asm-pairs.ll (+73)
  • (added) llvm/test/CodeGen/RISCV/rv64-inline-asm-pairs.ll (+73)
diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp
index eaaba7642bd7b2..07bf002ed73928 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -108,6 +108,14 @@ bool RISCVTargetInfo::validateAsmConstraint(
       return true;
     }
     return false;
+  case 'P':
+    // An even-odd register pair - GPR
+    if (Name[1] == 'r') {
+      Info.setAllowsRegister();
+      Name += 1;
+      return true;
+    }
+    return false;
   case 'v':
     // A vector register.
     if (Name[1] == 'r' || Name[1] == 'd' || Name[1] == 'm') {
@@ -122,8 +130,9 @@ bool RISCVTargetInfo::validateAsmConstraint(
 std::string RISCVTargetInfo::convertConstraint(const char *&Constraint) const {
   std::string R;
   switch (*Constraint) {
-  // c* and v* are two-letter constraints on RISC-V.
+  // c*, P*, and v* are all two-letter constraints on RISC-V.
   case 'c':
+  case 'P':
   case 'v':
     R = std::string("^") + std::string(Constraint, 2);
     Constraint += 1;
diff --git a/clang/test/CodeGen/RISCV/riscv-inline-asm.c b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
index 75b91d3c497c50..eb6e42f3eb9529 100644
--- a/clang/test/CodeGen/RISCV/riscv-inline-asm.c
+++ b/clang/test/CodeGen/RISCV/riscv-inline-asm.c
@@ -33,6 +33,19 @@ void test_cf(float f, double d) {
   asm volatile("" : "=cf"(cd) : "cf"(d));
 }
 
+#if __riscv_xlen == 32
+typedef long long double_xlen_t;
+#elif __riscv_xlen == 64
+typedef __int128_t double_xlen_t;
+#endif
+double_xlen_t test_Pr_wide_scalar(double_xlen_t p) {
+// CHECK-LABEL: define{{.*}} {{i128|i64}} @test_Pr_wide_scalar(
+// CHECK: call {{i128|i64}} asm sideeffect "", "=^Pr,^Pr"({{i128|i64}} %{{.*}})
+  double_xlen_t ret;
+  asm volatile("" : "=Pr"(ret) : "Pr"(p));
+  return ret;
+}
+
 void test_I(void) {
 // CHECK-LABEL: define{{.*}} void @test_I()
 // CHECK: call void asm sideeffect "", "I"(i32 2047)
diff --git a/llvm/include/llvm/CodeGen/ValueTypes.td b/llvm/include/llvm/CodeGen/ValueTypes.td
index 493c0cfcab60ce..9c910c0085fce9 100644
--- a/llvm/include/llvm/CodeGen/ValueTypes.td
+++ b/llvm/include/llvm/CodeGen/ValueTypes.td
@@ -317,20 +317,23 @@ def riscv_nxv16i8x3  : VTVecTup<384, 3, i8, 220>; // RISCV vector tuple(min_num_
 def riscv_nxv16i8x4  : VTVecTup<512, 4, i8, 221>; // RISCV vector tuple(min_num_elts=16, nf=4)
 def riscv_nxv32i8x2  : VTVecTup<512, 2, i8, 222>; // RISCV vector tuple(min_num_elts=32, nf=2)
 
-def x86mmx    : ValueType<64,   223>;  // X86 MMX value
-def Glue      : ValueType<0,    224>;  // Pre-RA sched glue
-def isVoid    : ValueType<0,    225>;  // Produces no value
-def untyped   : ValueType<8,    226> { // Produces an untyped value
+def riscv_i32_pair : ValueType<64, 223>; // RISCV pair of RV32 GPRs
+def riscv_i64_pair : ValueType<128, 224>; // RISCV pair of RV64 GPRs
+
+def x86mmx    : ValueType<64,   225>;  // X86 MMX value
+def Glue      : ValueType<0,    226>;  // Pre-RA sched glue
+def isVoid    : ValueType<0,    227>;  // Produces no value
+def untyped   : ValueType<8,    228> { // Produces an untyped value
   let LLVMName = "Untyped";
 }
-def funcref   : ValueType<0,    227>;  // WebAssembly's funcref type
-def externref : ValueType<0,    228>;  // WebAssembly's externref type
-def exnref    : ValueType<0,    229>;  // WebAssembly's exnref type
-def x86amx    : ValueType<8192, 230>;  // X86 AMX value
-def i64x8     : ValueType<512,  231>;  // 8 Consecutive GPRs (AArch64)
+def funcref   : ValueType<0,    229>;  // WebAssembly's funcref type
+def externref : ValueType<0,    230>;  // WebAssembly's externref type
+def exnref    : ValueType<0,    231>;  // WebAssembly's exnref type
+def x86amx    : ValueType<8192, 232>;  // X86 AMX value
+def i64x8     : ValueType<512,  233>;  // 8 Consecutive GPRs (AArch64)
 def aarch64svcount
-              : ValueType<16,  232>;  // AArch64 predicate-as-counter
-def spirvbuiltin : ValueType<0, 233>; // SPIR-V's builtin type
+              : ValueType<16,  234>;  // AArch64 predicate-as-counter
+def spirvbuiltin : ValueType<0, 235>; // SPIR-V's builtin type
 
 let isNormalValueType = false in {
 def token      : ValueType<0, 504>;  // TokenTy
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 758b3a5fc526e7..053d8ba098d9e5 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -5730,7 +5730,8 @@ TargetLowering::ParseConstraints(const DataLayout &DL,
       assert(!Call.getType()->isVoidTy() && "Bad inline asm!");
       if (auto *STy = dyn_cast<StructType>(Call.getType())) {
         OpInfo.ConstraintVT =
-            getSimpleValueType(DL, STy->getElementType(ResNo));
+            getAsmOperandValueType(DL, STy->getElementType(ResNo))
+                .getSimpleVT();
       } else {
         assert(ResNo == 0 && "Asm only has one result!");
         OpInfo.ConstraintVT =
diff --git a/llvm/lib/CodeGen/ValueTypes.cpp b/llvm/lib/CodeGen/ValueTypes.cpp
index e3c746b274dde1..7ce7102fe98a5f 100644
--- a/llvm/lib/CodeGen/ValueTypes.cpp
+++ b/llvm/lib/CodeGen/ValueTypes.cpp
@@ -177,6 +177,10 @@ std::string EVT::getEVTString() const {
     if (isFloatingPoint())
       return "f" + utostr(getSizeInBits());
     llvm_unreachable("Invalid EVT!");
+  case MVT::riscv_i32_pair:
+    return "riscv_i32_pair";
+  case MVT::riscv_i64_pair:
+    return "riscv_i64_pair";
   case MVT::bf16:      return "bf16";
   case MVT::ppcf128:   return "ppcf128";
   case MVT::isVoid:    return "isVoid";
@@ -214,6 +218,8 @@ Type *EVT::getTypeForEVT(LLVMContext &Context) const {
     assert(isExtended() && "Type is not extended!");
     return LLVMTy;
   case MVT::isVoid:  return Type::getVoidTy(Context);
+  case MVT::riscv_i32_pair: return IntegerType::get(Context, 64);
+  case MVT::riscv_i64_pair: return IntegerType::get(Context, 128);
   case MVT::x86mmx:  return llvm::FixedVectorType::get(llvm::IntegerType::get(Context, 64), 1);
   case MVT::aarch64svcount:
     return TargetExtType::get(Context, "aarch64.svcount");
diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 4d46afb8c4ef97..1b23b36a59e0ec 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -481,6 +481,12 @@ struct RISCVOperand final : public MCParsedAsmOperand {
            RISCVMCRegisterClasses[RISCV::GPRRegClassID].contains(Reg.RegNum);
   }
 
+  bool isGPRPair() const {
+    return Kind == KindTy::Register &&
+           RISCVMCRegisterClasses[RISCV::GPRPairRegClassID].contains(
+               Reg.RegNum);
+  }
+
   bool isGPRF16() const {
     return Kind == KindTy::Register &&
            RISCVMCRegisterClasses[RISCV::GPRF16RegClassID].contains(Reg.RegNum);
@@ -491,17 +497,17 @@ struct RISCVOperand final : public MCParsedAsmOperand {
            RISCVMCRegisterClasses[RISCV::GPRF32RegClassID].contains(Reg.RegNum);
   }
 
-  bool isGPRAsFPR() const { return isGPR() && Reg.IsGPRAsFPR; }
-  bool isGPRAsFPR16() const { return isGPRF16() && Reg.IsGPRAsFPR; }
-  bool isGPRAsFPR32() const { return isGPRF32() && Reg.IsGPRAsFPR; }
-  bool isGPRPairAsFPR() const { return isGPRPair() && Reg.IsGPRAsFPR; }
-
-  bool isGPRPair() const {
+  bool isGPRF64Pair() const {
     return Kind == KindTy::Register &&
-           RISCVMCRegisterClasses[RISCV::GPRPairRegClassID].contains(
+           RISCVMCRegisterClasses[RISCV::GPRF64PairRegClassID].contains(
                Reg.RegNum);
   }
 
+  bool isGPRAsFPR() const { return isGPR() && Reg.IsGPRAsFPR; }
+  bool isGPRAsFPR16() const { return isGPRF16() && Reg.IsGPRAsFPR; }
+  bool isGPRAsFPR32() const { return isGPRF32() && Reg.IsGPRAsFPR; }
+  bool isGPRPairAsFPR64() const { return isGPRF64Pair() && Reg.IsGPRAsFPR; }
+
   static bool evaluateConstantImm(const MCExpr *Expr, int64_t &Imm,
                                   RISCVMCExpr::VariantKind &VK) {
     if (auto *RE = dyn_cast<RISCVMCExpr>(Expr)) {
@@ -2399,7 +2405,7 @@ ParseStatus RISCVAsmParser::parseGPRPairAsFPR64(OperandVector &Operands) {
   const MCRegisterInfo *RI = getContext().getRegisterInfo();
   MCRegister Pair = RI->getMatchingSuperReg(
       Reg, RISCV::sub_gpr_even,
-      &RISCVMCRegisterClasses[RISCV::GPRPairRegClassID]);
+      &RISCVMCRegisterClasses[RISCV::GPRF64PairRegClassID]);
   Operands.push_back(RISCVOperand::createReg(Pair, S, E, /*isGPRAsFPR=*/true));
   return ParseStatus::Success;
 }
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index dc3f8254cb4e00..1abb693eb47665 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -953,6 +953,35 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
     ReplaceNode(Node, Res);
     return;
   }
+  case RISCVISD::BuildGPRPair: {
+    SDValue Ops[] = {
+        CurDAG->getTargetConstant(RISCV::GPRPairRegClassID, DL, MVT::i32),
+        Node->getOperand(0),
+        CurDAG->getTargetConstant(RISCV::sub_gpr_even, DL, MVT::i32),
+        Node->getOperand(1),
+        CurDAG->getTargetConstant(RISCV::sub_gpr_odd, DL, MVT::i32)};
+
+    SDNode *N = CurDAG->getMachineNode(TargetOpcode::REG_SEQUENCE, DL,
+                                       Subtarget->getXLenPairVT(), Ops);
+    ReplaceNode(Node, N);
+    return;
+  }
+  case RISCVISD::SplitGPRPair: {
+    if (!SDValue(Node, 0).use_empty()) {
+      SDValue Lo = CurDAG->getTargetExtractSubreg(RISCV::sub_gpr_even, DL, VT,
+                                                  Node->getOperand(0));
+      ReplaceUses(SDValue(Node, 0), Lo);
+    }
+
+    if (!SDValue(Node, 1).use_empty()) {
+      SDValue Hi = CurDAG->getTargetExtractSubreg(RISCV::sub_gpr_odd, DL, VT,
+                                                  Node->getOperand(0));
+      ReplaceUses(SDValue(Node, 1), Hi);
+    }
+
+    CurDAG->RemoveDeadNode(Node);
+    return;
+  }
   case RISCVISD::BuildPairF64: {
     if (!Subtarget->hasStdExtZdinx())
       break;
@@ -960,7 +989,7 @@ void RISCVDAGToDAGISel::Select(SDNode *Node) {
     assert(!Subtarget->is64Bit() && "Unexpected subtarget");
 
     SDValue Ops[] = {
-        CurDAG->getTargetConstant(RISCV::GPRPairRegClassID, DL, MVT::i32),
+        CurDAG->getTargetConstant(RISCV::GPRF64PairRegClassID, DL, MVT::i32),
         Node->getOperand(0),
         CurDAG->getTargetConstant(RISCV::sub_gpr_even, DL, MVT::i32),
         Node->getOperand(1),
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 69112d868bff82..a439cccb38f345 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -114,9 +114,11 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
   }
 
   MVT XLenVT = Subtarget.getXLenVT();
+  MVT XLenPairVT = Subtarget.getXLenPairVT();
 
   // Set up the register classes.
   addRegisterClass(XLenVT, &RISCV::GPRRegClass);
+  addRegisterClass(XLenPairVT, &RISCV::GPRPairRegClass);
 
   if (Subtarget.hasStdExtZfhmin())
     addRegisterClass(MVT::f16, &RISCV::FPR16RegClass);
@@ -134,7 +136,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
     if (Subtarget.is64Bit())
       addRegisterClass(MVT::f64, &RISCV::GPRRegClass);
     else
-      addRegisterClass(MVT::f64, &RISCV::GPRPairRegClass);
+      addRegisterClass(MVT::f64, &RISCV::GPRF64PairRegClass);
   }
 
   static const MVT::SimpleValueType BoolVecVTs[] = {
@@ -296,6 +298,11 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
     setCondCodeAction(ISD::SETLE, XLenVT, Expand);
   }
 
+  if (Subtarget.is64Bit())
+    setOperationAction(ISD::BITCAST, MVT::i128, Custom);
+  else
+    setOperationAction(ISD::BITCAST, MVT::i64, Custom);
+
   setOperationAction({ISD::STACKSAVE, ISD::STACKRESTORE}, MVT::Other, Expand);
 
   setOperationAction(ISD::VASTART, MVT::Other, Custom);
@@ -2224,6 +2231,17 @@ bool RISCVTargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
   return Index == 0 || Index == ResElts;
 }
 
+EVT RISCVTargetLowering::getAsmOperandValueType(const DataLayout &DL, Type *Ty,
+                                                bool AllowUnknown) const {
+  if (!Subtarget.is64Bit() && Ty->isIntegerTy(64))
+    return MVT::riscv_i32_pair;
+
+  if (Subtarget.is64Bit() && Ty->isIntegerTy(128))
+    return MVT::riscv_i64_pair;
+
+  return TargetLowering::getAsmOperandValueType(DL, Ty, AllowUnknown);
+}
+
 MVT RISCVTargetLowering::getRegisterTypeForCallingConv(LLVMContext &Context,
                                                       CallingConv::ID CC,
                                                       EVT VT) const {
@@ -2238,6 +2256,17 @@ MVT RISCVTargetLowering::getRegisterTypeForCallingConv(LLVMContext &Context,
   return PartVT;
 }
 
+unsigned
+RISCVTargetLowering::getNumRegisters(LLVMContext &Context, EVT VT,
+                                     std::optional<MVT> RegisterVT) const {
+  // Pair inline assembly operand
+  if (VT == (Subtarget.is64Bit() ? MVT::i128 : MVT::i64) && RegisterVT &&
+      *RegisterVT == Subtarget.getXLenPairVT())
+    return 1;
+
+  return TargetLowering::getNumRegisters(Context, VT, RegisterVT);
+}
+
 unsigned RISCVTargetLowering::getNumRegistersForCallingConv(LLVMContext &Context,
                                                            CallingConv::ID CC,
                                                            EVT VT) const {
@@ -2776,6 +2805,19 @@ RISCVTargetLowering::computeVLMAXBounds(MVT VecVT,
   return std::make_pair(MinVLMAX, MaxVLMAX);
 }
 
+bool RISCVTargetLowering::isLoadBitCastBeneficial(
+    EVT LoadVT, EVT BitcastVT, const SelectionDAG &DAG,
+    const MachineMemOperand &MMO) const {
+  // We want to leave `bitcasts` to/from MVT::riscv_i<xlen>_pair separate from
+  // loads/stores so they can be turned into BuildGPRPair/::SplitGPRPair nodes.
+  if (LoadVT == (Subtarget.is64Bit() ? MVT::i128 : MVT::i64) &&
+      BitcastVT == Subtarget.getXLenPairVT())
+    return false;
+
+  return TargetLoweringBase::isLoadBitCastBeneficial(LoadVT, BitcastVT, DAG,
+                                                     MMO);
+}
+
 // The state of RVV BUILD_VECTOR and VECTOR_SHUFFLE lowering is that very few
 // of either is (currently) supported. This can get us into an infinite loop
 // where we try to lower a BUILD_VECTOR as a VECTOR_SHUFFLE as a BUILD_VECTOR
@@ -6413,6 +6455,13 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
       std::tie(Lo, Hi) = DAG.SplitScalar(Op0, DL, MVT::i32, MVT::i32);
       return DAG.getNode(RISCVISD::BuildPairF64, DL, MVT::f64, Lo, Hi);
     }
+    if (VT == Subtarget.getXLenPairVT() && Op0VT.isScalarInteger() &&
+        Op0VT.getSizeInBits() == 2 * Subtarget.getXLen()) {
+      SDValue Lo, Hi;
+      std::tie(Lo, Hi) = DAG.SplitScalar(Op0, DL, XLenVT, XLenVT);
+      return DAG.getNode(RISCVISD::BuildGPRPair, DL, Subtarget.getXLenPairVT(),
+                         Lo, Hi);
+    }
 
     // Consider other scalar<->scalar casts as legal if the types are legal.
     // Otherwise expand them.
@@ -12886,6 +12935,14 @@ void RISCVTargetLowering::ReplaceNodeResults(SDNode *N,
       SDValue RetReg = DAG.getNode(ISD::BUILD_PAIR, DL, MVT::i64,
                                    NewReg.getValue(0), NewReg.getValue(1));
       Results.push_back(RetReg);
+    } else if (VT.isInteger() &&
+               VT.getSizeInBits() == 2 * Subtarget.getXLen() &&
+               Op0VT == Subtarget.getXLenPairVT()) {
+      SDValue NewReg = DAG.getNode(RISCVISD::SplitGPRPair, DL,
+                                   DAG.getVTList(XLenVT, XLenVT), Op0);
+      SDValue RetReg = DAG.getNode(ISD::BUILD_PAIR, DL, VT, NewReg.getValue(0),
+                                   NewReg.getValue(1));
+      Results.push_back(RetReg);
     } else if (!VT.isVector() && Op0VT.isFixedLengthVector() &&
                isTypeLegal(Op0VT)) {
       // Custom-legalize bitcasts from fixed-length vector types to illegal
@@ -20130,6 +20187,8 @@ const char *RISCVTargetLowering::getTargetNodeName(unsigned Opcode) const {
   NODE_NAME_CASE(TAIL)
   NODE_NAME_CASE(SELECT_CC)
   NODE_NAME_CASE(BR_CC)
+  NODE_NAME_CASE(BuildGPRPair)
+  NODE_NAME_CASE(SplitGPRPair)
   NODE_NAME_CASE(BuildPairF64)
   NODE_NAME_CASE(SplitF64)
   NODE_NAME_CASE(ADD_LO)
@@ -20408,6 +20467,8 @@ RISCVTargetLowering::getConstraintType(StringRef Constraint) const {
       return C_RegisterClass;
     if (Constraint == "cr" || Constraint == "cf")
       return C_RegisterClass;
+    if (Constraint == "Pr")
+      return C_RegisterClass;
   }
   return TargetLowering::getConstraintType(Constraint);
 }
@@ -20429,7 +20490,7 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
       if (VT == MVT::f32 && Subtarget.hasStdExtZfinx())
         return std::make_pair(0U, &RISCV::GPRF32NoX0RegClass);
       if (VT == MVT::f64 && Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
-        return std::make_pair(0U, &RISCV::GPRPairNoX0RegClass);
+        return std::make_pair(0U, &RISCV::GPRF64PairNoX0RegClass);
       return std::make_pair(0U, &RISCV::GPRNoX0RegClass);
     case 'f':
       if (VT == MVT::f16) {
@@ -20446,7 +20507,7 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
         if (Subtarget.hasStdExtD())
           return std::make_pair(0U, &RISCV::FPR64RegClass);
         if (Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
-          return std::make_pair(0U, &RISCV::GPRPairNoX0RegClass);
+          return std::make_pair(0U, &RISCV::GPRF64PairNoX0RegClass);
         if (Subtarget.hasStdExtZdinx() && Subtarget.is64Bit())
           return std::make_pair(0U, &RISCV::GPRNoX0RegClass);
       }
@@ -20488,7 +20549,7 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
     if (VT == MVT::f32 && Subtarget.hasStdExtZfinx())
       return std::make_pair(0U, &RISCV::GPRF32CRegClass);
     if (VT == MVT::f64 && Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
-      return std::make_pair(0U, &RISCV::GPRPairCRegClass);
+      return std::make_pair(0U, &RISCV::GPRF64PairCRegClass);
     if (!VT.isVector())
       return std::make_pair(0U, &RISCV::GPRCRegClass);
   } else if (Constraint == "cf") {
@@ -20506,10 +20567,12 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
       if (Subtarget.hasStdExtD())
         return std::make_pair(0U, &RISCV::FPR64CRegClass);
       if (Subtarget.hasStdExtZdinx() && !Subtarget.is64Bit())
-        return std::make_pair(0U, &RISCV::GPRPairCRegClass);
+        return std::make_pair(0U, &RISCV::GPRF64PairCRegClass);
       if (Subtarget.hasStdExtZdinx() && Subtarget.is64Bit())
         return std::make_pair(0U, &RISCV::GPRCRegClass);
     }
+  } else if (Constraint == "Pr") {
+    return std::make_pair(0U, &RISCV::GPRPairNoX0RegClass);
   }
 
   // Clang will correctly decode the usage of register name aliases into their
@@ -20670,7 +20733,7 @@ RISCVTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
   // Subtarget into account.
   if (Res.second == &RISCV::GPRF16RegClass ||
       Res.second == &RISCV::GPRF32RegClass ||
-      Res.second == &RISCV::GPRPairRegClass)
+      Res.second == &RISCV::GPRF64PairRegClass)
     return std::make_pair(Res.first, &RISCV::GPRRegClass);
 
   return Res;
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index 0b07ad7d7a423f..deaefafc73535e 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -44,6 +44,18 @@ enum NodeType : unsigned {
   SELECT_CC,
   BR_CC,
 
+  /// Turn a pair of `i<xlen>`s into a `riscv_i<xlen>_pair`.
+  /// - Output: `riscv_i<xlen>_pair`
+  /// - Input 0: `i<xlen>` low-order bits, for even register.
+  /// - Input 1: `i<xlen>` high-order bits, for odd register.
+  BuildGPRPair,
+
+  /// Turn a `riscv_i<xlen>_pair` into a pair of `i<xlen>`s.
+  /// - Output 0: `i<xlen>` low-order bits, from even register.
+  /// - Output 1: `i<xlen>` high-order bits, from odd register.
+  /// - Input: `riscv_i<xlen>_pair`
+  SplitGPRPair,
+
   /// Turns a pair of `i32`s into an `f64`. Needed for rv32d/ilp32.
   /// - Output: `f64`.
   /// - Input 0: low-order bits (31-0) (as `i32`), for even register.
@@ -544,11 +556,19 @@ class RISCVTargetLowering : public TargetLowering {
 
 ...
[truncated]

@lenary
Copy link
Member Author

lenary commented Oct 25, 2024

Today's changes address all the CI failures, by adding better checking before turning BITCAST into BuildPairF64 or SplitF64 - the conditions are derived from the conditions on setOperationAction(ISD::BITCAST, MVT::i64, Custom) before this patch added more of the same.

@lenary
Copy link
Member Author

lenary commented Oct 29, 2024

ping?

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I just had a detailed look. Given that you have explained almost all the code detailedly, I think this PR looks great to me!
Just some overall comments:

  1. I personally like your proposal of adding new constraints, but we still need the agreement between community members.
  2. I saw all the comments above and I know reason why we choose to add new MVT types. My question is, maybe we can make it less target-specific? I don't think this is a RISC-V only problem.
  3. We should be able to use Pr for Zacas now? So maybe we should add some tests for it.

Copy link
Member Author

@lenary lenary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I just had a detailed look. Given that you have explained almost all the code detailedly, I think this PR looks great to me! Just some overall comments:

  1. I personally like your proposal of adding new constraints, but we still need the agreement between community members.

I think some of the agreement needs an implementation, maybe? I'll ping the c-api issue.

  1. I saw all the comments above and I know reason why we choose to add new MVT types. My question is, maybe we can make it less target-specific? I don't think this is a RISC-V only problem.

I noticed, after-the-fact, that SystemZ uses MVT::Untyped (corrected) for pairs, I can prepare a fixup commit that does this (and removes the riscv_*_pair MVTs), if we think that's a better target-independent approach?

  1. We should be able to use Pr for Zacas now? So maybe we should add some tests for it.
  • The test functions which have an in-out Pr parameter (test_Pr_wide_scalar_inout) are derived from the issue about amocas.q in inline assembly on RV64. The C from that issue found some bugs in the implementation, which is why those tests exist. These don't reference any specific instructions just to make them have less complex dependencies - which is why the text of the inline assembly is just a comment to verify we're producing sensible output.
  • The Zacas pair instructions continue to use GPRPair in their definitions, rather than the new GPRF64Pair register classes, which seemed the right way forwards. There are no patterns for the pair instructions at the moment, but I'm not sure I want to change the codegen of atomics, that's a whole other nightmare. Do you have specific tests in mind?

@@ -2238,6 +2256,17 @@ MVT RISCVTargetLowering::getRegisterTypeForCallingConv(LLVMContext &Context,
return PartVT;
}

unsigned
RISCVTargetLowering::getNumRegisters(LLVMContext &Context, EVT VT,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we need this change but AArch64 didn't?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to prevent an assert that is hit in RegsForValue::AddInlineAsmOperands without this code.

I think AArch64 is not hitting this because they didn't test very much - the amocas.q example has a lot of hard cases (multiple returns, matching input/output operands) which hit a lot of the hard cases in SelectionDAGBuilder. Note I had to fix one place where AArch64 was definitely wrong - the changes to call getAsmOperandValueType when there are multiple outputs from asm - AArch64 should have hit this case and didn't.

I've spent a few more hours this evening trying to trace down this assert, and I feel closer, but not quite there. The testcase to use to get where I've got to (with the assert) is test_Pr_wide_scalar_inout.

SelectionDAGBuilder.cpp's getRegistersForValue seems to be doing the right thing, always calling this with the VT and RegisterVT with the same values, as far as I can tell. It seems to create the OpInfo.AssignedRegs in such a way that the later call to getNumRegisters in RegsForValue::AddInlineAsmOperands will get the same value as it did when it was created.

Where this seems to go wrong is the matching inputs code in SelectionDAGBuilder.cpp -

MachineFunction &MF = DAG.getMachineFunction();
MachineRegisterInfo &MRI = MF.getRegInfo();
const TargetRegisterInfo &TRI = *MF.getSubtarget().getRegisterInfo();
auto *R = cast<RegisterSDNode>(AsmNodeOperands[CurOp+1]);
Register TiedReg = R->getReg();
MVT RegVT = R->getSimpleValueType(0);
const TargetRegisterClass *RC =
TiedReg.isVirtual() ? MRI.getRegClass(TiedReg)
: RegVT != MVT::Untyped ? TLI.getRegClassFor(RegVT)
: TRI.getMinimalPhysRegClass(TiedReg);
for (unsigned i = 0, e = Flag.getNumOperandRegisters(); i != e; ++i)
Regs.push_back(MRI.createVirtualRegister(RC));
RegsForValue MatchedRegs(Regs, RegVT, InOperandVal.getValueType());
SDLoc dl = getCurSDLoc();
// Use the produced MatchedRegs object to
MatchedRegs.getCopyToRegs(InOperandVal, DAG, dl, Chain, &Glue, &Call);
MatchedRegs.AddInlineAsmOperands(InlineAsm::Kind::RegUse, true,
OpInfo.getMatchedOperand(), dl, DAG,
AsmNodeOperands);

This seems to have its own logic about creating the RegsForValue MatchedRegs(...) based on the surrounding DAG Context, and seems to avoid the info available in OpInfo which should have some relevancy, instead this code is directly pulling out similar (but not quite identical) information from the DAG. When getNumRegisters is eventually called in MatchedRegs::AddInlineAsmOperands, it uses MVT::i128 rather than MVT::riscv_i64_pair, which means it thinks it needs two registers, not just one. I think this MVT::i128 is coming from InOperandVal.getValueType() when MatchedRegs is created.

When this code is compared to getRegistersForValue, the two really don't seem to line up, logically - I would expect them both to be doing similar things, but they're not - how the RegsForValue object is created in getRegistersForValue is reasonably different to this logic.

git-blame-ing this code, it seems to have been introduced for supporting i128 on SystemZ - presumably also for paired operands - and they do have a getNumRegisters override, which returns 1 - https://reviews.llvm.org/D100788 (refactors have hit a good number of lines around the SelectionDAGBuilder.cpp visitInlineAsm code I believe to be at fault, but nothing that's not NFC since that change).

I don't feel good about the matching inputs code in SelectionDAGBuilder - I don't understand it, and I don't think my addition to getNumRegisters should be necessary if SelectionDAGBuilder.cpp worked more closely to how getRegistersForValue works. The extra confusing thing here is that getRegistersForValue might have mutated OpInfo when there's a matched input - but returned std::nullopt, so maybe the logic for matched inputs needs to just match the tail end of getRegistersForValue. I'm not entirely sure.

You're a lot more familiar with SelectionDAG than I am, do you have advice for what I should be doing here? It does seem like more target-independent code needs to be fixed to get rid of this. Maybe that can come later?

@lenary
Copy link
Member Author

lenary commented Nov 5, 2024

Gentle Ping. I'm looking for answers to two questions:

  • [Should I] prepare a fixup commit that [uses MVT::Untyped] (and removes the riscv_*_pair MVTs), if we think that's a better target-independent approach?

  • Any advice on whether I should be digging deeply into changing SelectionDAGBuilder.cpp (affecting all targets) so we don't need the getNumRegisters override? The options here IMO are a) live with the override, b) change SelectionDAGBuilder now, c) change SelectionDAGBuilder in a follow-up and remove the override once it is not needed.

@topperc
Copy link
Collaborator

topperc commented Nov 5, 2024

Gentle Ping. I'm looking for answers to two questions:

  • [Should I] prepare a fixup commit that [uses MVT::Untyped] (and removes the riscv_*_pair MVTs), if we think that's a better target-independent approach?

I guess so. I didn't know about the SystemZ change before. I do find it a little weird having 2 register classes with the same spill size and registers. I believe the tablegen generated code considers them subclasses of each other. X86 has some register classes like this too so I guess its not a big deal.

  • Any advice on whether I should be digging deeply into changing SelectionDAGBuilder.cpp (affecting all targets) so we don't need the getNumRegisters override? The options here IMO are a) live with the override, b) change SelectionDAGBuilder now, c) change SelectionDAGBuilder in a follow-up and remove the override once it is not needed.

I think we can live with the override since SystemZ is doing it too.

@lenary
Copy link
Member Author

lenary commented Nov 12, 2024

I'm working on this again and tripping over the commit that changes the branch relaxation tests, so I'm planning to just push that to main (it's NFC) and rebase. Sorry for the PR noise this will generate. I'll keep the "attempt to use Untyped" commit separate when I re-push to make reviews easier.

@lenary
Copy link
Member Author

lenary commented Nov 12, 2024

Gentle Ping. I'm looking for answers to two questions:

  • [Should I] prepare a fixup commit that [uses MVT::Untyped] (and removes the riscv_*_pair MVTs), if we think that's a better target-independent approach?

I guess so. I didn't know about the SystemZ change before.

I'll try it, and push a separate commit, so we can see what else is needed.

I do find it a little weird having 2 register classes with the same spill size and registers. I believe the tablegen generated code considers them subclasses of each other. X86 has some register classes like this too so I guess its not a big deal.

I'm not very happy about the duplication either, but it's because of a limitation where fairly early on, selectiondag uses the first legal type in a class, rather than the one you might want.

  • Any advice on whether I should be digging deeply into changing SelectionDAGBuilder.cpp (affecting all targets) so we don't need the getNumRegisters override? The options here IMO are a) live with the override, b) change SelectionDAGBuilder now, c) change SelectionDAGBuilder in a follow-up and remove the override once it is not needed.

I think we can live with the override since SystemZ is doing it too.

@lenary lenary force-pushed the pr/riscv-inline-asm-pairs branch from 841e38a to 126e4ee Compare November 12, 2024 19:41
@lenary
Copy link
Member Author

lenary commented Nov 12, 2024

The new approach using MVT::Untyped is uploaded. Inspect commit 126e4ee to see what changed to go from the riscv_i<xlen>_pair approach to the one using untyped.

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cheers! It seems using untyped saves some lines as well!
LGTM with nits.

ReplaceNode(Node, N);
return;
}
case RISCVISD::SplitGPRPair: {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we share this code with SplitF64?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Happy to in a follow up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up doing this as I have to do a big rebase to sort out the comments/reorg of RISCVRegisterinfo.td

Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM we can do the code sharing as a follow up if we want

@lenary
Copy link
Member Author

lenary commented Nov 13, 2024

I will rewrite the description of the PR momentarily, once I've rebased over the cleanup/comments to RISCVRegisterinfo.td

This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the `Pr` constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which need their
own explanation.

- Renames the Register Class used for f64 values on rv32i_zdinx from
  `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly
  unmodified, as their primary value type is used for type inference
  over selection patterns. This rename affects quite a lot of files.

- Adds new `GPRPair*` register classes which will be used for `Pr`
  constraints and for instructions that need an even-odd GPR pair. This
  new type is used for `amocas.d.*`(rv32) and `amocas.q.*`(rv64) in
  Zacas, instead of the `GPRF64Pair` class being used before.

- Marks the new `GPRPair` class legal as for holding a `MVT::Untyped`.
  Two new RISCVISD node types are added for creating and destructing a
  pair - `BuildGPRPair` and `SplitGPRPair`, and are introduced when
  bitcasting to/from the pair type and `untyped`.

- Adds functionality to `splitValueIntoRegisterParts` and
  `joinRegisterPartsIntoValue` to handle changing `i<2*xlen>` MVTs into
  `untyped` pairs.

- Adds an override for `getNumRegisters` to ensure that `i<2*xlen>`
  values, when going to/from inline assembly, only allocate one (pair)
  register (they would otherwise allocate two). This is due to a bug in
  SelectionDAGBuilder.cpp which other backends also work around.

- Ensures that Clang understands that `Pr` is a valid inline assembly
  constraint.

- Adds Conditions to the GPRF64Pair-related changes to `LowerOperation`
  and `ReplaceNodeResults` which match when BITCAST for the relevant
  types should be handled in a custom manner.

- This also allows `Pr` to be used for `f64` types on `rv32_zdinx`
  architectures, where doubles are stored in a GPR pair.
@lenary lenary force-pushed the pr/riscv-inline-asm-pairs branch from 8d538f8 to 0b98a56 Compare November 13, 2024 16:37
@lenary
Copy link
Member Author

lenary commented Nov 13, 2024

The commit I just pushed up should be the final version, and not fundamentally different from what was reviewed (but with various NFC patches split out).

I think I do have a route towards merging GPRPair and GPRF64Pair, but I'll do that in a follow-up, as there's a mystery crash somewhere that I haven't tracked down yet.

@lenary
Copy link
Member Author

lenary commented Nov 13, 2024

I'm not sure this requires re-review, so I plan to land this tomorrow (UK time) if I don't hear anything else.

[RISCV] GPR Pairs for Inline Asm using `R`

This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the `R` constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which need their
own explanation.

- Renames the Register Class used for f64 values on rv32i_zdinx from
  `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly
  unmodified, as their primary value type is used for type inference
  over selection patterns. This rename affects quite a lot of files.

- Adds new `GPRPair*` register classes which will be used for `R`
  constraints and for instructions that need an even-odd GPR pair. This
  new type is used for `amocas.d.*`(rv32) and `amocas.q.*`(rv64) in
  Zacas, instead of the `GPRF64Pair` class being used before.

- Marks the new `GPRPair` class legal as for holding a `MVT::Untyped`.
  Two new RISCVISD node types are added for creating and destructing a
  pair - `BuildGPRPair` and `SplitGPRPair`, and are introduced when
  bitcasting to/from the pair type and `untyped`.

- Adds functionality to `splitValueIntoRegisterParts` and
  `joinRegisterPartsIntoValue` to handle changing `i<2*xlen>` MVTs into
  `untyped` pairs.

- Adds an override for `getNumRegisters` to ensure that `i<2*xlen>`
  values, when going to/from inline assembly, only allocate one (pair)
  register (they would otherwise allocate two). This is due to a bug in
  SelectionDAGBuilder.cpp which other backends also work around.

- Ensures that Clang understands that `R` is a valid inline assembly
  constraint.

- Adds Conditions to the GPRF64Pair-related changes to `LowerOperation`
  and `ReplaceNodeResults` which match when BITCAST for the relevant
  types should be handled in a custom manner.

- This also allows `R` to be used for `f64` types on `rv32_zdinx`
  architectures, where doubles are stored in a GPR pair.
@lenary
Copy link
Member Author

lenary commented Nov 14, 2024

I won't be merging this today. Instead, two changes/updates:

@lenary
Copy link
Member Author

lenary commented Nov 18, 2024

Kito and I have agreed R is a good way forwards for GPR pairs, and the c-api-doc PR is approved (but not landed). I'll reword the description/message to update our constraint choice when I land this, which I hope to get to later today.

@lenary lenary changed the title [RISCV] Inline Assembly Support for GPR Pairs ('Pr') [RISCV] Inline Assembly Support for GPR Pairs ('R') Nov 18, 2024
@lenary lenary merged commit 4615cc3 into llvm:main Nov 18, 2024
5 of 8 checks passed
@lenary lenary deleted the pr/riscv-inline-asm-pairs branch November 18, 2024 17:46
lenary added a commit to lenary/llvm-project that referenced this pull request Nov 18, 2024
This was a typo in llvm#112983 that didn't cause build
failures but is still wrong.
lenary added a commit that referenced this pull request Nov 19, 2024
This was a typo in #112983 that didn't cause build
failures but is still wrong.
lenary added a commit that referenced this pull request Nov 20, 2024
As suggested by Craig, this tries to merge the two sets of register
classes created in #112983, GPRPair* and GPRF64Pair*.

- I added some explicit annotations to `RISCVInstrInfoD.td` which fixed
the type inference issues I was seeing from tablegen for select
patterns.
- I've had to make the behaviour of `splitValueIntoRegisterParts` and
`joinRegisterPartsIntoValue` cover more cases, because you cannot
bitcast to/from untyped (the bitcast would otherwise have been inserted
automatically by TargetLowering code).
- I apparently didn't need to change `getNumRegisters` again, which
continues to tell me there's a bug in the code for tied inputs. I added
some more test coverage of this case but it didn't seem to help find the
asserts I was finding before - I think the difference is between the
default behaviour for integers which doesn't apply to floats.
- There's still a difference between BuildGPRPair and BuildPairF64 (and
the same for SplitGPRPair and SplitF64). I'm not happy with this, I
think it's quite confusing, as they're very similar, just differing in
whether they give a `untyped` or a `f64`. I haven't really worked out
how the DAGCombiner copes if one meets the other, I know we have some of
this for the f64 variants already, but they're a lot more complex than
the GPRPair variants anyway.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RISCV] Wrong register usage for amocas.q.aqrl inline assembly
4 participants