Skip to content

[AArch64] Don't use LowerToPredicatedOp in shufflevector -> SVE lowerings #140713

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 23, 2025

Conversation

MacDue
Copy link
Member

@MacDue MacDue commented May 20, 2025

The use of LowerToPredicatedOp here seems like a mistake as LowerToPredicatedOp turns the SDValue passed to it into the desired predicated node by copying over operands (and adding a predicate). This results in two odd things here, the BITCASTs created and passed to LowerToPredicatedOp are not used, only the operands of those bitcasts are taken. Secondly, when a shuffle vector node is passed directly to LowerToPredicatedOp to create a REVD_MERGE_PASSTHRU node an invalid REV node is created as REV only takes one vector operand, but both operands from the shuffle vector are copied to the new REV node. This is not an issue in practice as the extra operand is ignored.

These issues were found by the verification added in #140472.

Part of #140472.

Note: Test changes only result in the vxf64 lowering matching the vxi64 lowering.

…ings

The use of `LowerToPredicatedOp` here seems like a mistake as
`LowerToPredicatedOp` turns the SDValue passed to it into the desired
predicated node by copying over operands (and adding a predicate). This
results in two odd things here, the BITCASTs created and passed to
`LowerToPredicatedOp` are not used, only the operands of those bitcasts
are taken. Secondly, when a shuffle vector node is passed directly to
`LowerToPredicatedOp` to create a `REVD_MERGE_PASSTHRU` node
an invalid REV node is created as REV only takes one vector operand,
but both operands from the shuffle vector are copied to the new REV node.
This is not an issue in practice as the extra operand is ignored.

These issues were found by the verification added in llvm#140472.

Part of llvm#140472.

Note: Test changes only result in the vxi64 lowering matching the vxf64
lowering.
@llvmbot
Copy link
Member

llvmbot commented May 20, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Benjamin Maxwell (MacDue)

Changes

The use of LowerToPredicatedOp here seems like a mistake as LowerToPredicatedOp turns the SDValue passed to it into the desired predicated node by copying over operands (and adding a predicate). This results in two odd things here, the BITCASTs created and passed to LowerToPredicatedOp are not used, only the operands of those bitcasts are taken. Secondly, when a shuffle vector node is passed directly to LowerToPredicatedOp to create a REVD_MERGE_PASSTHRU node an invalid REV node is created as REV only takes one vector operand, but both operands from the shuffle vector are copied to the new REV node. This is not an issue in practice as the extra operand is ignored.

These issues were found by the verification added in #140472.

Part of #140472.

Note: Test changes only result in the vxi64 lowering matching the vxf64 lowering.


Full diff: https://github.com/llvm/llvm-project/pull/140713.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+14-15)
  • (modified) llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll (+2-1)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-rev.ll (+2-2)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 293292d47dd48..99122e2897285 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -29773,11 +29773,18 @@ SDValue AArch64TargetLowering::LowerFixedLengthVECTOR_SHUFFLEToSVE(
     return convertFromScalableVector(DAG, VT, Op);
   }
 
+  auto lowerToRevMergePassthru = [&](unsigned Opcode, SDValue Vec, EVT NewVT) {
+    auto Pg = getPredicateForVector(DAG, DL, NewVT);
+    SDValue RevOp = DAG.getNode(ISD::BITCAST, DL, NewVT, Vec);
+    auto Rev =
+        DAG.getNode(Opcode, DL, NewVT, Pg, RevOp, DAG.getUNDEF(ContainerVT));
+    auto Cast = DAG.getNode(ISD::BITCAST, DL, ContainerVT, Rev);
+    return convertFromScalableVector(DAG, VT, Cast);
+  };
+
   unsigned EltSize = VT.getScalarSizeInBits();
   for (unsigned LaneSize : {64U, 32U, 16U}) {
     if (isREVMask(ShuffleMask, EltSize, VT.getVectorNumElements(), LaneSize)) {
-      EVT NewVT =
-          getPackedSVEVectorVT(EVT::getIntegerVT(*DAG.getContext(), LaneSize));
       unsigned RevOp;
       if (EltSize == 8)
         RevOp = AArch64ISD::BSWAP_MERGE_PASSTHRU;
@@ -29785,24 +29792,16 @@ SDValue AArch64TargetLowering::LowerFixedLengthVECTOR_SHUFFLEToSVE(
         RevOp = AArch64ISD::REVH_MERGE_PASSTHRU;
       else
         RevOp = AArch64ISD::REVW_MERGE_PASSTHRU;
-
-      Op = DAG.getNode(ISD::BITCAST, DL, NewVT, Op1);
-      Op = LowerToPredicatedOp(Op, DAG, RevOp);
-      Op = DAG.getNode(ISD::BITCAST, DL, ContainerVT, Op);
-      return convertFromScalableVector(DAG, VT, Op);
+      EVT NewVT =
+          getPackedSVEVectorVT(EVT::getIntegerVT(*DAG.getContext(), LaneSize));
+      return lowerToRevMergePassthru(RevOp, Op1, NewVT);
     }
   }
 
   if (Subtarget->hasSVE2p1() && EltSize == 64 &&
       isREVMask(ShuffleMask, EltSize, VT.getVectorNumElements(), 128)) {
-    if (!VT.isFloatingPoint())
-      return LowerToPredicatedOp(Op, DAG, AArch64ISD::REVD_MERGE_PASSTHRU);
-
-    EVT NewVT = getPackedSVEVectorVT(EVT::getIntegerVT(*DAG.getContext(), 64));
-    Op = DAG.getNode(ISD::BITCAST, DL, NewVT, Op1);
-    Op = LowerToPredicatedOp(Op, DAG, AArch64ISD::REVD_MERGE_PASSTHRU);
-    Op = DAG.getNode(ISD::BITCAST, DL, ContainerVT, Op);
-    return convertFromScalableVector(DAG, VT, Op);
+    return lowerToRevMergePassthru(AArch64ISD::REVD_MERGE_PASSTHRU, Op1,
+                                   ContainerVT);
   }
 
   unsigned WhichResult;
diff --git a/llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll b/llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll
index 0cda4d94444e9..faf82d4945b3d 100644
--- a/llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll
+++ b/llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll
@@ -213,8 +213,9 @@ define void @test_revdv4i64_sve2p1(ptr %a) #2 {
 ; CHECK-LABEL: test_revdv4i64_sve2p1:
 ; CHECK:       // %bb.0:
 ; CHECK-NEXT:    ptrue p0.d, vl4
+; CHECK-NEXT:    ptrue p1.d
 ; CHECK-NEXT:    ld1d { z0.d }, p0/z, [x0]
-; CHECK-NEXT:    revd z0.q, p0/m, z0.q
+; CHECK-NEXT:    revd z0.q, p1/m, z0.q
 ; CHECK-NEXT:    st1d { z0.d }, p0, [x0]
 ; CHECK-NEXT:    ret
   %tmp1 = load <4 x i64>, ptr %a
diff --git a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-rev.ll b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-rev.ll
index c364abf2916e8..d8f83834a1bca 100644
--- a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-rev.ll
+++ b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-rev.ll
@@ -677,7 +677,7 @@ define void @test_revdv4i64_sve2p1(ptr %a) #1 {
 ; CHECK-LABEL: test_revdv4i64_sve2p1:
 ; CHECK:       // %bb.0:
 ; CHECK-NEXT:    ldp q0, q1, [x0]
-; CHECK-NEXT:    ptrue p0.d, vl2
+; CHECK-NEXT:    ptrue p0.d
 ; CHECK-NEXT:    revd z0.q, p0/m, z0.q
 ; CHECK-NEXT:    revd z1.q, p0/m, z1.q
 ; CHECK-NEXT:    stp q0, q1, [x0]
@@ -686,7 +686,7 @@ define void @test_revdv4i64_sve2p1(ptr %a) #1 {
 ; NONEON-NOSVE-LABEL: test_revdv4i64_sve2p1:
 ; NONEON-NOSVE:       // %bb.0:
 ; NONEON-NOSVE-NEXT:    ldp q0, q1, [x0]
-; NONEON-NOSVE-NEXT:    ptrue p0.d, vl2
+; NONEON-NOSVE-NEXT:    ptrue p0.d
 ; NONEON-NOSVE-NEXT:    revd z0.q, p0/m, z0.q
 ; NONEON-NOSVE-NEXT:    revd z1.q, p0/m, z1.q
 ; NONEON-NOSVE-NEXT:    stp q0, q1, [x0]

Copy link
Collaborator

@sdesmalen-arm sdesmalen-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the BITCASTs created and passed to LowerToPredicatedOp are not used, only the operands of those bitcasts are taken

Can you point me to where that happens? I'm looking at the implementation of LowerToPredicatedOp and I don't see it trying to look through the bitcasts?

@MacDue
Copy link
Member Author

MacDue commented May 20, 2025

Can you point me to where that happens? I'm looking at the implementation of LowerToPredicatedOp and I don't see it trying to look through the bitcasts?

The value "Op" passed to LowerToPredicatedOp is not used as an operand of the node LowerToPredicatedOp returns, instead LowerToPredicatedOp turns the operand into the new node. So when you create a new BITCAST node and then immediately pass that node to LowerToPredicatedOp the result is not a new node that uses that BITCAST, it's a node that takes all its operands from the BITCAST + some predicate and replaces the opcode, the BITCAST node is unused.

That seems like a pretty odd use to me and seems unintentional.

@MacDue MacDue merged commit 62cae9c into llvm:main May 23, 2025
9 of 11 checks passed
@MacDue MacDue deleted the rev_fix branch May 23, 2025 07:43
MacDue added a commit that referenced this pull request May 28, 2025
This continues s-barannikov's work TableGen-erating SDNode descriptions. 
This takes the initial patch from #119709 and moves documentation and the
rest of the AArch64ISD nodes to TableGen. Some issues were found by the
generated SDNode verification added in this patch. These issues have been 
described and fixed in the following PRs:

- #140706 
- #140711 
- #140713 
- #140715

---------

Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
@MacDue MacDue changed the title [AArch64] Don't use LowerToPredicatedOp to shufflevector -> SVE lowerings [AArch64] Don't use LowerToPredicatedOp in shufflevector -> SVE lowerings May 28, 2025
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
…ings (llvm#140713)

The use of `LowerToPredicatedOp` here seems like a mistake as
`LowerToPredicatedOp` turns the SDValue passed to it into the desired
predicated node by copying over operands (and adding a predicate). This
results in two odd things here, the BITCASTs created and passed to
`LowerToPredicatedOp` are not used, only the operands of those bitcasts
are taken. Secondly, when a shuffle vector node is passed directly to
`LowerToPredicatedOp` to create a `REVD_MERGE_PASSTHRU` node an invalid
REV node is created as REV only takes one vector operand, but both
operands from the shuffle vector are copied to the new REV node. This is
not an issue in practice as the extra operand is ignored.

These issues were found by the verification added in llvm#140472.

Part of llvm#140472.

Note: Test changes only result in the vxf64 lowering matching the vxi64
lowering.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
This continues s-barannikov's work TableGen-erating SDNode descriptions. 
This takes the initial patch from llvm#119709 and moves documentation and the
rest of the AArch64ISD nodes to TableGen. Some issues were found by the
generated SDNode verification added in this patch. These issues have been 
described and fixed in the following PRs:

- llvm#140706 
- llvm#140711 
- llvm#140713 
- llvm#140715

---------

Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants