-
Notifications
You must be signed in to change notification settings - Fork 13.9k
[AArch64] Don't use LowerToPredicatedOp in shufflevector -> SVE lowerings #140713
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ings The use of `LowerToPredicatedOp` here seems like a mistake as `LowerToPredicatedOp` turns the SDValue passed to it into the desired predicated node by copying over operands (and adding a predicate). This results in two odd things here, the BITCASTs created and passed to `LowerToPredicatedOp` are not used, only the operands of those bitcasts are taken. Secondly, when a shuffle vector node is passed directly to `LowerToPredicatedOp` to create a `REVD_MERGE_PASSTHRU` node an invalid REV node is created as REV only takes one vector operand, but both operands from the shuffle vector are copied to the new REV node. This is not an issue in practice as the extra operand is ignored. These issues were found by the verification added in llvm#140472. Part of llvm#140472. Note: Test changes only result in the vxi64 lowering matching the vxf64 lowering.
@llvm/pr-subscribers-backend-aarch64 Author: Benjamin Maxwell (MacDue) ChangesThe use of These issues were found by the verification added in #140472. Part of #140472. Note: Test changes only result in the vxi64 lowering matching the vxf64 lowering. Full diff: https://github.com/llvm/llvm-project/pull/140713.diff 3 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 293292d47dd48..99122e2897285 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -29773,11 +29773,18 @@ SDValue AArch64TargetLowering::LowerFixedLengthVECTOR_SHUFFLEToSVE(
return convertFromScalableVector(DAG, VT, Op);
}
+ auto lowerToRevMergePassthru = [&](unsigned Opcode, SDValue Vec, EVT NewVT) {
+ auto Pg = getPredicateForVector(DAG, DL, NewVT);
+ SDValue RevOp = DAG.getNode(ISD::BITCAST, DL, NewVT, Vec);
+ auto Rev =
+ DAG.getNode(Opcode, DL, NewVT, Pg, RevOp, DAG.getUNDEF(ContainerVT));
+ auto Cast = DAG.getNode(ISD::BITCAST, DL, ContainerVT, Rev);
+ return convertFromScalableVector(DAG, VT, Cast);
+ };
+
unsigned EltSize = VT.getScalarSizeInBits();
for (unsigned LaneSize : {64U, 32U, 16U}) {
if (isREVMask(ShuffleMask, EltSize, VT.getVectorNumElements(), LaneSize)) {
- EVT NewVT =
- getPackedSVEVectorVT(EVT::getIntegerVT(*DAG.getContext(), LaneSize));
unsigned RevOp;
if (EltSize == 8)
RevOp = AArch64ISD::BSWAP_MERGE_PASSTHRU;
@@ -29785,24 +29792,16 @@ SDValue AArch64TargetLowering::LowerFixedLengthVECTOR_SHUFFLEToSVE(
RevOp = AArch64ISD::REVH_MERGE_PASSTHRU;
else
RevOp = AArch64ISD::REVW_MERGE_PASSTHRU;
-
- Op = DAG.getNode(ISD::BITCAST, DL, NewVT, Op1);
- Op = LowerToPredicatedOp(Op, DAG, RevOp);
- Op = DAG.getNode(ISD::BITCAST, DL, ContainerVT, Op);
- return convertFromScalableVector(DAG, VT, Op);
+ EVT NewVT =
+ getPackedSVEVectorVT(EVT::getIntegerVT(*DAG.getContext(), LaneSize));
+ return lowerToRevMergePassthru(RevOp, Op1, NewVT);
}
}
if (Subtarget->hasSVE2p1() && EltSize == 64 &&
isREVMask(ShuffleMask, EltSize, VT.getVectorNumElements(), 128)) {
- if (!VT.isFloatingPoint())
- return LowerToPredicatedOp(Op, DAG, AArch64ISD::REVD_MERGE_PASSTHRU);
-
- EVT NewVT = getPackedSVEVectorVT(EVT::getIntegerVT(*DAG.getContext(), 64));
- Op = DAG.getNode(ISD::BITCAST, DL, NewVT, Op1);
- Op = LowerToPredicatedOp(Op, DAG, AArch64ISD::REVD_MERGE_PASSTHRU);
- Op = DAG.getNode(ISD::BITCAST, DL, ContainerVT, Op);
- return convertFromScalableVector(DAG, VT, Op);
+ return lowerToRevMergePassthru(AArch64ISD::REVD_MERGE_PASSTHRU, Op1,
+ ContainerVT);
}
unsigned WhichResult;
diff --git a/llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll b/llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll
index 0cda4d94444e9..faf82d4945b3d 100644
--- a/llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll
+++ b/llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll
@@ -213,8 +213,9 @@ define void @test_revdv4i64_sve2p1(ptr %a) #2 {
; CHECK-LABEL: test_revdv4i64_sve2p1:
; CHECK: // %bb.0:
; CHECK-NEXT: ptrue p0.d, vl4
+; CHECK-NEXT: ptrue p1.d
; CHECK-NEXT: ld1d { z0.d }, p0/z, [x0]
-; CHECK-NEXT: revd z0.q, p0/m, z0.q
+; CHECK-NEXT: revd z0.q, p1/m, z0.q
; CHECK-NEXT: st1d { z0.d }, p0, [x0]
; CHECK-NEXT: ret
%tmp1 = load <4 x i64>, ptr %a
diff --git a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-rev.ll b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-rev.ll
index c364abf2916e8..d8f83834a1bca 100644
--- a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-rev.ll
+++ b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-permute-rev.ll
@@ -677,7 +677,7 @@ define void @test_revdv4i64_sve2p1(ptr %a) #1 {
; CHECK-LABEL: test_revdv4i64_sve2p1:
; CHECK: // %bb.0:
; CHECK-NEXT: ldp q0, q1, [x0]
-; CHECK-NEXT: ptrue p0.d, vl2
+; CHECK-NEXT: ptrue p0.d
; CHECK-NEXT: revd z0.q, p0/m, z0.q
; CHECK-NEXT: revd z1.q, p0/m, z1.q
; CHECK-NEXT: stp q0, q1, [x0]
@@ -686,7 +686,7 @@ define void @test_revdv4i64_sve2p1(ptr %a) #1 {
; NONEON-NOSVE-LABEL: test_revdv4i64_sve2p1:
; NONEON-NOSVE: // %bb.0:
; NONEON-NOSVE-NEXT: ldp q0, q1, [x0]
-; NONEON-NOSVE-NEXT: ptrue p0.d, vl2
+; NONEON-NOSVE-NEXT: ptrue p0.d
; NONEON-NOSVE-NEXT: revd z0.q, p0/m, z0.q
; NONEON-NOSVE-NEXT: revd z1.q, p0/m, z1.q
; NONEON-NOSVE-NEXT: stp q0, q1, [x0]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the BITCASTs created and passed to
LowerToPredicatedOp
are not used, only the operands of those bitcasts are taken
Can you point me to where that happens? I'm looking at the implementation of LowerToPredicatedOp
and I don't see it trying to look through the bitcasts?
The value " That seems like a pretty odd use to me and seems unintentional. |
This continues s-barannikov's work TableGen-erating SDNode descriptions. This takes the initial patch from #119709 and moves documentation and the rest of the AArch64ISD nodes to TableGen. Some issues were found by the generated SDNode verification added in this patch. These issues have been described and fixed in the following PRs: - #140706 - #140711 - #140713 - #140715 --------- Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
…ings (llvm#140713) The use of `LowerToPredicatedOp` here seems like a mistake as `LowerToPredicatedOp` turns the SDValue passed to it into the desired predicated node by copying over operands (and adding a predicate). This results in two odd things here, the BITCASTs created and passed to `LowerToPredicatedOp` are not used, only the operands of those bitcasts are taken. Secondly, when a shuffle vector node is passed directly to `LowerToPredicatedOp` to create a `REVD_MERGE_PASSTHRU` node an invalid REV node is created as REV only takes one vector operand, but both operands from the shuffle vector are copied to the new REV node. This is not an issue in practice as the extra operand is ignored. These issues were found by the verification added in llvm#140472. Part of llvm#140472. Note: Test changes only result in the vxf64 lowering matching the vxi64 lowering.
This continues s-barannikov's work TableGen-erating SDNode descriptions. This takes the initial patch from llvm#119709 and moves documentation and the rest of the AArch64ISD nodes to TableGen. Some issues were found by the generated SDNode verification added in this patch. These issues have been described and fixed in the following PRs: - llvm#140706 - llvm#140711 - llvm#140713 - llvm#140715 --------- Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
The use of
LowerToPredicatedOp
here seems like a mistake asLowerToPredicatedOp
turns the SDValue passed to it into the desired predicated node by copying over operands (and adding a predicate). This results in two odd things here, the BITCASTs created and passed toLowerToPredicatedOp
are not used, only the operands of those bitcasts are taken. Secondly, when a shuffle vector node is passed directly toLowerToPredicatedOp
to create aREVD_MERGE_PASSTHRU
node an invalid REV node is created as REV only takes one vector operand, but both operands from the shuffle vector are copied to the new REV node. This is not an issue in practice as the extra operand is ignored.These issues were found by the verification added in #140472.
Part of #140472.
Note: Test changes only result in the vxf64 lowering matching the vxi64 lowering.