-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[AArch64] Add SUBS(CSEL) fold from brcond. #142103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-backend-aarch64 Author: David Green (davemgreen) ChangesThis folds away subs(csel(1, 0, cc)) from brcond, that can be produced in certain places from compares that are not already subs (like adc/sbc generated from i128 add_with_overflow intrinsics). Full diff: https://github.com/llvm/llvm-project/pull/142103.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index f18d325148742..ff2bcec78760d 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -24458,6 +24458,17 @@ static SDValue performBRCONDCombine(SDNode *N,
if (CC != AArch64CC::EQ && CC != AArch64CC::NE)
return SDValue();
+ // Fold away brcond(NE, subs(csel(1, 0, CC, Cmp), 1)) -> brcond(CC, Cmp)
+ if (CC == AArch64CC::NE && isCMP(Cmp) && isOneConstant(Cmp.getOperand(1)) &&
+ Cmp.getOperand(0).getOpcode() == AArch64ISD::CSEL) {
+ SDValue CSel = Cmp.getOperand(0);
+ if (isOneConstant(CSel.getOperand(0)) &&
+ isNullConstant(CSel.getOperand(1))) {
+ return DAG.getNode(N->getOpcode(), SDLoc(N), N->getVTList(), Chain, Dest,
+ CSel.getOperand(2), CSel.getOperand(3));
+ }
+ }
+
unsigned CmpOpc = Cmp.getOpcode();
if (CmpOpc != AArch64ISD::ADDS && CmpOpc != AArch64ISD::SUBS)
return SDValue();
diff --git a/llvm/test/CodeGen/AArch64/i128_with_overflow.ll b/llvm/test/CodeGen/AArch64/i128_with_overflow.ll
index e3a8feebaa606..f677b11c1b118 100644
--- a/llvm/test/CodeGen/AArch64/i128_with_overflow.ll
+++ b/llvm/test/CodeGen/AArch64/i128_with_overflow.ll
@@ -10,9 +10,7 @@ define i128 @test_uadd_i128(i128 noundef %x, i128 noundef %y) {
; CHECK-SD: // %bb.0: // %entry
; CHECK-SD-NEXT: adds x0, x0, x2
; CHECK-SD-NEXT: adcs x1, x1, x3
-; CHECK-SD-NEXT: cset w8, hs
-; CHECK-SD-NEXT: cmp w8, #1
-; CHECK-SD-NEXT: b.ne .LBB0_2
+; CHECK-SD-NEXT: b.hs .LBB0_2
; CHECK-SD-NEXT: // %bb.1: // %if.then
; CHECK-SD-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
; CHECK-SD-NEXT: .cfi_def_cfa_offset 16
@@ -66,9 +64,7 @@ define i128 @test_sadd_i128(i128 noundef %x, i128 noundef %y) {
; CHECK-SD: // %bb.0: // %entry
; CHECK-SD-NEXT: adds x0, x0, x2
; CHECK-SD-NEXT: adcs x1, x1, x3
-; CHECK-SD-NEXT: cset w8, vs
-; CHECK-SD-NEXT: cmp w8, #1
-; CHECK-SD-NEXT: b.ne .LBB1_2
+; CHECK-SD-NEXT: b.vs .LBB1_2
; CHECK-SD-NEXT: // %bb.1: // %if.then
; CHECK-SD-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
; CHECK-SD-NEXT: .cfi_def_cfa_offset 16
@@ -122,9 +118,7 @@ define i128 @test_usub_i128(i128 noundef %x, i128 noundef %y) {
; CHECK-SD: // %bb.0: // %entry
; CHECK-SD-NEXT: subs x0, x0, x2
; CHECK-SD-NEXT: sbcs x1, x1, x3
-; CHECK-SD-NEXT: cset w8, lo
-; CHECK-SD-NEXT: cmp w8, #1
-; CHECK-SD-NEXT: b.ne .LBB2_2
+; CHECK-SD-NEXT: b.lo .LBB2_2
; CHECK-SD-NEXT: // %bb.1: // %if.then
; CHECK-SD-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
; CHECK-SD-NEXT: .cfi_def_cfa_offset 16
@@ -178,9 +172,7 @@ define i128 @test_ssub_i128(i128 noundef %x, i128 noundef %y) {
; CHECK-SD: // %bb.0: // %entry
; CHECK-SD-NEXT: subs x0, x0, x2
; CHECK-SD-NEXT: sbcs x1, x1, x3
-; CHECK-SD-NEXT: cset w8, vs
-; CHECK-SD-NEXT: cmp w8, #1
-; CHECK-SD-NEXT: b.ne .LBB3_2
+; CHECK-SD-NEXT: b.vs .LBB3_2
; CHECK-SD-NEXT: // %bb.1: // %if.then
; CHECK-SD-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
; CHECK-SD-NEXT: .cfi_def_cfa_offset 16
|
is csel(0, 1, CC, Cmp) canonicalized or changed to be 1, 0? Can we catch that too? |
We do quite a lot of canonicalization of the IR and of the target independent setccs - it can be quite hard to test the different variations as they don't often come up. I haven't seen any other combos in real code (that were not from O0), either from the other variants of this or from csel of csel. I did manage to find a function that was helpful for handling some of the variants of cset. Canonical csel might be a nice idea - it could help remove a few tablegen patterns. When I tried it it got stuck in an infinite loop though. |
This folds away subs(csel(1, 0, cc)) from brcond, that can be produced in certain places from compares that are not already subs (like adc/sbc generated from i128 add_with_overflow intrinsics).
6d23238
to
ab4aaa8
Compare
This folds away subs(csel(1, 0, cc)) from brcond, that can be produced in certain places from compares that are not already subs (like adc/sbc generated from i128 add_with_overflow intrinsics).