Skip to content

[AArch64] Add SUBS(CSEL) fold from brcond. #142103

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

davemgreen
Copy link
Collaborator

This folds away subs(csel(1, 0, cc)) from brcond, that can be produced in certain places from compares that are not already subs (like adc/sbc generated from i128 add_with_overflow intrinsics).

@llvmbot
Copy link
Member

llvmbot commented May 30, 2025

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

This folds away subs(csel(1, 0, cc)) from brcond, that can be produced in certain places from compares that are not already subs (like adc/sbc generated from i128 add_with_overflow intrinsics).


Full diff: https://github.com/llvm/llvm-project/pull/142103.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+11)
  • (modified) llvm/test/CodeGen/AArch64/i128_with_overflow.ll (+4-12)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index f18d325148742..ff2bcec78760d 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -24458,6 +24458,17 @@ static SDValue performBRCONDCombine(SDNode *N,
   if (CC != AArch64CC::EQ && CC != AArch64CC::NE)
     return SDValue();
 
+  // Fold away brcond(NE, subs(csel(1, 0, CC, Cmp), 1)) -> brcond(CC, Cmp)
+  if (CC == AArch64CC::NE && isCMP(Cmp) && isOneConstant(Cmp.getOperand(1)) &&
+      Cmp.getOperand(0).getOpcode() == AArch64ISD::CSEL) {
+    SDValue CSel = Cmp.getOperand(0);
+    if (isOneConstant(CSel.getOperand(0)) &&
+        isNullConstant(CSel.getOperand(1))) {
+      return DAG.getNode(N->getOpcode(), SDLoc(N), N->getVTList(), Chain, Dest,
+                         CSel.getOperand(2), CSel.getOperand(3));
+    }
+  }
+
   unsigned CmpOpc = Cmp.getOpcode();
   if (CmpOpc != AArch64ISD::ADDS && CmpOpc != AArch64ISD::SUBS)
     return SDValue();
diff --git a/llvm/test/CodeGen/AArch64/i128_with_overflow.ll b/llvm/test/CodeGen/AArch64/i128_with_overflow.ll
index e3a8feebaa606..f677b11c1b118 100644
--- a/llvm/test/CodeGen/AArch64/i128_with_overflow.ll
+++ b/llvm/test/CodeGen/AArch64/i128_with_overflow.ll
@@ -10,9 +10,7 @@ define i128 @test_uadd_i128(i128 noundef %x, i128 noundef %y) {
 ; CHECK-SD:       // %bb.0: // %entry
 ; CHECK-SD-NEXT:    adds x0, x0, x2
 ; CHECK-SD-NEXT:    adcs x1, x1, x3
-; CHECK-SD-NEXT:    cset w8, hs
-; CHECK-SD-NEXT:    cmp w8, #1
-; CHECK-SD-NEXT:    b.ne .LBB0_2
+; CHECK-SD-NEXT:    b.hs .LBB0_2
 ; CHECK-SD-NEXT:  // %bb.1: // %if.then
 ; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
@@ -66,9 +64,7 @@ define i128 @test_sadd_i128(i128 noundef %x, i128 noundef %y) {
 ; CHECK-SD:       // %bb.0: // %entry
 ; CHECK-SD-NEXT:    adds x0, x0, x2
 ; CHECK-SD-NEXT:    adcs x1, x1, x3
-; CHECK-SD-NEXT:    cset w8, vs
-; CHECK-SD-NEXT:    cmp w8, #1
-; CHECK-SD-NEXT:    b.ne .LBB1_2
+; CHECK-SD-NEXT:    b.vs .LBB1_2
 ; CHECK-SD-NEXT:  // %bb.1: // %if.then
 ; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
@@ -122,9 +118,7 @@ define i128 @test_usub_i128(i128 noundef %x, i128 noundef %y) {
 ; CHECK-SD:       // %bb.0: // %entry
 ; CHECK-SD-NEXT:    subs x0, x0, x2
 ; CHECK-SD-NEXT:    sbcs x1, x1, x3
-; CHECK-SD-NEXT:    cset w8, lo
-; CHECK-SD-NEXT:    cmp w8, #1
-; CHECK-SD-NEXT:    b.ne .LBB2_2
+; CHECK-SD-NEXT:    b.lo .LBB2_2
 ; CHECK-SD-NEXT:  // %bb.1: // %if.then
 ; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16
@@ -178,9 +172,7 @@ define i128 @test_ssub_i128(i128 noundef %x, i128 noundef %y) {
 ; CHECK-SD:       // %bb.0: // %entry
 ; CHECK-SD-NEXT:    subs x0, x0, x2
 ; CHECK-SD-NEXT:    sbcs x1, x1, x3
-; CHECK-SD-NEXT:    cset w8, vs
-; CHECK-SD-NEXT:    cmp w8, #1
-; CHECK-SD-NEXT:    b.ne .LBB3_2
+; CHECK-SD-NEXT:    b.vs .LBB3_2
 ; CHECK-SD-NEXT:  // %bb.1: // %if.then
 ; CHECK-SD-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-SD-NEXT:    .cfi_def_cfa_offset 16

@AZero13
Copy link
Contributor

AZero13 commented May 30, 2025

is csel(0, 1, CC, Cmp) canonicalized or changed to be 1, 0? Can we catch that too?

@davemgreen
Copy link
Collaborator Author

is csel(0, 1, CC, Cmp) canonicalized or changed to be 1, 0? Can we catch that too?

We do quite a lot of canonicalization of the IR and of the target independent setccs - it can be quite hard to test the different variations as they don't often come up. I haven't seen any other combos in real code (that were not from O0), either from the other variants of this or from csel of csel. I did manage to find a function that was helpful for handling some of the variants of cset.

Canonical csel might be a nice idea - it could help remove a few tablegen patterns. When I tried it it got stuck in an infinite loop though.

This folds away subs(csel(1, 0, cc)) from brcond, that can be produced in
certain places from compares that are not already subs (like adc/sbc generated
from i128 add_with_overflow intrinsics).
@davemgreen davemgreen force-pushed the gh-a64-brcondremcsel branch from 6d23238 to ab4aaa8 Compare June 3, 2025 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants