-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[AArch64] Change how we do bit computations for bfi #142646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
AZero13
wants to merge
2
commits into
llvm:main
Choose a base branch
from
AZero13:bxfil
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@llvm/pr-subscribers-backend-aarch64 Author: AZero13 (AZero13) ChangesInstead of relying on known 0s, we can instead focus on what we do know about the final result, and bxfil that. Full diff: https://github.com/llvm/llvm-project/pull/142646.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
index 34f6db9374cb5..b52f2c03f5503 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
@@ -3316,28 +3316,50 @@ static bool tryBitfieldInsertOpFromOrAndImm(SDNode *N, SelectionDAG *CurDAG) {
!isOpcWithIntImmediate(And.getNode(), ISD::AND, MaskImm))
return false;
- // Compute the Known Zero for the AND as this allows us to catch more general
- // cases than just looking for AND with imm.
- KnownBits Known = CurDAG->computeKnownBits(And);
+ // In the event that computeKnownBits gives us more bits than we bargained
+ // for, and cause the the result to not be a shifted mask, we
+ // should just focus on the maskImm
- // Non-zero in the sense that they're not provably zero, which is the key
- // point if we want to use this value.
- uint64_t NotKnownZero = (~Known.Zero).getZExtValue();
+ // We can optimize any chain of bitwise operations that results in a series of
+ // contiguous bits being set if we know the end result will have those bits
+ // set.
- // The KnownZero mask must be a shifted mask (e.g., 1110..011, 11100..00).
- if (!isShiftedMask(Known.Zero.getZExtValue(), VT))
- return false;
+ KnownBits Known = CurDAG->computeKnownBits(And.getOperand(0));
+ uint64_t OldOne = Known.One.getZExtValue();
+ uint64_t OldZero = Known.Zero.getZExtValue();
- // The bits being inserted must only set those bits that are known to be zero.
- if ((OrImm & NotKnownZero) != 0) {
- // FIXME: It's okay if the OrImm sets NotKnownZero bits to 1, but we don't
- // currently handle this case.
- return false;
+ Known &= KnownBits::makeConstant(APInt(BitWidth, MaskImm));
+ Known |= KnownBits::makeConstant(APInt(BitWidth, OrImm));
+
+ // Now we need to see if any KnownBits changed
+ uint64_t NewOneVal = Known.One.getZExtValue();
+ uint64_t NewZeroVal = Known.Zero.getZExtValue();
+
+ uint64_t ChangedBits = (NewOneVal ^ OldOne) | (NewZeroVal ^ OldZero);
+
+ // Now ensure that all changed bits are still in known territory.
+ // 6. Ensure all changed bits are still in known territory.
+
+ // 7. Find smallest bitfield insert that encompasses all changed bits and any
+ // known bits between them.
+ if (ChangedBits == 0) {
+ // No change? Then we can just replace the node with the 1st AND operand.
+ CurDAG->ReplaceAllUsesWith(SDValue(N, 0), And->getOperand(0));
+ return true;
}
- // BFI/BFXIL dst, src, #lsb, #width.
- int LSB = llvm::countr_one(NotKnownZero);
- int Width = BitWidth - APInt(BitWidth, NotKnownZero).popcount();
+ // 4) Find your span:
+ APInt ChangedBitsAPInt = APInt(BitWidth, ChangedBits);
+ int LSB = ChangedBitsAPInt.countr_zero();
+ int MSB = BitWidth - ChangedBitsAPInt.countl_zero();
+ int Width = MSB - LSB;
+ uint64_t SpanMask = ((1ULL << Width) - 1) << LSB;
+
+ // 5) Bail if *any* bit in that run is still unknown:
+ uint64_t Unknown = ~(NewOneVal | NewZeroVal);
+ if ((Unknown & SpanMask) != 0)
+ return false;
+
// BFI/BFXIL is an alias of BFM, so translate to BFM operands.
unsigned ImmR = (BitWidth - LSB) % BitWidth;
@@ -3348,7 +3370,7 @@ static bool tryBitfieldInsertOpFromOrAndImm(SDNode *N, SelectionDAG *CurDAG) {
// ORR. A BFXIL will use the same constant as the original ORR, so the code
// should be no worse in this case.
bool IsBFI = LSB != 0;
- uint64_t BFIImm = OrImm >> LSB;
+ uint64_t BFIImm = (NewOneVal & SpanMask) >> LSB;
if (IsBFI && !AArch64_AM::isLogicalImmediate(BFIImm, BitWidth)) {
// We have a BFI instruction and we know the constant can't be materialized
// with a ORR-immediate with the zero register.
diff --git a/llvm/test/CodeGen/AArch64/bitfield-insert.ll b/llvm/test/CodeGen/AArch64/bitfield-insert.ll
index eefb862c5313c..556c7ecde792f 100644
--- a/llvm/test/CodeGen/AArch64/bitfield-insert.ll
+++ b/llvm/test/CodeGen/AArch64/bitfield-insert.ll
@@ -753,3 +753,32 @@ entry:
store i16 %or, ptr %p
ret i16 %and1
}
+
+define i32 @test1_known_not_zero(i32 %a) {
+; CHECK-LABEL: test1_known_not_zero:
+; CHECK: // %bb.0:
+; CHECK-NEXT: mov w8, #12288 // =0x3000
+; CHECK-NEXT: and w9, w0, #0xfffffffe
+; CHECK-NEXT: movk w8, #7, lsl #16
+; CHECK-NEXT: and w9, w9, #0xfff00fff
+; CHECK-NEXT: orr w0, w9, w8
+; CHECK-NEXT: ret
+ %and1 = and i32 %a, -1044482
+ %or = or disjoint i32 %and1, 471040
+ ret i32 %or
+}
+
+define i32 @hole_test(i32 %a) {
+; CHECK-LABEL: hole_test:
+; CHECK: // %bb.0:
+; CHECK-NEXT: mul w9, w0, w0
+; CHECK-NEXT: mov w8, #65533 // =0xfffd
+; CHECK-NEXT: movk w8, #65525, lsl #16
+; CHECK-NEXT: orr w9, w9, #0xff0000
+; CHECK-NEXT: and w0, w9, w8
+; CHECK-NEXT: ret
+ %2 = mul i32 %a, %a
+ %3 = and i32 %2, -4128771
+ %4 = or i32 %3, 16056320
+ ret i32 %4
+}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Instead of relying on known 0s, we can instead focus on what we do know about the final result, and bxfil that.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Instead of relying on known 0s, we can instead focus on what we do know about the final result, and bxfil that.