-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[SCEVDivision] Prevent propagation of incorrect no-wrap flags #154745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-llvm-analysis Author: Ryotaro Kasuga (kasuga-fj) ChangesIn Fix #152566 Full diff: https://github.com/llvm/llvm-project/pull/154745.diff 4 Files Affected:
diff --git a/llvm/lib/Analysis/ScalarEvolutionDivision.cpp b/llvm/lib/Analysis/ScalarEvolutionDivision.cpp
index d03930d9e2d99..c374d328c984a 100644
--- a/llvm/lib/Analysis/ScalarEvolutionDivision.cpp
+++ b/llvm/lib/Analysis/ScalarEvolutionDivision.cpp
@@ -141,10 +141,26 @@ void SCEVDivision::visitAddRecExpr(const SCEVAddRecExpr *Numerator) {
if (Ty != StartQ->getType() || Ty != StartR->getType() ||
Ty != StepQ->getType() || Ty != StepR->getType())
return cannotDivide(Numerator);
+
+ // Infer no-wrap flags for Remainder.
+ // TODO: Catch more cases.
+ SCEV::NoWrapFlags NumFlags = Numerator->getNoWrapFlags();
+ SCEV::NoWrapFlags RemFlags = SCEV::NoWrapFlags::FlagAnyWrap;
+ const SCEV *StepNumAbs =
+ SE.getAbsExpr(Numerator->getStepRecurrence(SE), /*IsNSW=*/false);
+ const SCEV *StepRAbs = SE.getAbsExpr(StepR, /*IsNSW=*/false);
+ const Loop *L = Numerator->getLoop();
+
+ // If abs(StepR) <=u abs(StepNumAbs) and both are loop invariant, propagate
+ // the <NW> from Numerator to Remainder.
+ if (ScalarEvolution::hasFlags(NumFlags, SCEV::NoWrapFlags::FlagNW) &&
+ SE.isLoopInvariant(StepNumAbs, L) && SE.isLoopInvariant(StepRAbs, L) &&
+ SE.isKnownPredicate(ICmpInst::ICMP_ULE, StepRAbs, StepNumAbs))
+ RemFlags = ScalarEvolution::setFlags(RemFlags, SCEV::NoWrapFlags::FlagNW);
+
Quotient = SE.getAddRecExpr(StartQ, StepQ, Numerator->getLoop(),
- Numerator->getNoWrapFlags());
- Remainder = SE.getAddRecExpr(StartR, StepR, Numerator->getLoop(),
- Numerator->getNoWrapFlags());
+ SCEV::NoWrapFlags::FlagAnyWrap);
+ Remainder = SE.getAddRecExpr(StartR, StepR, Numerator->getLoop(), RemFlags);
}
void SCEVDivision::visitAddExpr(const SCEVAddExpr *Numerator) {
diff --git a/llvm/test/Analysis/Delinearization/fixed_size_array.ll b/llvm/test/Analysis/Delinearization/fixed_size_array.ll
index 0512044990163..e77fd79a49b35 100644
--- a/llvm/test/Analysis/Delinearization/fixed_size_array.ll
+++ b/llvm/test/Analysis/Delinearization/fixed_size_array.ll
@@ -163,7 +163,7 @@ exit:
; CHECK: Delinearization on function a_i_2j1_k:
; CHECK: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][4][64] with elements of 4 bytes.
-; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><%for.j.header>][{32,+,1}<nw><%for.k>]
+; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<%for.j.header>][{32,+,1}<%for.k>]
define void @a_i_2j1_k(ptr %a) {
entry:
br label %for.i.header
diff --git a/llvm/test/Analysis/Delinearization/wraps.ll b/llvm/test/Analysis/Delinearization/wraps.ll
new file mode 100644
index 0000000000000..fc4935bad9939
--- /dev/null
+++ b/llvm/test/Analysis/Delinearization/wraps.ll
@@ -0,0 +1,130 @@
+; RUN: opt < %s -passes='print<delinearization>' -disable-output 2>&1 | FileCheck %s
+
+; In the following case, we don't know the concret value of `m`, so we cannot
+; deduce no-wrap behavior for the quotient/remainder divided by `m`. However,
+; we can infer `{0,+,1}<%loop>` is nuw and nsw from the induction variable.
+;
+; for (int i = 0; i < btc; i++)
+; a[i * (m + 42)] = 0;
+
+; CHECK: AccessFunction: {0,+,(42 + %m)}<nuw><nsw><%loop>
+; CHECK-NEXT: Base offset: %a
+; CHECK-NEXT: ArrayDecl[UnknownSize][%m] with elements of 1 bytes.
+; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%loop>][{0,+,42}<%loop>]
+define void @divide_by_m0(ptr %a, i64 %m, i64 %btc) {
+entry:
+ %stride = add nsw nuw i64 %m, 42
+ br label %loop
+
+loop:
+ %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
+ %offset = phi i64 [ 0, %entry ], [ %offset.next, %loop ]
+ %idx = getelementptr inbounds i8, ptr %a, i64 %offset
+ store i8 0, ptr %idx
+ %i.next = add nsw nuw i64 %i, 1
+ %offset.next = add nsw nuw i64 %offset, %stride
+ %cond = icmp eq i64 %i.next, %btc
+ br i1 %cond, label %exit, label %loop
+
+exit:
+ ret void
+}
+
+; In the following case, we don't know the concret value of `m`, so we cannot
+; deduce no-wrap behavior for the quotient/remainder divided by `m`. Also, we
+; don't infer nsw/nuw from the induction variable in this case.
+;
+; for (int i = 0; i < btc; i++)
+; a[i * (2 * m + 42)] = 0;
+
+; CHECK: AccessFunction: {0,+,(42 + (2 * %m))}<nuw><nsw><%loop>
+; CHECK-NEXT: Base offset: %a
+; CHECK-NEXT: ArrayDecl[UnknownSize][%m] with elements of 1 bytes.
+; CHECK-NEXT: ArrayRef[{0,+,2}<%loop>][{0,+,42}<%loop>]
+define void @divide_by_m2(ptr %a, i64 %m, i64 %btc) {
+entry:
+ %m2 = add nsw nuw i64 %m, %m
+ %stride = add nsw nuw i64 %m2, 42
+ br label %loop
+
+loop:
+ %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
+ %offset = phi i64 [ 0, %entry ], [ %offset.next, %loop ]
+ %idx = getelementptr inbounds i8, ptr %a, i64 %offset
+ store i8 0, ptr %idx
+ %i.next = add nsw nuw i64 %i, 1
+ %offset.next = add nsw nuw i64 %offset, %stride
+ %cond = icmp eq i64 %i.next, %btc
+ br i1 %cond, label %exit, label %loop
+
+exit:
+ ret void
+}
+
+; In the following case, the `i * 2 * d` is always zero, so it's nsw and nuw.
+; However, the quotient divided by `d` is neither nsw nor nuw.
+;
+; if (d == 0)
+; for (unsigned long long i = 0; i != UINT64_MAX; i++)
+; a[i * 2 * d] = 42;
+
+; CHECK: AccessFunction: {0,+,(2 * %d)}<nuw><nsw><%loop>
+; CHECK-NEXT: Base offset: %a
+; CHECK-NEXT: ArrayDecl[UnknownSize][%d] with elements of 1 bytes.
+; CHECK-NEXT: ArrayRef[{0,+,2}<%loop>][0]
+define void @divide_by_zero(ptr %a, i64 %d) {
+entry:
+ %guard = icmp eq i64 %d, 0
+ br i1 %guard, label %loop.preheader, label %exit
+
+loop.preheader:
+ %stride = mul nsw nuw i64 %d, 2 ; since %d is 0, %stride is also 0
+ br label %loop
+
+loop:
+ %i = phi i64 [ 0, %loop.preheader ], [ %i.next, %loop ]
+ %offset = phi i64 [ 0, %loop.preheader ], [ %offset.next, %loop ]
+ %idx = getelementptr inbounds i8, ptr %a, i64 %offset
+ store i8 42, ptr %idx
+ %i.next = add nuw i64 %i, 1
+ %offset.next = add nsw nuw i64 %offset, %stride
+ %cond = icmp eq i64 %i.next, -1
+ br i1 %cond, label %exit, label %loop
+
+exit:
+ ret void
+}
+
+; In the following case, the `i * (d + 1)` is always zero, so it's nsw and nuw.
+; However, the quotient/remainder divided by `d` is not nsw.
+;
+; if (d == UINT64_MAX)
+; for (unsigned long long i = 0; i != d; i++)
+; a[i * (d + 1)] = 42;
+
+; CHECK: AccessFunction: {0,+,(1 + %d)}<nuw><nsw><%loop>
+; CHECK-NEXT: Base offset: %a
+; CHECK-NEXT: ArrayDecl[UnknownSize][%d] with elements of 1 bytes.
+; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><%loop>][{0,+,1}<nuw><%loop>]
+define void @divide_by_minus_one(ptr %a, i64 %d) {
+entry:
+ %guard = icmp eq i64 %d, -1
+ br i1 %guard, label %loop.preheader, label %exit
+
+loop.preheader:
+ %stride = add nsw i64 %d, 1 ; since %d is -1, %stride is 0
+ br label %loop
+
+loop:
+ %i = phi i64 [ 0, %loop.preheader ], [ %i.next, %loop ]
+ %offset = phi i64 [ 0, %loop.preheader ], [ %offset.next, %loop ]
+ %idx = getelementptr inbounds i8, ptr %a, i64 %offset
+ store i8 42, ptr %idx
+ %i.next = add nuw i64 %i, 1
+ %offset.next = add nsw nuw i64 %offset, %stride
+ %cond = icmp eq i64 %i.next, %d
+ br i1 %cond, label %exit, label %loop
+
+exit:
+ ret void
+}
diff --git a/llvm/test/Analysis/DependenceAnalysis/DADelin.ll b/llvm/test/Analysis/DependenceAnalysis/DADelin.ll
index 8f94a455d3724..f670136aed750 100644
--- a/llvm/test/Analysis/DependenceAnalysis/DADelin.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/DADelin.ll
@@ -479,14 +479,16 @@ for.cond.cleanup: ; preds = %for.cond.cleanup3,
;; for (int k = 1; k < o; k++)
;; = A[i*m*o + j*o + k]
;; A[i*m*o + j*o + k - 1] =
+;;
+;; FIXME: Currently fails to infer nsw for the SCEV `{0,+,1}<for.body8>`
define void @t8(i32 %n, i32 %m, i32 %o, ptr nocapture %A) {
; CHECK-LABEL: 't8'
; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx, align 4 --> Dst: %0 = load i32, ptr %arrayidx, align 4
; CHECK-NEXT: da analyze - none!
; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx, align 4 --> Dst: store i32 %add12, ptr %arrayidx2, align 4
-; CHECK-NEXT: da analyze - consistent anti [0 0 1]!
+; CHECK-NEXT: da analyze - anti [* * *|<]!
; CHECK-NEXT: Src: store i32 %add12, ptr %arrayidx2, align 4 --> Dst: store i32 %add12, ptr %arrayidx2, align 4
-; CHECK-NEXT: da analyze - none!
+; CHECK-NEXT: da analyze - output [* * *]!
;
entry:
%cmp49 = icmp sgt i32 %n, 0
|
| ;; | ||
| ;; FIXME: Currently fails to infer nsw for the SCEV `{0,+,1}<for.body8>` | ||
| define void @t8(i32 %n, i32 %m, i32 %o, ptr nocapture %A) { | ||
| ; CHECK-LABEL: 't8' | ||
| ; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx, align 4 --> Dst: %0 = load i32, ptr %arrayidx, align 4 | ||
| ; CHECK-NEXT: da analyze - none! | ||
| ; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx, align 4 --> Dst: store i32 %add12, ptr %arrayidx2, align 4 | ||
| ; CHECK-NEXT: da analyze - consistent anti [0 0 1]! | ||
| ; CHECK-NEXT: da analyze - anti [* * *|<]! | ||
| ; CHECK-NEXT: Src: store i32 %add12, ptr %arrayidx2, align 4 --> Dst: store i32 %add12, ptr %arrayidx2, align 4 | ||
| ; CHECK-NEXT: da analyze - none! | ||
| ; CHECK-NEXT: da analyze - output [* * *]! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The essential portion of the IR is as follows:
preheader:
%guard = icmp sgt i32 %o, 0
br i1 %guard, label %loop, label %exit
loop:
%i = phi i32 [ 1, %preheader ], [ %inc, %loop ]
...
%inc = add nuw nsw i32 %i, 1
%exitcond = icmp eq i32 %inc, %o
br i1 %exitcond, label %exit, label %body
exit:
...From the loop-guard and the induction variable, we know the following:
0 <s %o{1,+,1}<%loop> <s %o{1,+,1}<%loop>is nsw and nuw
IIUIC, in principle, we can deduce that {0,+,1}<%loop> is also nsw and nuw, but now the inference fails. Are there any good ways to address this case, or do we need to implement a separate inference specifically for it?
Was this example supposed to be I'm trying to understand why any of this code is correct at all. Is the invariant here that it gives you an expression of type For the nowrap flags, I think you can justify preserving |
Nevermind, I just realized that this code is performing signed division, not unsigned. |
Sorry, the description is somewhat wrong... I'm reconsidering about it.
Yes, I think. Maybe it's a bit different from the common integer division.
(I'm probably wrong about this, but) Even in unsigned division, what happens |
SCEV does not guarantee that |
It turned out to be correct after all. This case hasn’t occurred only because delinearization doesn’t make such a call right now. After making some changes to the delinearization code, I was able to reproduce it. The test I added is a special case of it, where
Hm, it seem somewhat useful when we just want to do something like symbolic execution, but it’s also quite misleading... |
| const SCEV *StepRAbs = SE.getAbsExpr(StepR, /*IsNSW=*/false); | ||
| const Loop *L = Numerator->getLoop(); | ||
|
|
||
| // If abs(StepR) <=u abs(StepNumAbs) and both are loop invariant, propagate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // If abs(StepR) <=u abs(StepNumAbs) and both are loop invariant, propagate | |
| // If abs(StepR) <=u abs(StepNum) and both are loop invariant, propagate |
StepNumAbs already is the absolute
Add reasoning here? Such as "since the denominator cannot be zero, so abs(Denom) >= 1, the range of the SCEVAddRec can only shrink, i.e. if the range was not exceeding the size of the integer type's domain (i.e. not self-wrap) before, it will not self-wrap after division"
I think the deduction is more general, only needs that the nominator is NW and the denominator is loop-invariant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments.
I think the deduction is more general, only needs that the nominator is NW and the denominator is loop-invariant.
This seems correct, but I realized that there are no checks to ensure that Denominator is non-zero in the first place. Anyway, it looks like I should fix the other parts first, so I’ll start with those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Division by zero must either be ruled-out by the caller, or the code that it is processing has undefined behaviour. In either case, there is not a lot SCEVDivision can do when representing a division-by-zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently used in Delinearization and does not correspond to actual div instructions in LLVM IR. That is, given a multiplication like %m * %n, an invocation like divide(%m * %n, %m) can happen without a non-zero check. Anyway, in such a case the division seems ill-defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something we need to be careful about is that SCEV flags are global, not use-site specific. So even if we know that division by zero is not possible in context, we may not be able to assume it for the purpose of flag propagation. (I haven't checked whether this is a problem for the delinearization case or not.)
I've been thinking it over for a while, and I’m starting to feel like this comment is reasonable. @nikic Just to confirm, do you think |
|
@kasuga-fj I'm not entirely sure. The only user of SCEVDivision is Delinearization -- maybe it has some additional checks or assumptions that make it safe? But I suspect that it's broken. From a quick test, requiring mul nsw in SCEVDivision breaks lots of tests. But it also looks like many places in Delinearization fail to preserve nowrap flags on multiplies, so maybe some of this can be recovered?
|
|
@nikic Thanks for the comments!
Yes, the only user of SCEVDivision is Delinearization, and the primary user of Delinearization is DA, which doesn't pay attention to nuw/nsw flags (and therefore I'm really struggling now...). So, it's just my personal impression, but I don't think there are such additional checks or assumptions, and I agree that it's likely broken. Even if Delinearization has such checks, considering that SCEVDivision is exposed and it could potentially be used in other passes, it's concerning that the required preconditions are not clearly specified.
I was able to reproduce the test failures. I tried preserving the flags for the In conclusion, I think it would be better to implement something like Let me mark this PR as a draft for now. (totally unrelated, but I just realized that Alive2 is an absolutely amazing tool.) |
This sounds like a good idea in any case. |
…155832) This patch introduces `SCEVDivisionPrinterPass` and registers it under the name `print<scev-division>`, primarily for testing purposes. This pass invokes `SCEVDivision::divide` upon encountering `sdiv`, and prints the numerator, denominator, quotient, and remainder. It also adds several test cases, some of which are currently incorrect and require fixing. Along with that, this patch added some comments to clarify the behavior of `SCEVDivision::divide`, as follows: - This function does NOT actually perform the division - Given the `Numerator` and `Denominator`, find a pair `(Quotient, Remainder)` s.t. `Numerator = Quotient * Denominator + Remainder` - The common condition `Remainder < Denominator` is NOT necessarily required - There may be multiple solutions for `(Quotient, Remainder)`, and this function finds one of them - Especially, there is always a trivial solution `(0, Numerator)` - The following computations may wrap - The multiplication of `Quotient` and `Denominator` - The addition of `Quotient * Denominator` and `Remainder` Related discussion: #154745
In
SCEVDivision, when the numerator isSCEVAddRecExpr, its no-wrap flags were propagated to the quotient and remainder. In general, it is incorrect. For example, consider dividing{0,+,(%m + %n)}<nuw><nsw><%loop>by%m. The quotient would be{0,+,1}<%loop>and the remainder would be{0,+,%n}<%loop>. If%mand%nhave opposite signs, propagating the no-wrap flags from the numerator may be invalid.This patch prevents the incorrect propagation of no-wrap flags in such cases and introduces a small inference for the
<NW>flags in the remainder, primarily to avoid changing existing test results.Fix #152566