-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LV] Drop incorrect inbounds for reverse vector pointer when folding tail #120730
[LV] Drop incorrect inbounds for reverse vector pointer when folding tail #120730
Conversation
Created using spr 1.3.4
Created using spr 1.3.4 [skip ci]
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-transforms Author: Florian Mayer (fmayer) ChangesFull diff: https://github.com/llvm/llvm-project/pull/120730.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index cda90d70e5c8da..0768eccc8aeb35 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1986,10 +1986,12 @@ void VPReverseVectorPointerRecipe::execute(VPTransformState &State) {
// LastLane = 1 - RunTimeVF
Value *LastLane = Builder.CreateSub(ConstantInt::get(IndexTy, 1), RunTimeVF);
Value *Ptr = State.get(getOperand(0), VPLane(0));
- Value *ResultPtr =
- Builder.CreateGEP(IndexedTy, Ptr, NumElt, "", getGEPNoWrapFlags());
- ResultPtr = Builder.CreateGEP(IndexedTy, ResultPtr, LastLane, "",
- getGEPNoWrapFlags());
+ Value *ResultPtr = Builder.CreateGEP(
+ IndexedTy, Ptr, NumElt, "",
+ getGEPNoWrapFlags().withoutInBounds().withoutNoUnsignedSignedWrap());
+ ResultPtr = Builder.CreateGEP(
+ IndexedTy, ResultPtr, LastLane, "",
+ getGEPNoWrapFlags().withoutInBounds().withoutNoUnsignedSignedWrap());
State.set(this, ResultPtr, /*IsScalar*/ true);
}
diff --git a/llvm/test/Transforms/LoopVectorize/vplan-noinbounds-gep.ll b/llvm/test/Transforms/LoopVectorize/vplan-noinbounds-gep.ll
index 99605971298fe9..e982bc5b702a17 100644
--- a/llvm/test/Transforms/LoopVectorize/vplan-noinbounds-gep.ll
+++ b/llvm/test/Transforms/LoopVectorize/vplan-noinbounds-gep.ll
@@ -27,8 +27,8 @@ define i1 @fn() local_unnamed_addr #0 {
; CHECK-NEXT: [[TMP2:%.*]] = and <4 x i64> [[VEC_IND]], splat (i64 1)
; CHECK-NEXT: [[TMP3:%.*]] = icmp eq <4 x i64> [[TMP2]], zeroinitializer
; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds nuw [12 x i32], ptr [[NNO]], i64 0, i64 [[TMP0]]
-; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 0
-; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, ptr [[TMP5]], i32 -3
+; CHECK-NEXT: [[TMP5:%.*]] = getelementptr i32, ptr [[TMP4]], i32 0
+; CHECK-NEXT: [[TMP6:%.*]] = getelementptr i32, ptr [[TMP5]], i32 -3
; CHECK-NEXT: [[REVERSE:%.*]] = shufflevector <4 x i1> [[TMP1]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
; CHECK-NEXT: [[WIDE_MASKED_LOAD:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr [[TMP6]], i32 4, <4 x i1> [[REVERSE]], <4 x i32> poison)
; CHECK-NEXT: [[REVERSE1:%.*]] = shufflevector <4 x i32> [[WIDE_MASKED_LOAD]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
|
Created using spr 1.3.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, this is only an issue, if we are folding the tail, as then we may compute an address that we don't in the original scalar loop, so may not be inbounds? If we don't fold the tail, it should be fine to keep the inbounds
flag?
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
that sounds correct. regardless, i would prefer to submit a relatively trivial fix like this (nothing bad should happen from removing wrap flags), and take it from there afterwards. wdyt? |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Created using spr 1.3.4 [skip ci]
… enabled Created using spr 1.3.4
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4 [skip ci]
if (Reverse) | ||
// N.B. we deliberately do pass getGEPNoWrapFlags here, because this | ||
// transform can invalidate `inbounds`. | ||
VectorPtr = new VPReverseVectorPointerRecipe( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that there are quite a number of unnecessary test changes now that all tests have been updated, I think it would be better to try and do the precise check straight-away, to avoid unneeded churn
I think the check should be something like below?
if (Reverse) | |
// N.B. we deliberately do pass getGEPNoWrapFlags here, because this | |
// transform can invalidate `inbounds`. | |
VectorPtr = new VPReverseVectorPointerRecipe( | |
if (Reverse) { | |
// When folding the tail, we may compute an address that we don't in the | |
// original scalar loop and it may not be inbounds. Drop Inbounds in that | |
// case. | |
GEPNoWrapFlags Flags = | |
(CM.foldTailByMasking() || !GEP || !GEP->isInBounds()) | |
? GEPNoWrapFlags::none() | |
: GEPNoWrapFlags::inBounds(); | |
VectorPtr = new VPReverseVectorPointerRecipe( | |
Ptr, &Plan.getVF(), getLoadStoreType(I), Flags, I->getDebugLoc()); | |
} else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, sgtm. i had the same thought when i uploaded the diffs but then was gone for the holidays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks.
Might be worth adjusting the title using [LV] instead of [Vectorizer] (which is more ambiguous) and maybe adding a bit more detail, something like
[LV] Drop inbounds for reverse vector pointers when folding tail.
When folding the tail, we may compute an address that we don't in the
original scalar loop and it may not be inbounds. Drop Inbounds in that
case.