Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LV] Drop incorrect inbounds for reverse vector pointer when folding tail #120730

Conversation

fmayer
Copy link
Contributor

@fmayer fmayer commented Dec 20, 2024

When folding the tail, we may compute an address that we don't in the
original scalar loop and it may not be inbounds. Drop Inbounds in that
case.

Created using spr 1.3.4
@llvmbot
Copy link
Member

llvmbot commented Dec 20, 2024

@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Florian Mayer (fmayer)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/120730.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+6-4)
  • (modified) llvm/test/Transforms/LoopVectorize/vplan-noinbounds-gep.ll (+2-2)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index cda90d70e5c8da..0768eccc8aeb35 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1986,10 +1986,12 @@ void VPReverseVectorPointerRecipe::execute(VPTransformState &State) {
   // LastLane = 1 - RunTimeVF
   Value *LastLane = Builder.CreateSub(ConstantInt::get(IndexTy, 1), RunTimeVF);
   Value *Ptr = State.get(getOperand(0), VPLane(0));
-  Value *ResultPtr =
-      Builder.CreateGEP(IndexedTy, Ptr, NumElt, "", getGEPNoWrapFlags());
-  ResultPtr = Builder.CreateGEP(IndexedTy, ResultPtr, LastLane, "",
-                                getGEPNoWrapFlags());
+  Value *ResultPtr = Builder.CreateGEP(
+      IndexedTy, Ptr, NumElt, "",
+      getGEPNoWrapFlags().withoutInBounds().withoutNoUnsignedSignedWrap());
+  ResultPtr = Builder.CreateGEP(
+      IndexedTy, ResultPtr, LastLane, "",
+      getGEPNoWrapFlags().withoutInBounds().withoutNoUnsignedSignedWrap());
 
   State.set(this, ResultPtr, /*IsScalar*/ true);
 }
diff --git a/llvm/test/Transforms/LoopVectorize/vplan-noinbounds-gep.ll b/llvm/test/Transforms/LoopVectorize/vplan-noinbounds-gep.ll
index 99605971298fe9..e982bc5b702a17 100644
--- a/llvm/test/Transforms/LoopVectorize/vplan-noinbounds-gep.ll
+++ b/llvm/test/Transforms/LoopVectorize/vplan-noinbounds-gep.ll
@@ -27,8 +27,8 @@ define i1 @fn() local_unnamed_addr #0 {
 ; CHECK-NEXT:    [[TMP2:%.*]] = and <4 x i64> [[VEC_IND]], splat (i64 1)
 ; CHECK-NEXT:    [[TMP3:%.*]] = icmp eq <4 x i64> [[TMP2]], zeroinitializer
 ; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw [12 x i32], ptr [[NNO]], i64 0, i64 [[TMP0]]
-; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i32 0
-; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i32, ptr [[TMP5]], i32 -3
+; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr i32, ptr [[TMP4]], i32 0
+; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr i32, ptr [[TMP5]], i32 -3
 ; CHECK-NEXT:    [[REVERSE:%.*]] = shufflevector <4 x i1> [[TMP1]], <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
 ; CHECK-NEXT:    [[WIDE_MASKED_LOAD:%.*]] = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr [[TMP6]], i32 4, <4 x i1> [[REVERSE]], <4 x i32> poison)
 ; CHECK-NEXT:    [[REVERSE1:%.*]] = shufflevector <4 x i32> [[WIDE_MASKED_LOAD]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>

Created using spr 1.3.4
Created using spr 1.3.4
@fmayer fmayer requested a review from nikic December 20, 2024 13:40
Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, this is only an issue, if we are folding the tail, as then we may compute an address that we don't in the original scalar loop, so may not be inbounds? If we don't fold the tail, it should be fine to keep the inbounds flag?

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp Outdated Show resolved Hide resolved
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4
@fmayer
Copy link
Contributor Author

fmayer commented Dec 20, 2024

To clarify, this is only an issue, if we are folding the tail, as then we may compute an address that we don't in the original scalar loop, so may not be inbounds? If we don't fold the tail, it should be fine to keep the inbounds flag?

that sounds correct. regardless, i would prefer to submit a relatively trivial fix like this (nothing bad should happen from removing wrap flags), and take it from there afterwards. wdyt?

@fmayer fmayer requested a review from fhahn December 20, 2024 14:15
Copy link

github-actions bot commented Dec 20, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
fmayer added 5 commits January 3, 2025 04:01
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@fmayer fmayer changed the base branch from users/fmayer/spr/main.vectorizer-fix-geps-incorrectly-marked-as-inbounds to main January 3, 2025 14:43
Comment on lines 8346 to 8349
if (Reverse)
// N.B. we deliberately do pass getGEPNoWrapFlags here, because this
// transform can invalidate `inbounds`.
VectorPtr = new VPReverseVectorPointerRecipe(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that there are quite a number of unnecessary test changes now that all tests have been updated, I think it would be better to try and do the precise check straight-away, to avoid unneeded churn

I think the check should be something like below?

Suggested change
if (Reverse)
// N.B. we deliberately do pass getGEPNoWrapFlags here, because this
// transform can invalidate `inbounds`.
VectorPtr = new VPReverseVectorPointerRecipe(
if (Reverse) {
// When folding the tail, we may compute an address that we don't in the
// original scalar loop and it may not be inbounds. Drop Inbounds in that
// case.
GEPNoWrapFlags Flags =
(CM.foldTailByMasking() || !GEP || !GEP->isInBounds())
? GEPNoWrapFlags::none()
: GEPNoWrapFlags::inBounds();
VectorPtr = new VPReverseVectorPointerRecipe(
Ptr, &Plan.getVF(), getLoadStoreType(I), Flags, I->getDebugLoc());
} else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, sgtm. i had the same thought when i uploaded the diffs but then was gone for the holidays.

Created using spr 1.3.4
@fmayer fmayer requested a review from fhahn January 6, 2025 14:43
fmayer added 2 commits January 6, 2025 06:46
Created using spr 1.3.4
Created using spr 1.3.4
Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

Might be worth adjusting the title using [LV] instead of [Vectorizer] (which is more ambiguous) and maybe adding a bit more detail, something like

[LV] Drop inbounds for reverse vector pointers when folding tail.

@fmayer fmayer changed the title [Vectorizer] fix GEPs incorrectly marked as "inbounds" [LV] Drop incorrect inbounds for reverse vector pointer when folding tail Jan 7, 2025
@fmayer fmayer merged commit ef391db into main Jan 7, 2025
8 checks passed
@fmayer fmayer deleted the users/fmayer/spr/vectorizer-fix-geps-incorrectly-marked-as-inbounds branch January 7, 2025 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants