[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable #67812

Merged
merged 29 commits into from
Dec 12, 2024
Changes from 1 commit
6bdc81f
[LoopVectorize] Vectorize select-cmp reduction pattern for increasing
Mel-Chen Sep 25, 2023
bcab2a6
Refactor the identification method for FindLastIV pattern.
Mel-Chen Sep 25, 2023
dfa355b
Replace SelectInst with auto
Mel-Chen Oct 4, 2023
734da15
Drop parentheses
Mel-Chen Oct 4, 2023
2c896c4
Modified lambda to capture all external variables by reference
Mel-Chen Oct 4, 2023
1946c8c
Fix format
Mel-Chen Oct 4, 2023
e8d5b1d
Clean up comments
Mel-Chen Oct 4, 2023
d2bfe2f
Refine the expression of valid range
Mel-Chen Oct 6, 2023
4a88b44
Fix the warning caused by unused parameter
Mel-Chen Nov 27, 2023
d21b127
Fix typo
Mel-Chen Nov 27, 2023
2fe0a94
Clean comment in test case
Mel-Chen Nov 27, 2023
7ebc7d8
Fix SE pass
Mel-Chen Apr 24, 2024
3e2e9f1
Add TODO comment for the decreasing induction
Mel-Chen Apr 24, 2024
11900cd
Refine comments
Mel-Chen May 6, 2024
2924bf9
Refine debug dump
Mel-Chen May 6, 2024
dd21cd3
Refine comments
Mel-Chen May 10, 2024
76e91cc
Revert "Refine the expression of valid range"
Mel-Chen May 10, 2024
4f743ab
Refine the expression of valid range
Mel-Chen May 10, 2024
556743a
Capture LHS and RHS of select instruction by match.
Mel-Chen May 10, 2024
a852eb6
Rebase, and update test cases
Mel-Chen Oct 17, 2024
4947e0f
Restrict FindLastIV idiom to single-use reduction phi.
Mel-Chen Oct 17, 2024
6e739d8
Remove unused variable
Mel-Chen Oct 17, 2024
7d2a8ec
Refine pattern matcher
Mel-Chen Oct 31, 2024
e530a3a
Refine lit test checker format
Mel-Chen Oct 31, 2024
e50439e
Remove unused CHECK
Mel-Chen Nov 28, 2024
ab0b4d3
Refine the comment for getSentinelValue
Mel-Chen Dec 2, 2024
97e0433
Replace if-condition with assert
Mel-Chen Dec 3, 2024
69cd172
Add comment for wrap check
Mel-Chen Dec 3, 2024
5ebec5d
Add check for ensure the loop of SCEVAddRec is the same as the loop o…
Mel-Chen Dec 3, 2024
[LoopVectorize] Vectorize select-cmp reduction pattern for increasing
integer induction variable

Consider the following loop:

  int rdx = init;
  for (int i = 0; i < n; ++i)
    rdx = (a[i] > b[i]) ? i : rdx;

We can vectorize this loop if `i` is an increasing induction variable.
The final reduced value is then the maximum value of `i` for which the
condition `a[i] > b[i]` holds, or the start value `init` if it never holds.

This patch adds two new RecurKind enumerators: IFindLastIV and FFindLastIV.
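
The equivalence the commit message relies on can be illustrated outside LLVM. The sketch below is plain C++ and is not code from the patch; `find_last_scalar` and `find_last_as_max` are hypothetical names chosen for this example. It places the scalar loop next to a formulation that takes the maximum matching index and falls back to `init` when nothing matched.

```cpp
#include <optional>
#include <vector>

// Scalar loop from the commit message: rdx ends up as the last matching i.
int find_last_scalar(const std::vector<int> &a, const std::vector<int> &b,
                     int init) {
  int rdx = init;
  for (int i = 0; i < static_cast<int>(a.size()); ++i)
    rdx = (a[i] > b[i]) ? i : rdx;
  return rdx;
}

// Same result expressed as "maximum matching index, else init". This only
// holds because i increases monotonically, so the last match is the largest.
int find_last_as_max(const std::vector<int> &a, const std::vector<int> &b,
                     int init) {
  std::optional<int> best;
  for (int i = 0; i < static_cast<int>(a.size()); ++i)
    if (a[i] > b[i] && (!best || i > *best))
      best = i;
  return best.value_or(init);
}
```

The second form is the one that maps onto a max reduction, since a maximum can be computed per lane and combined afterwards in any order.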
Mel-Chen committed Dec 12, 2024
commit 6bdc81fb4a377528c79d705dbd020c5674cd15cd
39 changes: 36 additions & 3 deletions llvm/include/llvm/Analysis/IVDescriptors.h
@@ -50,9 +50,16 @@ enum class RecurKind {
FMulAdd, ///< Sum of float products with llvm.fmuladd(a * b + sum).
IAnyOf, ///< Any_of reduction with select(icmp(),x,y) where one of (x,y) is
///< loop invariant, and both x and y are integer type.
FAnyOf ///< Any_of reduction with select(fcmp(),x,y) where one of (x,y) is
FAnyOf, ///< Any_of reduction with select(fcmp(),x,y) where one of (x,y) is
///< loop invariant, and both x and y are integer type.
// TODO: Any_of reduction need not be restricted to integer type only.
IFindLastIV, ///< FindLast reduction with select(icmp(),x,y) where one of
///< (x,y) is increasing loop induction PHI, and both x and y are
///< integer type.
FFindLastIV ///< FindLast reduction with select(fcmp(),x,y) where one of (x,y)
///< is increasing loop induction PHI, and both x and y are
///< integer type.
// TODO: Any_of and FindLast reduction need not be restricted to integer type
// only.
};

/// The RecurrenceDescriptor is used to identify recurrences variables in a
@@ -124,7 +131,7 @@ class RecurrenceDescriptor {
/// the returned struct.
static InstDesc isRecurrenceInstr(Loop *L, PHINode *Phi, Instruction *I,
RecurKind Kind, InstDesc &Prev,
FastMathFlags FuncFMF);
FastMathFlags FuncFMF, ScalarEvolution *SE);

/// Returns true if instruction I has multiple uses in Insts
static bool hasMultipleUsesOf(Instruction *I,
@@ -151,6 +158,16 @@ class RecurrenceDescriptor {
static InstDesc isAnyOfPattern(Loop *Loop, PHINode *OrigPhi, Instruction *I,
InstDesc &Prev);

/// Returns a struct describing whether the instruction is either a
/// Select(ICmp(A, B), X, Y), or
/// Select(FCmp(A, B), X, Y)
/// where one of (X, Y) is an increasing loop induction variable, and the
/// other is a PHI value.
// TODO: Support non-monotonic variable. FindLast does not need be restricted
// to increasing loop induction variables.
static InstDesc isFindLastIVPattern(PHINode *OrigPhi, Instruction *I,
ScalarEvolution &SE);

/// Returns a struct describing if the instruction is a
/// Select(FCmp(X, Y), (Z = X op PHINode), PHINode) instruction pattern.
static InstDesc isConditionalRdxPattern(RecurKind Kind, Instruction *I);
@@ -236,10 +253,26 @@ class RecurrenceDescriptor {
return Kind == RecurKind::IAnyOf || Kind == RecurKind::FAnyOf;
}

/// Returns true if the recurrence kind is of the form
/// select(cmp(),x,y) where one of (x,y) is increasing loop induction.
static bool isFindLastIVRecurrenceKind(RecurKind Kind) {
return Kind == RecurKind::IFindLastIV || Kind == RecurKind::FFindLastIV;
}

/// Returns the type of the recurrence. This type can be narrower than the
/// actual type of the Phi if the recurrence has been type-promoted.
Type *getRecurrenceType() const { return RecurrenceType; }

/// Returns the sentinel value used to replace the start value.
Value *getSentinelValue() const {
if (isFindLastIVRecurrenceKind(Kind)) {
Type *Ty = StartValue->getType();
return ConstantInt::get(
Ty, APInt::getSignedMinValue(Ty->getIntegerBitWidth()));
}
return nullptr;
}

/// Returns a reference to the instructions used for type-promoting the
/// recurrence.
const SmallPtrSet<Instruction *, 8> &getCastInsts() const { return CastInsts; }
6 changes: 6 additions & 0 deletions llvm/include/llvm/Transforms/Utils/LoopUtils.h
@@ -419,6 +419,12 @@ Value *createAnyOfReduction(IRBuilderBase &B, Value *Src,
const RecurrenceDescriptor &Desc,
PHINode *OrigPhi);

/// Create a reduction of the given vector \p Src for a reduction of the
/// kind RecurKind::IFindLastIV or RecurKind::FFindLastIV. The reduction
/// operation is described by \p Desc.
Value *createFindLastIVReduction(IRBuilderBase &B, Value *Src,
const RecurrenceDescriptor &Desc);

/// Create a generic reduction using a recurrence descriptor \p Desc
/// Fast-math-flags are propagated using the RecurrenceDescriptor.
Value *createReduction(IRBuilderBase &B, const RecurrenceDescriptor &Desc,
101 changes: 96 additions & 5 deletions llvm/lib/Analysis/IVDescriptors.cpp
@@ -51,6 +51,8 @@ bool RecurrenceDescriptor::isIntegerRecurrenceKind(RecurKind Kind) {
case RecurKind::UMin:
case RecurKind::IAnyOf:
case RecurKind::FAnyOf:
case RecurKind::IFindLastIV:
case RecurKind::FFindLastIV:
return true;
}
return false;
@@ -372,7 +374,7 @@ bool RecurrenceDescriptor::AddReductionVar(
// type-promoted).
if (Cur != Start) {
ReduxDesc =
isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF);
isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF, SE);
ExactFPMathInst = ExactFPMathInst == nullptr
? ReduxDesc.getExactFPMathInst()
: ExactFPMathInst;
@@ -658,6 +660,87 @@ RecurrenceDescriptor::isAnyOfPattern(Loop *Loop, PHINode *OrigPhi,
: RecurKind::FAnyOf);
}

// We are looking for loops that do something like this:
// int r = 0;
// for (int i = 0; i < n; i++) {
// if (src[i] > 3)
// r = i;
// }
// The reduction value (r) is derived from either the values of an increasing
// induction variable (i) sequence, or from the start value (0).
// The LLVM IR generated for such loops would be as follows:
// for.body:
// %r = phi i32 [ %spec.select, %for.body ], [ 0, %entry ]
// %i = phi i32 [ %inc, %for.body ], [ 0, %entry ]
// ...
// %cmp = icmp sgt i32 %5, 3
// %spec.select = select i1 %cmp, i32 %i, i32 %r
// %inc = add nsw i32 %i, 1
// ...
// Since 'i' is an increasing induction variable, the reduction value after the
// loop will be the maximum value of 'i' that the condition (src[i] > 3) is
// satisfied, or the start value (0 in the example above). When the start value
// of the increasing induction variable 'i' is greater than the minimum value of
// the data type, we can use the minimum value of the data type as a sentinel
// value to replace the start value. This allows us to perform a single
// reduction max operation to obtain the final reduction result.
// TODO: It is possible to solve the case where the start value is the minimum
// value of the data type or a non-constant value by using mask and multiple
// reduction operations.
RecurrenceDescriptor::InstDesc
RecurrenceDescriptor::isFindLastIVPattern(Loop *Loop, PHINode *OrigPhi,
Instruction *I, ScalarEvolution *SE) {
// Only match select with single use cmp condition.
// TODO: Only handle single use for now.
CmpInst::Predicate Pred;
if (!match(I, m_Select(m_OneUse(m_Cmp(Pred, m_Value(), m_Value())), m_Value(),
m_Value())))
return InstDesc(false, I);

SelectInst *SI = cast<SelectInst>(I);
Value *NonRdxPhi = nullptr;

if (OrigPhi == dyn_cast<PHINode>(SI->getTrueValue()))
NonRdxPhi = SI->getFalseValue();
else if (OrigPhi == dyn_cast<PHINode>(SI->getFalseValue()))
NonRdxPhi = SI->getTrueValue();
else
return InstDesc(false, I);

auto IsIncreasingLoopInduction = [&SE, &Loop](Value *V) {
auto *Phi = dyn_cast<PHINode>(V);
if (!Phi)
return false;

if (!SE)
return false;

InductionDescriptor ID;
if (!InductionDescriptor::isInductionPHI(Phi, Loop, SE, ID))
return false;

const SCEVAddRecExpr *AR = cast<SCEVAddRecExpr>(SE->getSCEV(Phi));
if (!AR->hasNoSignedWrap())
return false;

ConstantInt *IVStartValue = dyn_cast<ConstantInt>(ID.getStartValue());
if (!IVStartValue || IVStartValue->isMinSignedValue())
return false;

const SCEV *Step = ID.getStep();
return SE->isKnownPositive(Step);
};

// We are looking for selects of the form:
// select(cmp(), phi, loop_induction) or
// select(cmp(), loop_induction, phi)
if (!IsIncreasingLoopInduction(NonRdxPhi))
return InstDesc(false, I);

return InstDesc(I, isa<ICmpInst>(I->getOperand(0)) ? RecurKind::IFindLastIV
: RecurKind::FFindLastIV);
}
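
The conditions checked by `IsIncreasingLoopInduction` can be summarized outside LLVM. The following is a minimal sketch, not code from the patch: `IVInfo` and `isIncreasingIVForFindLast` are hypothetical names, and the boolean fields stand in for facts that the patch queries from `InductionDescriptor` and `ScalarEvolution`. A 64-bit induction variable is assumed.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical summary of an induction variable. The patch derives these
// facts from ScalarEvolution and InductionDescriptor rather than a struct.
struct IVInfo {
  bool HasConstantStart;
  int64_t Start; // meaningful only if HasConstantStart is true
  bool NoSignedWrap;
  bool StepKnownPositive;
};

// Mirrors the checks in the lambda, assuming a 64-bit induction variable.
bool isIncreasingIVForFindLast(const IVInfo &IV) {
  if (!IV.NoSignedWrap)
    return false; // a wrapping IV is not monotonically increasing
  if (!IV.HasConstantStart ||
      IV.Start == std::numeric_limits<int64_t>::min())
    return false; // the sentinel must be strictly below every IV value
  return IV.StepKnownPositive;
}
```

The start-value check matters because the sentinel is the signed minimum of the type: if the induction variable could itself take that value, a match on the first iteration could not be told apart from no match at all.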

RecurrenceDescriptor::InstDesc
RecurrenceDescriptor::isMinMaxPattern(Instruction *I, RecurKind Kind,
const InstDesc &Prev) {
@@ -756,10 +839,9 @@ RecurrenceDescriptor::isConditionalRdxPattern(RecurKind Kind, Instruction *I) {
return InstDesc(true, SI);
}

RecurrenceDescriptor::InstDesc
RecurrenceDescriptor::isRecurrenceInstr(Loop *L, PHINode *OrigPhi,
Instruction *I, RecurKind Kind,
InstDesc &Prev, FastMathFlags FuncFMF) {
RecurrenceDescriptor::InstDesc RecurrenceDescriptor::isRecurrenceInstr(
Loop *L, PHINode *OrigPhi, Instruction *I, RecurKind Kind, InstDesc &Prev,
FastMathFlags FuncFMF, ScalarEvolution *SE) {
assert(Prev.getRecKind() == RecurKind::None || Prev.getRecKind() == Kind);
switch (I->getOpcode()) {
default:
@@ -789,6 +871,8 @@ RecurrenceDescriptor::isRecurrenceInstr(Loop *L, PHINode *OrigPhi,
if (Kind == RecurKind::FAdd || Kind == RecurKind::FMul ||
Kind == RecurKind::Add || Kind == RecurKind::Mul)
return isConditionalRdxPattern(Kind, I);
if (isFindLastIVRecurrenceKind(Kind))
return isFindLastIVPattern(L, OrigPhi, I, SE);
[[fallthrough]];
case Instruction::FCmp:
case Instruction::ICmp:
@@ -893,6 +977,11 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop,
<< *Phi << "\n");
return true;
}
if (AddReductionVar(Phi, RecurKind::IFindLastIV, TheLoop, FMF, RedDes, DB, AC,
Contributor

Do we not also need one for FFindLastIV?

Contributor Author

Good question. We don't need to call AddReductionVar again for FFindLastIV.
This is because, at the end of isFindLastIVPattern, the returned kind is FFindLastIV rather than IFindLastIV when the select's condition is an fcmp instruction.

Collaborator

The AnyOf recurrence kinds do the same thing, although there is a separate check for FAnyOf below which wouldn't be reached if IAnyOf matched it first, and is just redundant work if there wasn't a match.

We would only see that when looking at iv-descriptors debug output, and there are no tests for that. This code at least reports the correct type for an fcmp.

I figure it's fine with just the single case here.

Contributor Author

It looks like #118393 is attempting to address this.

DT, SE)) {
LLVM_DEBUG(dbgs() << "Found a FindLastIV reduction PHI." << *Phi << "\n");
Contributor

It might be a good idea to indicate whether we've matched IFindLastIV or FFindLastIV because FindLastIV doesn't tell you which one.

Contributor Author

9a0bf28
Sure, I agree that explicitly indicating IFindLast or FFindLast is better.
The new modification directly determines whether to print IFindLast or FFindLast based on the final result. This way, there is no need to call AddReductionVar twice, and the correct result can be indicated.
If you feel that detecting IFindLast and FFindLast separately would be clearer, please let me know.

return true;
}
if (AddReductionVar(Phi, RecurKind::FMul, TheLoop, FMF, RedDes, DB, AC, DT,
SE)) {
LLVM_DEBUG(dbgs() << "Found an FMult reduction PHI." << *Phi << "\n");
@@ -1048,12 +1137,14 @@ unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) {
case RecurKind::UMax:
case RecurKind::UMin:
case RecurKind::IAnyOf:
case RecurKind::IFindLastIV:
return Instruction::ICmp;
case RecurKind::FMax:
case RecurKind::FMin:
case RecurKind::FMaximum:
case RecurKind::FMinimum:
case RecurKind::FAnyOf:
case RecurKind::FFindLastIV:
return Instruction::FCmp;
default:
llvm_unreachable("Unknown recurrence operation");
19 changes: 19 additions & 0 deletions llvm/lib/Transforms/Utils/LoopUtils.cpp
@@ -1208,6 +1208,23 @@ Value *llvm::createAnyOfReduction(IRBuilderBase &Builder, Value *Src,
return Builder.CreateSelect(AnyOf, NewVal, InitVal, "rdx.select");
}

Value *llvm::createFindLastIVReduction(IRBuilderBase &Builder, Value *Src,
const RecurrenceDescriptor &Desc) {
assert(RecurrenceDescriptor::isFindLastIVRecurrenceKind(
Desc.getRecurrenceKind()) &&
"Unexpected reduction kind");
Value *StartVal = Desc.getRecurrenceStartValue();
Value *Sentinel = Desc.getSentinelValue();
Value *MaxRdx = Src->getType()->isVectorTy()
? Builder.CreateIntMaxReduce(Src, true)
: Src;
// Correct the final reduction result back to the start value if the maximum
// reduction is sentinel value.
Value *Cmp =
Builder.CreateCmp(CmpInst::ICMP_NE, MaxRdx, Sentinel, "rdx.select.cmp");
return Builder.CreateSelect(Cmp, MaxRdx, StartVal, "rdx.select");
}
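
As a rough illustration of what the IR built by `createFindLastIVReduction` computes, the sketch below models the vector as a `std::vector<int32_t>`. `finalizeFindLastIV` and `Sentinel` are hypothetical names for this example only; the patch emits IR through `IRBuilderBase` instead, and the 32-bit element width is an assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel used in place of the start value: the signed minimum of the type.
constexpr int32_t Sentinel = std::numeric_limits<int32_t>::min();

// Stand-in for the exit-block code: signed max over the lanes, then map the
// sentinel back to the original start value. Lanes must be non-empty.
int32_t finalizeFindLastIV(const std::vector<int32_t> &Lanes,
                           int32_t StartVal) {
  int32_t MaxRdx = *std::max_element(Lanes.begin(), Lanes.end());
  return MaxRdx != Sentinel ? MaxRdx : StartVal;
}
```

If every lane still holds the sentinel, no iteration satisfied the condition, so the original start value is returned unchanged.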

Value *llvm::getReductionIdentity(Intrinsic::ID RdxID, Type *Ty,
FastMathFlags Flags) {
bool Negative = false;
@@ -1315,6 +1332,8 @@ Value *llvm::createReduction(IRBuilderBase &B,
RecurKind RK = Desc.getRecurrenceKind();
if (RecurrenceDescriptor::isAnyOfRecurrenceKind(RK))
return createAnyOfReduction(B, Src, Desc, OrigPhi);
if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK))
return createFindLastIVReduction(B, Src, Desc);

return createSimpleReduction(B, Src, RK);
}
11 changes: 7 additions & 4 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -5185,8 +5185,9 @@ LoopVectorizationCostModel::selectInterleaveCount(ElementCount VF,
HasReductions &&
any_of(Legal->getReductionVars(), [&](auto &Reduction) -> bool {
const RecurrenceDescriptor &RdxDesc = Reduction.second;
return RecurrenceDescriptor::isAnyOfRecurrenceKind(
RdxDesc.getRecurrenceKind());
RecurKind RK = RdxDesc.getRecurrenceKind();
return RecurrenceDescriptor::isAnyOfRecurrenceKind(RK) ||
RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK);
Contributor

Needs a test without a forced interleave count?

Contributor Author

I want to confirm whether you're trying to test this message?

   if (HasSelectCmpReductions) {
      LLVM_DEBUG(dbgs() << "LV: Not interleaving select-cmp reductions.\n");
      return 1;

or if you simply want to see test results without -force-vector-interleave?

});
if (HasSelectCmpReductions) {
LLVM_DEBUG(dbgs() << "LV: Not interleaving select-cmp reductions.\n");
@@ -9449,8 +9450,10 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(

const RecurrenceDescriptor &RdxDesc = PhiR->getRecurrenceDescriptor();
RecurKind Kind = RdxDesc.getRecurrenceKind();
assert(!RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
"AnyOf reductions are not allowed for in-loop reductions");
assert(
(!RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
!RecurrenceDescriptor::isFindLastIVRecurrenceKind(Kind)) &&
"AnyOf and FindLast reductions are not allowed for in-loop reductions");

// Collect the chain of "link" recipes for the reduction starting at PhiR.
SetVector<VPSingleDefRecipe *> Worklist;
4 changes: 4 additions & 0 deletions llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -20451,6 +20451,8 @@ class HorizontalReduction {
case RecurKind::FMulAdd:
case RecurKind::IAnyOf:
case RecurKind::FAnyOf:
case RecurKind::IFindLastIV:
case RecurKind::FFindLastIV:
case RecurKind::None:
llvm_unreachable("Unexpected reduction kind for repeated scalar.");
}
@@ -20548,6 +20550,8 @@ class HorizontalReduction {
case RecurKind::FMulAdd:
case RecurKind::IAnyOf:
case RecurKind::FAnyOf:
case RecurKind::IFindLastIV:
case RecurKind::FFindLastIV:
case RecurKind::None:
llvm_unreachable("Unexpected reduction kind for reused scalars.");
}
20 changes: 19 additions & 1 deletion llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -567,6 +567,9 @@ Value *VPInstruction::generate(VPTransformState &State) {
if (Op != Instruction::ICmp && Op != Instruction::FCmp)
ReducedPartRdx = Builder.CreateBinOp(
(Instruction::BinaryOps)Op, RdxPart, ReducedPartRdx, "bin.rdx");
else if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK))
ReducedPartRdx =
createMinMaxOp(Builder, RecurKind::SMax, ReducedPartRdx, RdxPart);
else
ReducedPartRdx = createMinMaxOp(Builder, RK, ReducedPartRdx, RdxPart);
}
@@ -575,7 +578,8 @@ Value *VPInstruction::generate(VPTransformState &State) {
// Create the reduction after the loop. Note that inloop reductions create
// the target reduction in the loop using a Reduction recipe.
if ((State.VF.isVector() ||
RecurrenceDescriptor::isAnyOfRecurrenceKind(RK)) &&
RecurrenceDescriptor::isAnyOfRecurrenceKind(RK) ||
RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK)) &&
!PhiR->isInLoop()) {
ReducedPartRdx =
createReduction(Builder, RdxDesc, ReducedPartRdx, OrigPhi);
@@ -3398,6 +3402,20 @@ void VPReductionPHIRecipe::execute(VPTransformState &State) {
Builder.SetInsertPoint(VectorPH->getTerminator());
StartV = Iden = State.get(StartVPV);
}
} else if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK)) {
// [I|F]FindLastIV will use a sentinel value to initialize the reduction
// phi. In the exit block, ComputeReductionResult will generate checks to
// verify if the reduction result is the sentinel value. If the result is
// the sentinel value, it will be corrected back to the start value.
// TODO: The sentinel value is not always necessary. When the start value is
// a constant, and smaller than the start value of the induction variable,
// the start value can be directly used to initialize the reduction phi.
StartV = Iden = RdxDesc.getSentinelValue();
if (!ScalarPHI) {
IRBuilderBase::InsertPointGuard IPBuilder(Builder);
Builder.SetInsertPoint(VectorPH->getTerminator());
StartV = Iden = Builder.CreateVectorSplat(State.VF, Iden);
}
} else {
Iden = llvm::getRecurrenceIdentity(RK, VecTy->getScalarType(),
RdxDesc.getFastMathFlags());
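
The sentinel initialization described in the comment above, together with the `SMax` merge of unrolled parts in `VPInstruction::generate`, can be modeled in plain C++. This is only a sketch under stated assumptions: `findLastIVLanes`, `Vec`, and the fixed `VF` and `UF` values are hypothetical, the induction variable is assumed to start at 0 with step 1, and remainder iterations are ignored.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

constexpr std::size_t VF = 4; // lanes per vector (assumed for illustration)
constexpr std::size_t UF = 2; // interleave count (assumed for illustration)
using Vec = std::array<int32_t, VF>;

// Emulates the vector body: every part's reduction phi starts as a splat of
// the sentinel, each lane keeps the last matching index it saw, and the parts
// are merged with a lane-wise signed max. Src.size() must be a multiple of
// VF * UF; remainder handling is omitted.
Vec findLastIVLanes(const std::vector<int32_t> &Src, int32_t Threshold) {
  constexpr int32_t Sentinel = std::numeric_limits<int32_t>::min();
  std::array<Vec, UF> Phi;
  for (Vec &Part : Phi)
    Part.fill(Sentinel);

  for (std::size_t Base = 0; Base < Src.size(); Base += VF * UF)
    for (std::size_t P = 0; P < UF; ++P)
      for (std::size_t L = 0; L < VF; ++L) {
        std::size_t I = Base + P * VF + L;
        if (Src[I] > Threshold) // select(cmp, iv, phi)
          Phi[P][L] = static_cast<int32_t>(I);
      }

  Vec Merged = Phi[0];
  for (std::size_t P = 1; P < UF; ++P)
    for (std::size_t L = 0; L < VF; ++L)
      Merged[L] = std::max(Merged[L], Phi[P][L]);
  return Merged;
}
```

Lanes that never matched still hold the sentinel after the merge, which is what the exit-block check relies on to restore the start value.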
6 changes: 3 additions & 3 deletions llvm/test/Transforms/LoopVectorize/iv-select-cmp.ll
@@ -1,7 +1,7 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
; RUN: opt -passes=loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -S < %s | FileCheck %s --check-prefix=CHECK
; RUN: opt -passes=loop-vectorize -force-vector-interleave=4 -force-vector-width=4 -S < %s | FileCheck %s --check-prefix=CHECK
; RUN: opt -passes=loop-vectorize -force-vector-interleave=4 -force-vector-width=1 -S < %s | FileCheck %s --check-prefix=CHECK
; RUN: opt -passes=loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -S < %s | FileCheck %s --check-prefix=CHECK-VF4IC1 --check-prefix=CHECK
; RUN: opt -passes=loop-vectorize -force-vector-interleave=4 -force-vector-width=4 -S < %s | FileCheck %s --check-prefix=CHECK-VF4IC4 --check-prefix=CHECK
; RUN: opt -passes=loop-vectorize -force-vector-interleave=4 -force-vector-width=1 -S < %s | FileCheck %s --check-prefix=CHECK-VF1IC4 --check-prefix=CHECK

define i64 @select_icmp_const_1(ptr %a, i64 %n) {
Contributor

Both @select_icmp_const_1 and @select_icmp_const_2 look similar to test select_icmp_nuw_nsw in Transforms/LoopVectorize/iv-select-cmp-no-wrap.ll.

Also, I see the only difference between @select_icmp_const_1 and @select_icmp_const_2 is the operands to the select are swapped. I'm not sure having both versions really adds much value. Perhaps you can remove both of them and leave the one in iv-select-cmp-no-wrap.ll

Contributor Author

They are indeed quite similar. iv-select-cmp.ll is for testing IR generation, including the case of UF > 1. iv-select-cmp-no-wrap.ll is for testing whether vectorization is legal.
My suggestion is to remove select_icmp_nuw_nsw from iv-select-cmp-no-wrap.ll, and retain @select_icmp_const_1 and @select_icmp_const_2.
What do you think?

Contributor Author

@david-arm Ping.
Regarding my test file proposal, what do you think?

; CHECK-LABEL: define i64 @select_icmp_const_1(
6 changes: 3 additions & 3 deletions llvm/test/Transforms/LoopVectorize/select-min-index.ll
Contributor

Maybe just delete this file, since it looks practically the same as llvm/test/Transforms/LoopVectorize/iv-select-cmp-no-wrap.ll and doesn't seem to offer any extra value?

Contributor Author

@Mel-Chen Mel-Chen May 6, 2024

No, we must retain llvm/test/Transforms/LoopVectorize/select-min-index.ll. That file covers a different idiom: min/max with index. It is similar to FindLastIV but not the same.
BTW, this patch is separated from the minmax with index patch.

int min = start_min;
int idx = start_idx;
for (int i = 0; i < n; i++)
  if (min > a[i]) {
    min = a[i];
    idx = i;
  }

// live-out: idx
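
One way to see the difference is to run both idioms on the same data. The sketch below is plain C++ written for this illustration; `minIndex` and `findLast` are hypothetical names, and `findLast` uses a loop-invariant `threshold` as a stand-in for the FindLastIV condition.

```cpp
#include <vector>

// Pattern from the comment above: track the minimum and the index where it
// was found. Only idx is live out.
int minIndex(const std::vector<int> &a, int start_min, int start_idx) {
  int min = start_min;
  int idx = start_idx;
  for (int i = 0; i < static_cast<int>(a.size()); i++)
    if (min > a[i]) {
      min = a[i];
      idx = i;
    }
  return idx;
}

// FindLastIV: the condition does not depend on another reduction, so the
// result is simply the largest i that satisfied it.
int findLast(const std::vector<int> &a, int threshold, int start_idx) {
  int idx = start_idx;
  for (int i = 0; i < static_cast<int>(a.size()); i++)
    if (threshold > a[i])
      idx = i;
  return idx;
}
```

In `minIndex` the compare reads `min`, which is itself a recurrence, so the final `idx` is the position of the minimum rather than the largest index that passed a fixed test.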

@@ -1,7 +1,7 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
; RUN: opt -passes=loop-vectorize -force-vector-width=4 -force-vector-interleave=1 -S %s | FileCheck %s
; RUN: opt -passes=loop-vectorize -force-vector-width=4 -force-vector-interleave=2 -S %s | FileCheck %s
; RUN: opt -passes=loop-vectorize -force-vector-width=1 -force-vector-interleave=2 -S %s | FileCheck %s
; RUN: opt -passes=loop-vectorize -force-vector-width=4 -force-vector-interleave=1 -S %s | FileCheck %s --check-prefix=CHECK-VF4IC1 --check-prefix=CHECK
; RUN: opt -passes=loop-vectorize -force-vector-width=4 -force-vector-interleave=2 -S %s | FileCheck %s --check-prefix=CHECK-VF4IC2 --check-prefix=CHECK
; RUN: opt -passes=loop-vectorize -force-vector-width=1 -force-vector-interleave=2 -S %s | FileCheck %s --check-prefix=CHECK-VF1IC2 --check-prefix=CHECK

; Test cases for selecting the index with the minimum value.
