-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[VPlan] Retain exit conditions and edges in initial VPlan (NFC). #137709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-vectorizers Author: Florian Hahn (fhahn) ChangesUpdate initial VPlan construction to include exit conditions and edges. For now, all early exits are disconnected before forming the regions, but a follow-up will update uncountable exit handling to also happen here. This is required to enable VPlan predication and remove the dependence any IR BBs (#128420). This includes updates in a few places to use Full diff: https://github.com/llvm/llvm-project/pull/137709.diff 4 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 4684378687ef6..bd043ae5e06e8 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -9167,6 +9167,7 @@ void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF,
// loop.
static void addCanonicalIVRecipes(VPlan &Plan, Type *IdxTy, bool HasNUW,
DebugLoc DL) {
+ using namespace VPlanPatternMatch;
Value *StartIdx = ConstantInt::get(IdxTy, 0);
auto *StartV = Plan.getOrAddLiveIn(StartIdx);
@@ -9176,7 +9177,16 @@ static void addCanonicalIVRecipes(VPlan &Plan, Type *IdxTy, bool HasNUW,
VPBasicBlock *Header = TopRegion->getEntryBasicBlock();
Header->insert(CanonicalIVPHI, Header->begin());
- VPBuilder Builder(TopRegion->getExitingBasicBlock());
+ VPBasicBlock *LatchVPBB = TopRegion->getExitingBasicBlock();
+ // We are about to replace the branch to exit the region. Remove the original
+ // BranchOnCond, if there is any.
+ // TODO: Move canonical IV and BranchOnCount introduction to initial skeleton
+ // creation.
+ if (!LatchVPBB->empty() &&
+ match(&LatchVPBB->back(), m_BranchOnCond(m_VPValue())))
+ LatchVPBB->getTerminator()->eraseFromParent();
+
+ VPBuilder Builder(LatchVPBB);
// Add a VPInstruction to increment the scalar canonical IV by VF * UF.
auto *CanonicalIVIncrement = Builder.createOverflowingOp(
Instruction::Add, {CanonicalIVPHI, &Plan.getVFxUF()}, {HasNUW, false}, DL,
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index afad73bcd3501..5d81bcdc4a21e 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -117,6 +117,7 @@ class VPBlockBase {
Predecessors.erase(Pos);
}
+public:
/// Remove \p Successor from the successors of this block.
void removeSuccessor(VPBlockBase *Successor) {
auto Pos = find(Successors, Successor);
@@ -129,8 +130,6 @@ class VPBlockBase {
void replacePredecessor(VPBlockBase *Old, VPBlockBase *New) {
auto I = find(Predecessors, Old);
assert(I != Predecessors.end());
- assert(Old->getParent() == New->getParent() &&
- "replaced predecessor must have the same parent");
*I = New;
}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
index 58d63933fb724..af198bbf57df6 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
@@ -112,6 +112,9 @@ VPBasicBlock *PlainCFGBuilder::getOrCreateVPBB(BasicBlock *BB) {
return VPBB;
}
+ if (!TheLoop->contains(BB))
+ return Plan->getExitBlock(BB);
+
// Create new VPBB.
StringRef Name = BB->getName();
LLVM_DEBUG(dbgs() << "Creating VPBasicBlock for " << Name << "\n");
@@ -145,14 +148,6 @@ bool PlainCFGBuilder::isExternalDef(Value *Val) {
// Instruction definition is in outermost loop PH.
return false;
- // Check whether Instruction definition is in a loop exit.
- SmallVector<BasicBlock *> ExitBlocks;
- TheLoop->getExitBlocks(ExitBlocks);
- if (is_contained(ExitBlocks, InstParent)) {
- // Instruction definition is in outermost loop exit.
- return false;
- }
-
// Check whether Instruction definition is in loop body.
return !TheLoop->contains(Inst);
}
@@ -201,11 +196,6 @@ void PlainCFGBuilder::createVPInstructionsForVPBB(VPBasicBlock *VPBB,
"Instruction shouldn't have been visited.");
if (auto *Br = dyn_cast<BranchInst>(Inst)) {
- if (TheLoop->getLoopLatch() == BB ||
- any_of(successors(BB),
- [this](BasicBlock *Succ) { return !TheLoop->contains(Succ); }))
- continue;
-
// Conditional branch instruction are represented using BranchOnCond
// recipes.
if (Br->isConditional()) {
@@ -295,7 +285,6 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG(
for (BasicBlock *BB : RPO) {
// Create or retrieve the VPBasicBlock for this BB.
VPBasicBlock *VPBB = getOrCreateVPBB(BB);
- Loop *LoopForBB = LI->getLoopFor(BB);
// Set VPBB predecessors in the same order as they are in the incoming BB.
setVPBBPredsFromBB(VPBB, BB);
@@ -326,24 +315,12 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG(
BasicBlock *IRSucc1 = BI->getSuccessor(1);
VPBasicBlock *Successor0 = getOrCreateVPBB(IRSucc0);
VPBasicBlock *Successor1 = getOrCreateVPBB(IRSucc1);
-
- // Don't connect any blocks outside the current loop except the latches for
- // inner loops.
- // TODO: Also connect exit blocks during initial VPlan construction.
- if (LoopForBB == TheLoop || BB != LoopForBB->getLoopLatch()) {
- if (!LoopForBB->contains(IRSucc0)) {
- VPBB->setOneSuccessor(Successor1);
- continue;
- }
- if (!LoopForBB->contains(IRSucc1)) {
- VPBB->setOneSuccessor(Successor0);
- continue;
- }
- }
-
VPBB->setTwoSuccessors(Successor0, Successor1);
}
+ for (auto *EB : Plan->getExitBlocks())
+ setVPBBPredsFromBB(EB, EB->getIRBasicBlock());
+
// 2. The whole CFG has been built at this point so all the input Values must
// have a VPlan counterpart. Fix VPlan header phi by adding their
// corresponding VPlan operands.
@@ -446,19 +423,21 @@ static void createLoopRegion(VPlan &Plan, VPBlockBase *HeaderVPB) {
VPBlockBase *Succ = LatchVPBB->getSingleSuccessor();
assert(LatchVPBB->getNumSuccessors() <= 1 &&
"Latch has more than one successor");
- if (Succ)
- VPBlockUtils::disconnectBlocks(LatchVPBB, Succ);
+ LatchVPBB->removeSuccessor(Succ);
auto *R = Plan.createVPRegionBlock(HeaderVPB, LatchVPBB, "",
false /*isReplicator*/);
// All VPBB's reachable shallowly from HeaderVPB belong to top level loop,
// because VPlan is expected to end at top level latch disconnected above.
+ SmallPtrSet<VPBlockBase *, 2> ExitBlocks(Plan.getExitBlocks().begin(),
+ Plan.getExitBlocks().end());
for (VPBlockBase *VPBB : vp_depth_first_shallow(HeaderVPB))
- VPBB->setParent(R);
+ if (!ExitBlocks.contains(VPBB))
+ VPBB->setParent(R);
VPBlockUtils::insertBlockAfter(R, PreheaderVPBB);
- if (Succ)
- VPBlockUtils::connectBlocks(R, Succ);
+ R->setOneSuccessor(Succ);
+ Succ->replacePredecessor(LatchVPBB, R);
}
void VPlanTransforms::prepareForVectorization(VPlan &Plan, Type *InductionTy,
@@ -476,8 +455,29 @@ void VPlanTransforms::prepareForVectorization(VPlan &Plan, Type *InductionTy,
VPBlockUtils::insertBlockAfter(VecPreheader, Plan.getEntry());
VPBasicBlock *MiddleVPBB = Plan.createVPBasicBlock("middle.block");
- VPBlockUtils::connectBlocks(LatchVPB, MiddleVPBB);
- LatchVPB->swapSuccessors();
+ VPBlockBase *LatchExitVPB = LatchVPB->getNumSuccessors() == 2
+ ? LatchVPB->getSuccessors()[0]
+ : nullptr;
+ if (LatchExitVPB) {
+ LatchVPB->getSuccessors()[0] = MiddleVPBB;
+ MiddleVPBB->setPredecessors({LatchVPB});
+ MiddleVPBB->setSuccessors({LatchExitVPB});
+ LatchExitVPB->replacePredecessor(LatchVPB, MiddleVPBB);
+ } else {
+ VPBlockUtils::connectBlocks(LatchVPB, MiddleVPBB);
+ LatchVPB->swapSuccessors();
+ }
+
+ // Disconnect all edges between exit blocks other than from the latch.
+ // TODO: Uncountable exit blocks should be handled here.
+ for (VPBlockBase *EB : to_vector(Plan.getExitBlocks())) {
+ for (VPBlockBase *Pred : to_vector(EB->getPredecessors())) {
+ if (Pred == MiddleVPBB)
+ continue;
+ cast<VPBasicBlock>(Pred)->getTerminator()->eraseFromParent();
+ VPBlockUtils::disconnectBlocks(Pred, EB);
+ }
+ }
// Create SCEV and VPValue for the trip count.
// We use the symbolic max backedge-taken-count, which works also when
@@ -503,8 +503,9 @@ void VPlanTransforms::prepareForVectorization(VPlan &Plan, Type *InductionTy,
// Thus if tail is to be folded, we know we don't need to run the
// remainder and we can set the condition to true.
// 3) Otherwise, construct a runtime check.
-
if (!RequiresScalarEpilogueCheck) {
+ if (LatchExitVPB)
+ VPBlockUtils::disconnectBlocks(MiddleVPBB, LatchExitVPB);
VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
// The exit blocks are unreachable, remove their recipes to make sure no
// users remain that may pessimize transforms.
@@ -516,9 +517,6 @@ void VPlanTransforms::prepareForVectorization(VPlan &Plan, Type *InductionTy,
}
// The connection order corresponds to the operands of the conditional branch.
- BasicBlock *IRExitBlock = TheLoop->getUniqueLatchExitBlock();
- auto *VPExitBlock = Plan.getExitBlock(IRExitBlock);
- VPBlockUtils::connectBlocks(MiddleVPBB, VPExitBlock);
VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
auto *ScalarLatchTerm = TheLoop->getLoopLatch()->getTerminator();
diff --git a/llvm/test/Transforms/LoopVectorize/vplan-printing-outer-loop.ll b/llvm/test/Transforms/LoopVectorize/vplan-printing-outer-loop.ll
index 91a5ea6b7fe36..fe845ae74cbee 100644
--- a/llvm/test/Transforms/LoopVectorize/vplan-printing-outer-loop.ll
+++ b/llvm/test/Transforms/LoopVectorize/vplan-printing-outer-loop.ll
@@ -31,7 +31,11 @@ define void @foo(i64 %n) {
; CHECK-NEXT: outer.latch:
; CHECK-NEXT: EMIT ir<%outer.iv.next> = add ir<%outer.iv>, ir<1>
; CHECK-NEXT: EMIT ir<%outer.ec> = icmp ir<%outer.iv.next>, ir<8>
-; CHECK-NEXT: Successor(s): outer.header
+; CHECK-NEXT: EMIT branch-on-cond ir<%outer.ec>
+; CHECK-NEXT: Successor(s): ir-bb<exit>, outer.header
+; CHECK-EMPTY:
+; CHECK-NEXT: ir-bb<exit>:
+; CHECK-NEXT: No successors
; CHECK-NEXT: }
entry:
br label %outer.header
|
Move early-exit handling up front to original VPlan construction, before introducing early exits. This builds on llvm#137709, which adds exiting edges to the original VPlan, instead of adding exit blocks later. This retains the exit conditions early, and means we can handle early exits before forming regions, without the reliance on VPRecipeBuilder. Once we retain all exits initially, handling early exits before region construction ensures the regions are valid; otherwise we would leave edges exiting the region from elsewhere than the latch. Removing the reliance on VPRecipeBuilder removes the dependence on mapping IR BBs to VPBBs and unblocks predication as VPlan transform: llvm#128420. Depends on llvm#137709.
Update initial VPlan construction to include exit conditions and edges. For now, all early exits are disconnected before forming the regions, but a follow-up will update uncountable exit handling to also happen here. This is required to enable VPlan predication and remove the dependence any IR BBs (llvm#128420). This includes updates in a few places to use replaceSuccessor/replacePredecessor to preserve the order of predecessors and successors, to reduce the need of fixing up phi operand orderings. This unfortunately required making them public, not sure if there's a
Move early-exit handling up front to original VPlan construction, before introducing early exits. This builds on llvm#137709, which adds exiting edges to the original VPlan, instead of adding exit blocks later. This retains the exit conditions early, and means we can handle early exits before forming regions, without the reliance on VPRecipeBuilder. Once we retain all exits initially, handling early exits before region construction ensures the regions are valid; otherwise we would leave edges exiting the region from elsewhere than the latch. Removing the reliance on VPRecipeBuilder removes the dependence on mapping IR BBs to VPBBs and unblocks predication as VPlan transform: llvm#128420. Depends on llvm#137709.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raises some thoughts... (trying to address the incomplete "not sure if there's a" ending of the commit message)
// Check whether Instruction definition is in a loop exit. | ||
SmallVector<BasicBlock *> ExitBlocks; | ||
TheLoop->getExitBlocks(ExitBlocks); | ||
if (is_contained(ExitBlocks, InstParent)) { | ||
// Instruction definition is in outermost loop exit. | ||
return false; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this simply redundant, independently, as such instructions which reside in ExitBlocks are not contained in TheLoop?
Should the above definition of external defs be updated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, removed as NFC in 282af2d
for (VPBlockBase *VPBB : vp_depth_first_shallow(HeaderVPB)) | ||
VPBB->setParent(R); | ||
if (!ExitBlocks.contains(VPBB)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can now reach exit blocks, contrary to above comment, via early exits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, comment updated, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm2, createLoopRegion()
is called after prepareForVectorization()
, which according to the changes below should have removed all early-exit edges, so is this check (if VPBB is an exit block) needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not reachable in the latest version, removed thanks!
|
||
addCanonicalIVRecipes(Plan, cast<VPBasicBlock>(HeaderVPB), | ||
cast<VPBasicBlock>(LatchVPB), InductionTy, IVDL); | ||
|
||
// Disconnect all edges between exit blocks other than from the latch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Disconnect all edges between exit blocks other than from the latch. | |
// Disconnect all edges to exit blocks other than from the middle block. |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks!
if (!TheLoop->contains(BB)) | ||
return Plan->getExitBlock(BB); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about populating BB2VPBB with Plan's exit blocks on PlainCFGBuilder's construction, to fold this check with the one above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks!
assert(LatchVPBB->getNumSuccessors() <= 1 && | ||
"Latch has more than one successor"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert(LatchVPBB->getNumSuccessors() <= 1 && | |
"Latch has more than one successor"); | |
assert(Succ && "Latch expected to be left with a single successor"); |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks!
for (VPBlockBase *VPBB : vp_depth_first_shallow(HeaderVPB)) | ||
VPBB->setParent(R); | ||
if (!ExitBlocks.contains(VPBB)) | ||
VPBB->setParent(R); | ||
|
||
VPBlockUtils::insertBlockAfter(R, PreheaderVPBB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PlaceHolder
introduced earlier may now be used instead as follows
VPBlockUtils::insertBlockAfter(R, PreheaderVPBB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks
VPBlockBase *LatchExitVPB = LatchVPB->getNumSuccessors() == 2 | ||
? LatchVPB->getSuccessors()[0] | ||
: nullptr; | ||
if (LatchExitVPB) { | ||
LatchVPB->getSuccessors()[0] = MiddleVPBB; | ||
MiddleVPBB->setPredecessors({LatchVPB}); | ||
MiddleVPBB->setSuccessors({LatchExitVPB}); | ||
LatchExitVPB->replacePredecessor(LatchVPB, MiddleVPBB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VPBlockBase *LatchExitVPB = LatchVPB->getNumSuccessors() == 2 | |
? LatchVPB->getSuccessors()[0] | |
: nullptr; | |
if (LatchExitVPB) { | |
LatchVPB->getSuccessors()[0] = MiddleVPBB; | |
MiddleVPBB->setPredecessors({LatchVPB}); | |
MiddleVPBB->setSuccessors({LatchExitVPB}); | |
LatchExitVPB->replacePredecessor(LatchVPB, MiddleVPBB); | |
// Canonical LatchVPB has header block as last successor. If it has another | |
// successor, the latter is an exit block - insert middle block on its edge. | |
// Otherwise, add middle block as another successor retaining header as last. | |
if (LatchVPB->getNumSuccessors() == 2) { | |
VPBlockBase *LatchExitVPB = LatchVPB->getSuccessors()[0]; | |
VPBlockUtils::insertOnEdge(LatchVPB, LatchExitVPB, MiddleVPBB); |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks!
|
||
addCanonicalIVRecipes(Plan, cast<VPBasicBlock>(HeaderVPB), | ||
cast<VPBasicBlock>(LatchVPB), InductionTy, IVDL); | ||
|
||
// Disconnect all edges between exit blocks other than from the latch. | ||
// TODO: Uncountable exit blocks should be handled here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are "handled", being listed in getExitBlocks; but should be handled differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, here for now we just simply drop the edges, i.e. leaving the plan in an incorrect/incomplete state, relying on the later transform to adjust the exit condition and introduce an extra middle block, looking up the early exit condition using original IR references.
if (!RequiresScalarEpilogueCheck) { | ||
if (LatchExitVPB) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (LatchExitVPB) | |
if (auto *LatchExitVPB = MiddleVPBB->getSingleSuccessor()) |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done thanks!
@@ -117,6 +117,7 @@ class VPBlockBase { | |||
Predecessors.erase(Pos); | |||
} | |||
|
|||
public: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully this low-level API can remain private, possibly with an addition to the high-level public API, see below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks!
Remove checking if the instruction is in the preheader or exit blocks. Those checks are redundant and handled by checking if the instruction is outside the loop below. Split off as suggested from #137709.
Flip the region construction order to innermost first from outermost first. This ensures we only set the final parent region for VPBBs once. Split off from #137709.
Move early-exit handling up front to original VPlan construction, before introducing early exits. This builds on llvm#137709, which adds exiting edges to the original VPlan, instead of adding exit blocks later. This retains the exit conditions early, and means we can handle early exits before forming regions, without the reliance on VPRecipeBuilder. Once we retain all exits initially, handling early exits before region construction ensures the regions are valid; otherwise we would leave edges exiting the region from elsewhere than the latch. Removing the reliance on VPRecipeBuilder removes the dependence on mapping IR BBs to VPBBs and unblocks predication as VPlan transform: llvm#128420. Depends on llvm#137709.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit message needs updating (and completion..)
TODO: Suspect mergeBlocksIntoPredecessors()
might lose predecessor positions as it doesn't check if the empty block's successors have a single predecessor (though that may be the case currently; otherwise should probably not be considered atm). Supplying removeOnEdge(Block)
which disconnects Block from its single Pred and single Succ, reconnecting them together while retaining pred and succ positions may help, but an empty block being merged into its predecessor may have multiple successors - rather than being on an edge.
any_of(successors(BB), | ||
[this](BasicBlock *Succ) { return !TheLoop->contains(Succ); })) | ||
continue; | ||
|
||
// Conditional branch instruction are represented using BranchOnCond | ||
// recipes. | ||
if (Br->isConditional()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: seems a bit clearer to early continue?
if (Br->isConditional()) { | |
if (!Br->isConditional()) | |
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, ignore, we continue also after handling a conditional branch.
// All VPBB's reachable shallowly from HeaderVPB belong to top level loop, | ||
// because VPlan is expected to end at top level latch disconnected above. | ||
// All VPBB's reachable shallowly from HeaderVPB belong to the current region, | ||
// except the exit blocks reachable via non-latch exiting blocks, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// except the exit blocks reachable via non-latch exiting blocks, | |
// except the exit blocks reachable via non-latch exiting blocks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed thanks!
VPBlockUtils::insertBlockAfter(R, PreheaderVPBB); | ||
if (Succ) | ||
VPBlockUtils::connectBlocks(R, Succ); | ||
VPBlockUtils::insertOnEdge(PlaceHolder, Succ, R); | ||
|
||
// Remove placeholder block. | ||
VPBlockUtils::disconnectBlocks(R, PlaceHolder); | ||
VPBlockUtils::disconnectBlocks(PlaceHolder, R); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inserting R first after Preheader (clearer to instead explicitly insert R on the edge from Preheader to PlaceHolder?) and then inserting R again on the edge between PlaceHolder and Succ seems a bit confusing, thereby creating bidirectional edges between R and PlaceHolder which are then removed. It's like inserting R both before PlaceHolder and after it, but both these insertions require R to be disconnected. In essence we want to replace PlaceHolder with R.
Another alternative is to have an empty region as a placeholder rather than an empty basic block, and then move/copy R's entry and exit blocks into it:
auto *PlaceHolder = Plan.createVPRegionBlock("", false /*isReplicator*/);
followed by
PlaceHolder.setEntry(R.getEntry());
PlaceHolder.setExit(R.getExit());
essentially turning R
to be the temporary block and PlaceHolder
to be R
, constructed outside in rather than inside out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps suffice to be a bit clearer by slightly reordering (would Exit
be a better name for Succ
?):
// Have R replace PlaceHolder as successor of Preheader.
BlockUtils::insertOnEdge(PreheaderBlock, PlaceHolder, R);
BlockUtils::disconnectBlocks(R, PlaceHolder);
// Have R replace PlaceHolder as predecessor of Exit.
BlockUtils::insertOnEdge(PlaceHolder, Exit, R);
BlockUtils::disconnectBlocks(PlaceHolder, R);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually there's no need to construct a placeholder region, we can simply create an empty main region first and then set entry/exiting after adjusting the CFG. Updated, thanks!
// TODO: VPlans with early exits should be explicitly converted to a form only | ||
// exiting via the latch here, including adjusting the exit condition, instead | ||
// of simplify disconnecting the edges and adjusting the VPlan later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO: VPlans with early exits should be explicitly converted to a form only | |
// exiting via the latch here, including adjusting the exit condition, instead | |
// of simplify disconnecting the edges and adjusting the VPlan later. | |
// TODO: VPlans with early exits should be explicitly converted to a form | |
// exiting only via the latch here, including adjusting the exit condition, | |
// instead of simply disconnecting the edges and adjusting the VPlan later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thanks!
…rnalDef (NFC). Remove checking if the instruction is in the preheader or exit blocks. Those checks are redundant and handled by checking if the instruction is outside the loop below. Split off as suggested from llvm/llvm-project#137709.
Flip the region construction order to innermost first from outermost first. This ensures we only set the final parent region for VPBBs once. Split off from llvm/llvm-project#137709.
Thanks, updated now.
Now there's no need to remove/merge, so I left the TODO out of the commit message for now. It will mangle the predecessor order of the successors, at the moment not used in places where successors could have phi-like recipes I think |
Hi @fhahn, looks like there might be some genuine unit test failures on the Windows build? |
VPBlockUtils::connectBlocks(LatchVPB, MiddleVPBB); | ||
LatchVPB->swapSuccessors(); | ||
// Canonical LatchVPB has header block as last successor. If it has another | ||
// successor, the latter is an exit block - insert middle block on its edge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm a bit confused by the word "latter" here. Does "latter" refer to "another successor"? Perhaps it's easier just to say that if the latchvpb is not canonical the early exit block(s) come first, with the (canonical?) exit to the middle block being last? If I've understood the layout correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LatchVPB is canonical - as a result of calling canonicalHeaderAndLatch() above. Being a canonical latch means LatchVPB has header block as its last successor, and this property is maintained. If LatchVPB has another successor, in addition to header, this other successor (appears first and) is an exit block. In this case middle block is inserted on the edge from LatchVPB to its first exit block successor. Should "the latter" be replaced with "this other successor"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to say
If it has another successor, this successor is an exit block
if (Pred == MiddleVPBB) | ||
continue; | ||
cast<VPBasicBlock>(Pred)->getTerminator()->eraseFromParent(); | ||
VPBlockUtils::disconnectBlocks(Pred, EB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit odd at first as it seems to be undoing what you did above with VPBlockUtils::insertOnEdge(LatchVPB, LatchExitVPB, MiddleVPBB);
. I presume that's because the vector.early.exit
VPBB sits between the latch block and the original IR early exit block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The edge from MiddleVPBB to LatchExitVPB is retained here explicitly via the early-continue exclusion. The edges removed here are early-exits from non-latch Pred block to early.exit
block. Block vector.early.exit
is introduced by handleUncountableEarlyExit()
which currently takes place later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above with insertOnEdge
we only handle the edge exiting from the latch. The exit block via the latch will now be connected to the middle block.
The loop here disconnects all early exits and they will be handled later: either by requiring at least one scalar iteration, nothing more needs to be done, or introducing the early exit control flow to go to the early exit via the additional middle block. For the latter case, the VPlan is now incomplete/incorrect.
To avoid this, we should directly handle the uncountable early exits here, which is done in #138393. This way, we do not need to rely on IR references in handleUncountableEarlyExit
and the VPlan remains complete/correct throughout.
// order of blocks. Set region entry and exiting after both HeaderVPB and | ||
// LatchVPBB have been disconnected from their predecessors/successors. | ||
auto *R = Plan.createVPRegionBlock("", false /*isReplicator*/); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
VPBlockUtils::connectBlocks(LatchVPB, MiddleVPBB); | ||
LatchVPB->swapSuccessors(); | ||
// Canonical LatchVPB has header block as last successor. If it has another | ||
// successor, the latter is an exit block - insert middle block on its edge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LatchVPB is canonical - as a result of calling canonicalHeaderAndLatch() above. Being a canonical latch means LatchVPB has header block as its last successor, and this property is maintained. If LatchVPB has another successor, in addition to header, this other successor (appears first and) is an exit block. In this case middle block is inserted on the edge from LatchVPB to its first exit block successor. Should "the latter" be replaced with "this other successor"?
A few C++ unit tests still needed updating, should be done now |
Remove checking if the instruction is in the preheader or exit blocks. Those checks are redundant and handled by checking if the instruction is outside the loop below. Split off as suggested from llvm#137709.
Flip the region construction order to innermost first from outermost first. This ensures we only set the final parent region for VPBBs once. Split off from llvm#137709.
// The canonical LatchVPB has the header block as last successor. If it has | ||
// another successor, this successor is an exit block - insert middle block on | ||
// its edge. Otherwise, add middle block as another successor retaining header | ||
// as last. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, prepareForVectorization()
is called after buildPlainCFG()
(for both native and non-native paths), and with the above changes, buildPlainCFG()
connects the latch to both header and exit blocks, i.e., should LatchVPB always have two successors now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can vectorize loops where the latch isn’t exiting, by requiring at least one iteration in the scalar loop. In that case, the original latch will
Have a single successor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right, thanks. Perhaps worth a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine w/ me, adding some last final comments, thanks!
@david-arm - is this ok w/ you?
@@ -534,9 +547,6 @@ void VPlanTransforms::prepareForVectorization(VPlan &Plan, Type *InductionTy, | |||
} | |||
|
|||
// The connection order corresponds to the operands of the conditional branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"// with middle block already connected to exit block."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added thanks
@@ -302,6 +304,7 @@ compound=true | |||
" EMIT store ir\<%res\>, ir\<%arr.idx\>\l" + | |||
" EMIT ir\<%iv.next\> = add ir\<%iv\>, ir\<1\>\l" + | |||
" EMIT ir\<%exitcond\> = icmp ir\<%iv.next\>, ir\<%N\>\l" + | |||
" EMIT vp\<%3\> = not ir\<%exitcond\>\l" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should vp%3 be used, also above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, at this point we already added the canonical branch to control the loop; the old conditions will be cleaned up by DCE
// The canonical LatchVPB has the header block as last successor. If it has | ||
// another successor, this successor is an exit block - insert middle block on | ||
// its edge. Otherwise, add middle block as another successor retaining header | ||
// as last. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right, thanks. Perhaps worth a comment.
for (VPBlockBase *VPBB : vp_depth_first_shallow(HeaderVPB)) | ||
VPBB->setParent(R); | ||
if (!ExitBlocks.contains(VPBB)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm2, createLoopRegion()
is called after prepareForVectorization()
, which according to the changes below should have removed all early-exit edges, so is this check (if VPBB is an exit block) needed?
@@ -229,6 +230,7 @@ TEST_F(VPlanHCFGTest, testVPInstructionToVPRecipesInner) { | |||
EXPECT_NE(nullptr, dyn_cast<VPInstruction>(&*Iter++)); | |||
EXPECT_NE(nullptr, dyn_cast<VPInstruction>(&*Iter++)); | |||
EXPECT_NE(nullptr, dyn_cast<VPInstruction>(&*Iter++)); | |||
EXPECT_NE(nullptr, dyn_cast<VPInstruction>(&*Iter++)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous case didn't check all instructions (that Iter reached VecBB->end()), so avoid checking the new last instruction, as done here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left as is for now, just carrying the curent behavior forward as this checks that we reach VecBB->end(). Might be something to be done separately? Or checking if we have the expected canonical instructions, which may make the test more brittle.
SmallPtrSet<VPBlockBase *, 2> ExitBlocks(Plan.getExitBlocks().begin(), | ||
Plan.getExitBlocks().end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SmallPtrSet<VPBlockBase *, 2> ExitBlocks(Plan.getExitBlocks().begin(), | |
Plan.getExitBlocks().end()); |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed thanks
SmallPtrSet<VPBlockBase *, 2> ExitBlocks(Plan.getExitBlocks().begin(), | ||
Plan.getExitBlocks().end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SmallPtrSet<VPBlockBase *, 2> ExitBlocks(Plan.getExitBlocks().begin(), | |
Plan.getExitBlocks().end()); |
?
…(NFC). (#137709) Update initial VPlan construction to include exit conditions and edges. The loop region is now first constructed without entry/exiting. Those are set after inserting the region in the CFG, to preserve the original predecessor/successor order of blocks. For now, all early exits are disconnected before forming the regions, but a follow-up will update uncountable exit handling to also happen here. This is required to enable VPlan predication and remove the dependence any IR BBs (llvm/llvm-project#128420). PR: llvm/llvm-project#137709
Move early-exit handling up front to original VPlan construction, before introducing early exits. This builds on llvm#137709, which adds exiting edges to the original VPlan, instead of adding exit blocks later. This retains the exit conditions early, and means we can handle early exits before forming regions, without the reliance on VPRecipeBuilder. Once we retain all exits initially, handling early exits before region construction ensures the regions are valid; otherwise we would leave edges exiting the region from elsewhere than the latch. Removing the reliance on VPRecipeBuilder removes the dependence on mapping IR BBs to VPBBs and unblocks predication as VPlan transform: llvm#128420. Depends on llvm#137709.
Move early-exit handling up front to original VPlan construction, before introducing early exits. This builds on #137709, which adds exiting edges to the original VPlan, instead of adding exit blocks later. This retains the exit conditions early, and means we can handle early exits before forming regions, without the reliance on VPRecipeBuilder. Once we retain all exits initially, handling early exits before region construction ensures the regions are valid; otherwise we would leave edges exiting the region from elsewhere than the latch. Removing the reliance on VPRecipeBuilder removes the dependence on mapping IR BBs to VPBBs and unblocks predication as VPlan transform: #128420. Depends on #137709 (included in PR). PR: #138393
…138393) Move early-exit handling up front to original VPlan construction, before introducing early exits. This builds on llvm/llvm-project#137709, which adds exiting edges to the original VPlan, instead of adding exit blocks later. This retains the exit conditions early, and means we can handle early exits before forming regions, without the reliance on VPRecipeBuilder. Once we retain all exits initially, handling early exits before region construction ensures the regions are valid; otherwise we would leave edges exiting the region from elsewhere than the latch. Removing the reliance on VPRecipeBuilder removes the dependence on mapping IR BBs to VPBBs and unblocks predication as VPlan transform: llvm/llvm-project#128420. Depends on llvm/llvm-project#137709 (included in PR). PR: llvm/llvm-project#138393
Update initial VPlan construction to include exit conditions and edges.
The loop region is now first constructed without entry/exiting. Those are set after inserting the region in the CFG, to preserve the original predecessor/successor order of blocks.
For now, all early exits are disconnected before forming the regions, but a follow-up will update uncountable exit handling to also happen here. This is required to enable VPlan predication and remove the dependence any IR BBs (#128420).