-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VPlan] Hook IR blocks into VPlan during skeleton creation (NFC) #114292
Changes from 14 commits
1b89761
4e5c743
b87cf14
4ddc87a
599e690
1a77b55
a2137b4
b4d0eac
2b4c71f
be2c3a6
466b393
10a675e
7996451
5ea6f7b
64909aa
382380a
333536a
e5b8af3
df6894a
927a66d
5468f61
3eef601
1d6db4a
22eeebe
886bcc2
65ac2d7
2f7d530
ba08c2e
4c76bec
43c9186
7f08758
3709f17
a72df24
16a6246
87f2815
b7b43a8
906603f
1e7cac7
4a51d29
f53cf1b
a0af583
968598b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -2429,6 +2429,30 @@ InnerLoopVectorizer::getOrCreateVectorTripCount(BasicBlock *InsertBlock) { | |||||||||||
return VectorTripCount; | ||||||||||||
} | ||||||||||||
|
||||||||||||
/// Helper to connect both the vector and scalar preheaders to the Plan's | ||||||||||||
/// entry. This is used when adjusting \p Plan during skeleton | ||||||||||||
/// creation, i.e. adjusting the plan after introducing an initial runtime | ||||||||||||
/// check. | ||||||||||||
static void connectScalarPreheaderInVPlan(VPlan &Plan) { | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Placed here rather than in VPlan.h/VPBlockUtils expecting it to be a temporary local scaffold? Worth documenting? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, also added a comment, thanks! |
||||||||||||
VPBlockBase *VectorPH = Plan.getVectorPreheader(); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Dropped, thanks! |
||||||||||||
VPBlockBase *ScalarPH = Plan.getScalarPreheader(); | ||||||||||||
VPBlockBase *PredVPB = Plan.getEntry(); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, thanks |
||||||||||||
VPBlockUtils::connectBlocks(PredVPB, ScalarPH, -1, 0); | ||||||||||||
VPBlockUtils::connectBlocks(PredVPB, VectorPH, 0, -1); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. VPlan starts with createInitialVPlan() producing a disconnected Entry and a predecessor-less vector_PH --> vector_region (with Header --> Latch) --> middle_block --> scalar_PH, plus an optional middle-block --> exit. Would it be better to connect Entry-->vector_PH at the outset (ah, disregard - this indeed takes place below...), and here connect Entry-->scalar_PH followed by Furthermore, TCCheckBlock could then be inserted on this Entry-->vector_PH edge, so that connectScalarPreheaderInVPlan() assimilates introduceCheckBlockInVPlan(). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated to connect + swap, thanks.
Not sure how exactly, left as is for now. |
||||||||||||
} | ||||||||||||
|
||||||||||||
/// Introduces a new VPIRBasicBlock for \p CheckIRBB to \p Plan between the | ||||||||||||
/// vector preheader and its predecessor, also connecting to the scalar | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, thanks |
||||||||||||
/// preheader. | ||||||||||||
static void introduceCheckBlockInVPlan(VPlan &Plan, BasicBlock *CheckIRBB) { | ||||||||||||
VPBlockBase *ScalarPH = Plan.getScalarPreheader(); | ||||||||||||
VPBlockBase *VectorPH = Plan.getVectorPreheader(); | ||||||||||||
VPBlockBase *PreVectorPH = VectorPH->getSinglePredecessor(); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ScalarPH is expected to be the other successor of PreVectorPH, so could be asserted (or another way to retrieve ScalarPH), although more general w/o this assert? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added an assert for now, than |
||||||||||||
VPIRBasicBlock *CheckVPIRBB = VPIRBasicBlock::fromBasicBlock(CheckIRBB); | ||||||||||||
VPBlockUtils::connectBlocks(CheckVPIRBB, ScalarPH); | ||||||||||||
VPBlockUtils::insertOnEdge(PreVectorPH, VectorPH, CheckVPIRBB); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
or first insert on edge, so that check block has vector loop preheader as its only successor, similar to the initial Entry block, and then connect it also to scalar loop preheader, followed by swapping the successors, aligning with the above. |
||||||||||||
} | ||||||||||||
|
||||||||||||
void InnerLoopVectorizer::emitIterationCountCheck(BasicBlock *Bypass) { | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (Independent): LoopScalarPreHeader is passed as parameter |
||||||||||||
Value *Count = getTripCount(); | ||||||||||||
// Reuse existing vector loop preheader for TC checks. | ||||||||||||
|
@@ -2513,14 +2537,15 @@ void InnerLoopVectorizer::emitIterationCountCheck(BasicBlock *Bypass) { | |||||||||||
DT->getNode(Bypass)->getIDom()) && | ||||||||||||
"TC check is expected to dominate Bypass"); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This assert was probably added along with the lines below, but should remain, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, this is to check the DT updates from SplitBlock I think |
||||||||||||
|
||||||||||||
// Update dominator for Bypass & LoopExit (if needed). | ||||||||||||
DT->changeImmediateDominator(Bypass, TCCheckBlock); | ||||||||||||
BranchInst &BI = | ||||||||||||
*BranchInst::Create(Bypass, LoopVectorPreHeader, CheckMinIters); | ||||||||||||
if (hasBranchWeightMD(*OrigLoop->getLoopLatch()->getTerminator())) | ||||||||||||
setBranchWeights(BI, MinItersBypassWeights, /*IsExpected=*/false); | ||||||||||||
ReplaceInstWithInst(TCCheckBlock->getTerminator(), &BI); | ||||||||||||
LoopBypassBlocks.push_back(TCCheckBlock); | ||||||||||||
|
||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Scalar preheader can be connected in VPlan from the outset, rather than here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are a few places that assume the scalar PH has a single predecessor, which would need to be updated. Could pull this in here or adjust as follow-up? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The single predecessor of Scalar PH in VPlan being middle_block, right? Applying this additional connection earlier, is fine as a later follow-up, perhaps w/ a TODO. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks, renamed and added TODO to connectScalarAsBypassLoopInVPlan |
||||||||||||
// TODO: Wrap LoopVectorPreHeader in VPIRBasicBlock here. | ||||||||||||
connectScalarPreheaderInVPlan(Plan); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Worth wrapping LoopVectorPreHeader in a VPIRBB via connectCheckBlockInVPlan() as well, perhaps as a TODO? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added a TODO for now, thanks! |
||||||||||||
} | ||||||||||||
|
||||||||||||
BasicBlock *InnerLoopVectorizer::emitSCEVChecks(BasicBlock *Bypass) { | ||||||||||||
|
@@ -2537,6 +2562,8 @@ BasicBlock *InnerLoopVectorizer::emitSCEVChecks(BasicBlock *Bypass) { | |||||||||||
"Should already be a bypass block due to iteration count check"); | ||||||||||||
LoopBypassBlocks.push_back(SCEVCheckBlock); | ||||||||||||
AddedSafetyChecks = true; | ||||||||||||
|
||||||||||||
introduceCheckBlockInVPlan(Plan, SCEVCheckBlock); | ||||||||||||
return SCEVCheckBlock; | ||||||||||||
} | ||||||||||||
|
||||||||||||
|
@@ -2573,6 +2600,7 @@ BasicBlock *InnerLoopVectorizer::emitMemRuntimeChecks(BasicBlock *Bypass) { | |||||||||||
|
||||||||||||
AddedSafetyChecks = true; | ||||||||||||
|
||||||||||||
introduceCheckBlockInVPlan(Plan, MemCheckBlock); | ||||||||||||
return MemCheckBlock; | ||||||||||||
} | ||||||||||||
|
||||||||||||
|
@@ -7643,20 +7671,15 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan( | |||||||||||
OrigLoop->getHeader()->getContext()); | ||||||||||||
VPlanTransforms::optimizeForVFAndUF(BestVPlan, BestVF, BestUF, PSE); | ||||||||||||
|
||||||||||||
LLVM_DEBUG(dbgs() << "Executing best plan with VF=" << BestVF | ||||||||||||
<< ", UF=" << BestUF << '\n'); | ||||||||||||
BestVPlan.setName("Final VPlan"); | ||||||||||||
LLVM_DEBUG(BestVPlan.dump()); | ||||||||||||
|
||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Moved later to VPlan::execute(), after is replaces VPBBtoVPIRBB. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, can also split off once we are happy |
||||||||||||
// Perform the actual loop transformation. | ||||||||||||
VPTransformState State(BestVF, BestUF, LI, DT, ILV.Builder, &ILV, &BestVPlan); | ||||||||||||
|
||||||||||||
// 0. Generate SCEV-dependent code into the preheader, including TripCount, | ||||||||||||
// before making any changes to the CFG. | ||||||||||||
if (!BestVPlan.getPreheader()->empty()) { | ||||||||||||
if (!BestVPlan.getEntry()->empty()) { | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note: worth renaming getPreheader() <--> getEntry() separately (?) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Undone for now, thanks |
||||||||||||
State.CFG.PrevBB = OrigLoop->getLoopPreheader(); | ||||||||||||
State.Builder.SetInsertPoint(OrigLoop->getLoopPreheader()->getTerminator()); | ||||||||||||
BestVPlan.getPreheader()->execute(&State); | ||||||||||||
BestVPlan.getEntry()->execute(&State); | ||||||||||||
} | ||||||||||||
if (!ILV.getTripCount()) | ||||||||||||
ILV.setTripCount(State.get(BestVPlan.getTripCount(), VPLane(0))); | ||||||||||||
|
@@ -7864,8 +7887,6 @@ EpilogueVectorizerMainLoop::emitIterationCountCheck(BasicBlock *Bypass, | |||||||||||
DT->getNode(Bypass)->getIDom()) && | ||||||||||||
"TC check is expected to dominate Bypass"); | ||||||||||||
|
||||||||||||
// Update dominator for Bypass. | ||||||||||||
DT->changeImmediateDominator(Bypass, TCCheckBlock); | ||||||||||||
LoopBypassBlocks.push_back(TCCheckBlock); | ||||||||||||
|
||||||||||||
// Save the trip count so we don't have to regenerate it in the | ||||||||||||
|
@@ -7880,6 +7901,12 @@ EpilogueVectorizerMainLoop::emitIterationCountCheck(BasicBlock *Bypass, | |||||||||||
setBranchWeights(BI, MinItersBypassWeights, /*IsExpected=*/false); | ||||||||||||
ReplaceInstWithInst(TCCheckBlock->getTerminator(), &BI); | ||||||||||||
|
||||||||||||
VPBlockBase *VectorPH = Plan.getVectorPreheader(); | ||||||||||||
VPBlockBase *PredVPB = VectorPH->getSinglePredecessor(); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated, thanks! |
||||||||||||
if (PredVPB->getNumSuccessors() == 1) | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this distinction between having a single successor (VectorPH) and two (VectorPH and ... ScalarPH?) correspond to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes ForEpilogue can be checked instead, updated and added a comment, thanks! |
||||||||||||
connectScalarPreheaderInVPlan(Plan); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Worth connecting scalar preheader from the outset rather than here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are a few places that assume the scalar PH has a single predecessor, which would need to be updated. Could pull this in here or adjust as follow-up? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Follow-up would be fine. |
||||||||||||
else | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Worth introducing TCCheckBlock also when it appears first, rather than "folding" it into Entry? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. as in having the entry connect directly to TCCheckBlock? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's somewhat confusing to wrap TCCheckBlock only below under "else" but not above under "if", given that it is created in both cases. But this seems similar to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated the check as suggested above and added a comment |
||||||||||||
introduceCheckBlockInVPlan(Plan, TCCheckBlock); | ||||||||||||
return TCCheckBlock; | ||||||||||||
} | ||||||||||||
|
||||||||||||
|
@@ -7910,9 +7937,6 @@ EpilogueVectorizerEpilogueLoop::createEpilogueVectorizedLoopSkeleton( | |||||||||||
EPI.MainLoopIterationCountCheck->getTerminator()->replaceUsesOfWith( | ||||||||||||
VecEpilogueIterationCountCheck, LoopVectorPreHeader); | ||||||||||||
|
||||||||||||
DT->changeImmediateDominator(LoopVectorPreHeader, | ||||||||||||
EPI.MainLoopIterationCountCheck); | ||||||||||||
|
||||||||||||
EPI.EpilogueIterationCountCheck->getTerminator()->replaceUsesOfWith( | ||||||||||||
VecEpilogueIterationCountCheck, LoopScalarPreHeader); | ||||||||||||
|
||||||||||||
|
@@ -7923,19 +7947,8 @@ EpilogueVectorizerEpilogueLoop::createEpilogueVectorizedLoopSkeleton( | |||||||||||
EPI.MemSafetyCheck->getTerminator()->replaceUsesOfWith( | ||||||||||||
VecEpilogueIterationCountCheck, LoopScalarPreHeader); | ||||||||||||
|
||||||||||||
DT->changeImmediateDominator( | ||||||||||||
VecEpilogueIterationCountCheck, | ||||||||||||
VecEpilogueIterationCountCheck->getSinglePredecessor()); | ||||||||||||
|
||||||||||||
DT->changeImmediateDominator(LoopScalarPreHeader, | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This one still needed? (Review to be continued from here) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep this one is still needed for now. |
||||||||||||
EPI.EpilogueIterationCountCheck); | ||||||||||||
if (!Cost->requiresScalarEpilogue(EPI.EpilogueVF.isVector())) | ||||||||||||
// If there is an epilogue which must run, there's no edge from the | ||||||||||||
// middle block to exit blocks and thus no need to update the immediate | ||||||||||||
// dominator of the exit blocks. | ||||||||||||
DT->changeImmediateDominator(OrigLoop->getUniqueLatchExitBlock(), | ||||||||||||
EPI.EpilogueIterationCountCheck); | ||||||||||||
|
||||||||||||
// Keep track of bypass blocks, as they feed start values to the induction and | ||||||||||||
// reduction phis in the scalar loop preheader. | ||||||||||||
if (EPI.SCEVSafetyCheck) | ||||||||||||
|
@@ -8038,6 +8051,20 @@ EpilogueVectorizerEpilogueLoop::emitMinimumVectorEpilogueIterCountCheck( | |||||||||||
} | ||||||||||||
ReplaceInstWithInst(Insert->getTerminator(), &BI); | ||||||||||||
LoopBypassBlocks.push_back(Insert); | ||||||||||||
|
||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Entry is conceptually an immutable VPIRBB that wraps the original scalar preheader, where SCEVs can be safely expanded, and all runtime checks are added as successors starting with minimal trip count check that is added as its terminal. Does this change when executing the VPlan of an epilog loop, whose Entry is replicated and transformed from old to new? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think it changes conceptually, but here we need to wrap the new entry to be the new node created here, after executing the plan. (There is existing logic to re-use expanded SCEVs from the first execution of the main plan, to avoid duplicated expansions) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. All VPlans are initialized with the original scalar preheader as their Entry, but this works only when vectorizing the main loop (w/ or w/o vectorizing the epilog). When vectorizing the epilog loop, Entry no longer serves to host SCEV-expand recipes but should instead refer to the block between Middle and "Vector Epilog PreHeader" (called "Vector Epilogue Trip Count Check", but that's also how the first block is named), as depicted in https://llvm.org/docs/Vectorizers.html#epilogue-vectorization. The documentation of Entry in createInitialVPlan() deserves update? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added a comment, thanks! |
||||||||||||
// A new entry block has been created for the epilogue VPlan. Hook it in. | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could this be done using replaceVPBBWithIRVPBB()? Better merge NewEntry with OldEntry here, rather than insert NewEntry after OldEntry and let subsequent merging of VPBB pairs merge them, because the latter avoids handling VPIRBB's? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I've simplified the code here and there should be nothing to merge at this point, I dropped the loop to remove the loop with VPIRInstructions. It may be possible to use |
||||||||||||
VPIRBasicBlock *NewEntry = VPIRBasicBlock::fromBasicBlock(Insert); | ||||||||||||
VPBasicBlock *OldEntry = Plan.getEntry(); | ||||||||||||
VPBlockUtils::reassociateBlocks(OldEntry, NewEntry); | ||||||||||||
Plan.setEntry(NewEntry); | ||||||||||||
for (auto &R : make_early_inc_range(*NewEntry)) { | ||||||||||||
auto *VPIR = dyn_cast<VPIRInstruction>(&R); | ||||||||||||
if (!VPIR || !isa<PHINode>(VPIR->getInstruction())) | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Surely VPIR isn't null, having just created NewEntry fromBasicBlock Insert. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Code removed, thanks! |
||||||||||||
break; | ||||||||||||
VPIR->eraseFromParent(); | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Erasing all of Insert's phi's from NewEntry? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It should be empty, code remove, thanks! |
||||||||||||
} | ||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should OldEntry be deleted? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated to remove it, thanks! |
||||||||||||
|
||||||||||||
connectScalarPreheaderInVPlan(Plan); | ||||||||||||
return Insert; | ||||||||||||
} | ||||||||||||
|
||||||||||||
|
@@ -10259,7 +10286,7 @@ bool LoopVectorizePass::processLoop(Loop *L) { | |||||||||||
// should be removed once induction resume value creation is done | ||||||||||||
// directly in VPlan. | ||||||||||||
EpilogILV.setTripCount(MainILV.getTripCount()); | ||||||||||||
for (auto &R : make_early_inc_range(*BestEpiPlan.getPreheader())) { | ||||||||||||
for (auto &R : make_early_inc_range(*BestEpiPlan.getEntry())) { | ||||||||||||
auto *ExpandR = dyn_cast<VPExpandSCEVRecipe>(&R); | ||||||||||||
if (!ExpandR) | ||||||||||||
continue; | ||||||||||||
|
@@ -10319,8 +10346,6 @@ bool LoopVectorizePass::processLoop(Loop *L) { | |||||||||||
cast<VPHeaderPHIRecipe>(&R)->setStartValue(StartVal); | ||||||||||||
} | ||||||||||||
|
||||||||||||
assert(DT->verify(DominatorTree::VerificationLevel::Fast) && | ||||||||||||
"DT not preserved correctly"); | ||||||||||||
LVP.executePlan(EPI.EpilogueVF, EPI.EpilogueUF, BestEpiPlan, EpilogILV, | ||||||||||||
DT, true, &ExpandedSCEVs); | ||||||||||||
++LoopsEpilogueVectorized; | ||||||||||||
|
@@ -10348,6 +10373,9 @@ bool LoopVectorizePass::processLoop(Loop *L) { | |||||||||||
checkMixedPrecision(L, ORE); | ||||||||||||
} | ||||||||||||
|
||||||||||||
assert(DT->verify(DominatorTree::VerificationLevel::Fast) && | ||||||||||||
"DT not preserved correctly"); | ||||||||||||
|
||||||||||||
std::optional<MDNode *> RemainderLoopID = | ||||||||||||
makeFollowupLoopID(OrigLoopID, {LLVMLoopVectorizeFollowupAll, | ||||||||||||
LLVMLoopVectorizeFollowupEpilogue}); | ||||||||||||
|
Original file line number | Diff line number | Diff line change | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -170,9 +170,8 @@ VPBasicBlock *VPBlockBase::getEntryBasicBlock() { | |||||||||||||
} | ||||||||||||||
|
||||||||||||||
void VPBlockBase::setPlan(VPlan *ParentPlan) { | ||||||||||||||
assert( | ||||||||||||||
(ParentPlan->getEntry() == this || ParentPlan->getPreheader() == this) && | ||||||||||||||
"Can only set plan on its entry or preheader block."); | ||||||||||||||
assert(ParentPlan->getEntry() == this && | ||||||||||||||
"Can only set plan on its entry or preheader block."); | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated, thanks! |
||||||||||||||
Plan = ParentPlan; | ||||||||||||||
} | ||||||||||||||
|
||||||||||||||
|
@@ -840,9 +839,6 @@ VPlan::~VPlan() { | |||||||||||||
Block->dropAllReferences(&DummyValue); | ||||||||||||||
|
||||||||||||||
VPBlockBase::deleteCFG(Entry); | ||||||||||||||
|
||||||||||||||
Preheader->dropAllReferences(&DummyValue); | ||||||||||||||
delete Preheader; | ||||||||||||||
} | ||||||||||||||
for (VPValue *VPV : VPLiveInsToFree) | ||||||||||||||
delete VPV; | ||||||||||||||
|
@@ -865,9 +861,10 @@ VPlanPtr VPlan::createInitialVPlan(Type *InductionTy, | |||||||||||||
VPIRBasicBlock *Entry = | ||||||||||||||
VPIRBasicBlock::fromBasicBlock(TheLoop->getLoopPreheader()); | ||||||||||||||
VPBasicBlock *VecPreheader = new VPBasicBlock("vector.ph"); | ||||||||||||||
VPBlockUtils::connectBlocks(Entry, VecPreheader); | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Worth noting that this connection is subject to insertion of runtime checks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added a comment, thanks! |
||||||||||||||
VPIRBasicBlock *ScalarHeader = | ||||||||||||||
VPIRBasicBlock::fromBasicBlock(TheLoop->getHeader()); | ||||||||||||||
auto Plan = std::make_unique<VPlan>(Entry, VecPreheader, ScalarHeader); | ||||||||||||||
auto Plan = std::make_unique<VPlan>(Entry, ScalarHeader); | ||||||||||||||
|
||||||||||||||
// Create SCEV and VPValue for the trip count. | ||||||||||||||
|
||||||||||||||
|
@@ -1010,8 +1007,9 @@ void VPlan::execute(VPTransformState *State) { | |||||||||||||
BasicBlock *VectorPreHeader = State->CFG.PrevBB; | ||||||||||||||
State->Builder.SetInsertPoint(VectorPreHeader->getTerminator()); | ||||||||||||||
|
||||||||||||||
// Disconnect VectorPreHeader from ExitBB in both the CFG and DT. | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Need to disconnect in DT only? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Retained both, as the DT update is required, (otherwise the sanity checks during DT updates will trigger) |
||||||||||||||
cast<BranchInst>(VectorPreHeader->getTerminator())->setSuccessor(0, nullptr); | ||||||||||||||
replaceVPBBWithIRVPBB( | ||||||||||||||
cast<VPBasicBlock>(getVectorLoopRegion()->getSinglePredecessor()), | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated thanks! |
||||||||||||||
VectorPreHeader); | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Better placed next to the two replaceVPBBWithIRVPBB() below, and update the comment? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Moved, thanks! |
||||||||||||||
State->CFG.DTU.applyUpdates( | ||||||||||||||
{{DominatorTree::Delete, VectorPreHeader, State->CFG.ExitBB}}); | ||||||||||||||
|
||||||||||||||
|
@@ -1025,6 +1023,11 @@ void VPlan::execute(VPTransformState *State) { | |||||||||||||
replaceVPBBWithIRVPBB(getScalarPreheader(), ScalarPh); | ||||||||||||||
replaceVPBBWithIRVPBB(MiddleVPBB, MiddleBB); | ||||||||||||||
|
||||||||||||||
LLVM_DEBUG(dbgs() << "Executing best plan with VF=" <<State->VF | ||||||||||||||
<< ", UF=" << getUF()<< '\n'); | ||||||||||||||
setName("Final VPlan"); | ||||||||||||||
LLVM_DEBUG(dump()); | ||||||||||||||
|
||||||||||||||
// Disconnect the middle block from its single successor (the scalar loop | ||||||||||||||
// header) in both the CFG and DT. The branch will be recreated during VPlan | ||||||||||||||
// execution. | ||||||||||||||
|
@@ -1038,8 +1041,10 @@ void VPlan::execute(VPTransformState *State) { | |||||||||||||
State->CFG.DTU.applyUpdates( | ||||||||||||||
{{DominatorTree::Delete, ScalarPh, ScalarPh->getSingleSuccessor()}}); | ||||||||||||||
|
||||||||||||||
ReversePostOrderTraversal<VPBlockShallowTraversalWrapper<VPBlockBase *>> RPOT( | ||||||||||||||
Entry); | ||||||||||||||
// Generate code in the loop pre-header and body. | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Comment should be updated - code for runtime guards has already been generated (is simply skipped here), but code is generated also after the body (middle, scalar pre-header). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated, thanks! |
||||||||||||||
for (VPBlockBase *Block : vp_depth_first_shallow(Entry)) | ||||||||||||||
for (VPBlockBase *Block : make_range(RPOT.begin(), RPOT.end())) | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, earlier versions skipped the pre-header but that is not needed in the latest version, simplified, thanks |
||||||||||||||
Block->execute(State); | ||||||||||||||
|
||||||||||||||
VPBasicBlock *LatchVPBB = getVectorLoopRegion()->getExitingBasicBlock(); | ||||||||||||||
|
@@ -1090,9 +1095,6 @@ void VPlan::execute(VPTransformState *State) { | |||||||||||||
} | ||||||||||||||
|
||||||||||||||
State->CFG.DTU.flush(); | ||||||||||||||
assert(State->CFG.DTU.getDomTree().verify( | ||||||||||||||
DominatorTree::VerificationLevel::Fast) && | ||||||||||||||
"DT not preserved correctly"); | ||||||||||||||
} | ||||||||||||||
|
||||||||||||||
InstructionCost VPlan::cost(ElementCount VF, VPCostContext &Ctx) { | ||||||||||||||
|
@@ -1145,12 +1147,10 @@ void VPlan::print(raw_ostream &O) const { | |||||||||||||
|
||||||||||||||
printLiveIns(O); | ||||||||||||||
|
||||||||||||||
if (!getPreheader()->empty()) { | ||||||||||||||
O << "\n"; | ||||||||||||||
getPreheader()->print(O, "", SlotTracker); | ||||||||||||||
} | ||||||||||||||
ReversePostOrderTraversal<VPBlockShallowTraversalWrapper<const VPBlockBase *>> | ||||||||||||||
RPOT(getEntry()); | ||||||||||||||
|
||||||||||||||
for (const VPBlockBase *Block : vp_depth_first_shallow(getEntry())) { | ||||||||||||||
for (const VPBlockBase *Block : RPOT) { | ||||||||||||||
O << '\n'; | ||||||||||||||
Block->print(O, "", SlotTracker); | ||||||||||||||
} | ||||||||||||||
|
@@ -1181,6 +1181,20 @@ std::string VPlan::getName() const { | |||||||||||||
return Out; | ||||||||||||||
} | ||||||||||||||
|
||||||||||||||
VPRegionBlock *VPlan::getVectorLoopRegion() { | ||||||||||||||
for (VPBlockBase *B : vp_depth_first_shallow(getEntry())) | ||||||||||||||
if (auto *R = dyn_cast<VPRegionBlock>(B)) | ||||||||||||||
return R; | ||||||||||||||
return nullptr; | ||||||||||||||
Comment on lines
+1172
to
+1175
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Better cache the result? Can be done as follow-up. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Leave a TODO? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added thanks! |
||||||||||||||
} | ||||||||||||||
|
||||||||||||||
const VPRegionBlock *VPlan::getVectorLoopRegion() const { | ||||||||||||||
for (const VPBlockBase *B : vp_depth_first_shallow(getEntry())) | ||||||||||||||
if (auto *R = dyn_cast<VPRegionBlock>(B)) | ||||||||||||||
return R; | ||||||||||||||
return nullptr; | ||||||||||||||
} | ||||||||||||||
|
||||||||||||||
LLVM_DUMP_METHOD | ||||||||||||||
void VPlan::printDOT(raw_ostream &O) const { | ||||||||||||||
VPlanPrinter Printer(O, *this); | ||||||||||||||
|
@@ -1231,7 +1245,6 @@ static void remapOperands(VPBlockBase *Entry, VPBlockBase *NewEntry, | |||||||||||||
|
||||||||||||||
VPlan *VPlan::duplicate() { | ||||||||||||||
// Clone blocks. | ||||||||||||||
VPBasicBlock *NewPreheader = Preheader->clone(); | ||||||||||||||
const auto &[NewEntry, __] = cloneFrom(Entry); | ||||||||||||||
|
||||||||||||||
BasicBlock *ScalarHeaderIRBB = getScalarHeader()->getIRBasicBlock(); | ||||||||||||||
|
@@ -1241,8 +1254,7 @@ VPlan *VPlan::duplicate() { | |||||||||||||
return VPIRBB && VPIRBB->getIRBasicBlock() == ScalarHeaderIRBB; | ||||||||||||||
})); | ||||||||||||||
// Create VPlan, clone live-ins and remap operands in the cloned blocks. | ||||||||||||||
auto *NewPlan = | ||||||||||||||
new VPlan(NewPreheader, cast<VPBasicBlock>(NewEntry), NewScalarHeader); | ||||||||||||||
auto *NewPlan = new VPlan(cast<VPBasicBlock>(NewEntry), NewScalarHeader); | ||||||||||||||
DenseMap<VPValue *, VPValue *> Old2NewVPValues; | ||||||||||||||
for (VPValue *OldLiveIn : VPLiveInsToFree) { | ||||||||||||||
Old2NewVPValues[OldLiveIn] = | ||||||||||||||
|
@@ -1262,7 +1274,6 @@ VPlan *VPlan::duplicate() { | |||||||||||||
// else NewTripCount will be created and inserted into Old2NewVPValues when | ||||||||||||||
// TripCount is cloned. In any case NewPlan->TripCount is updated below. | ||||||||||||||
|
||||||||||||||
remapOperands(Preheader, NewPreheader, Old2NewVPValues); | ||||||||||||||
remapOperands(Entry, NewEntry, Old2NewVPValues); | ||||||||||||||
|
||||||||||||||
// Initialize remaining fields of cloned VPlan. | ||||||||||||||
|
@@ -1276,6 +1287,19 @@ VPlan *VPlan::duplicate() { | |||||||||||||
return NewPlan; | ||||||||||||||
} | ||||||||||||||
|
||||||||||||||
VPBasicBlock *VPlan::getScalarPreheader() { | ||||||||||||||
auto *MiddleVPBB = | ||||||||||||||
cast<VPBasicBlock>(getVectorLoopRegion()->getSingleSuccessor()); | ||||||||||||||
if (MiddleVPBB->getNumSuccessors() == 2) { | ||||||||||||||
// Order is strict: first is the exit block, second is the scalar preheader. | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Simpler to return last successor, provided it ain't a VPIRBB? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Simplified, thanks |
||||||||||||||
return cast<VPBasicBlock>(MiddleVPBB->getSuccessors()[1]); | ||||||||||||||
} | ||||||||||||||
if (auto *IRVPBB = dyn_cast<VPBasicBlock>(MiddleVPBB->getSingleSuccessor())) | ||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. dyn_casting to VPBB but calling it
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated code as per comment above, thanks |
||||||||||||||
return IRVPBB; | ||||||||||||||
|
||||||||||||||
return nullptr; | ||||||||||||||
} | ||||||||||||||
|
||||||||||||||
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP) | ||||||||||||||
|
||||||||||||||
Twine VPlanPrinter::getUID(const VPBlockBase *Block) { | ||||||||||||||
|
@@ -1314,8 +1338,6 @@ void VPlanPrinter::dump() { | |||||||||||||
OS << "edge [fontname=Courier, fontsize=30]\n"; | ||||||||||||||
OS << "compound=true\n"; | ||||||||||||||
|
||||||||||||||
dumpBlock(Plan.getPreheader()); | ||||||||||||||
|
||||||||||||||
for (const VPBlockBase *Block : vp_depth_first_shallow(Plan.getEntry())) | ||||||||||||||
dumpBlock(Block); | ||||||||||||||
|
||||||||||||||
|
@@ -1576,7 +1598,6 @@ void VPSlotTracker::assignNames(const VPlan &Plan) { | |||||||||||||
assignName(Plan.BackedgeTakenCount); | ||||||||||||||
for (VPValue *LI : Plan.VPLiveInsToFree) | ||||||||||||||
assignName(LI); | ||||||||||||||
assignNames(Plan.getPreheader()); | ||||||||||||||
|
||||||||||||||
ReversePostOrderTraversal<VPBlockDeepTraversalWrapper<const VPBlockBase *>> | ||||||||||||||
RPOT(VPBlockDeepTraversalWrapper<const VPBlockBase *>(Plan.getEntry())); | ||||||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, thanks!