-
Notifications
You must be signed in to change notification settings - Fork 13.9k
[BPI] Cache LoopExitBlocks to improve compile time #93451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The `LoopBlock` stored in `LoopWorkList` consist of basic block and its loop data information. When iterate `LoopWorkList`, if estimated weight of a loop is not stored in `EstimatedLoopWeight`, `getLoopExitBlocks()` is called to get all exit blocks of the loop. The estimated weight of a loop is calculated by iterating over edges leading from basic block to all exit blocks of the loop. If at least one edge has unknown estimated weight, the estimated weight of loop is unknown and will not be stored in `EstimatedLoopWeight`. `LoopWorkList` can contain different blocks in a same loop, so there is wasted work that calls `getLoopExitBlocks()` for same loop multiple times. Since computing the exit blocks of loop is expensive and the loop structure is not mutated in Branch Probability Analysis, we can cache the result and improve compile time. With this change, the overall compile time for a file containing a very large loop is dropped by around 82%.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable to me, but I'm not familiar with this analysis.
@@ -343,6 +343,9 @@ class BranchProbabilityInfo { | |||
/// Keeps mapping of a loop to estimated weight to enter the loop. | |||
SmallDenseMap<LoopData, uint32_t> EstimatedLoopWeight; | |||
|
|||
/// Keeps mapping of a Loop to its "exit" blocks. | |||
SmallDenseMap<LoopData, SmallVector<BasicBlock *, 4>> LoopExitBlocks; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be a local variable in computeEestimateBlockWeight? Just to make it clear that there are no invalidation concerns...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! done.
continue; | ||
|
||
SmallVector<BasicBlock *, 4> Exits; | ||
getLoopExitBlocks(LoopBB, Exits); | ||
if (!LoopExitBlocks.count(LD)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to use find()
here to avoid the duplicate lookup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't quite get it.
What do you think of using try_emplace()
?
auto Res = LoopExitBlocks.try_emplace(LD);
SmallVectorImpl<BasicBlock *> &Exits = Res.first->second;
if (Res.second)
getLoopExitBlocks(LoopBB, Exits);
auto LoopWeight = getMaxEstimatedEdgeWeight(
LoopBB, make_range(Exits.begin(), Exits.end()));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant using find() to avoid doing both the count() and then below another lookup using LoopExitBlocks[LD].
Your suggestion of using try_emplace also covers the insertion path, so I guess that's even better.
@llvm/pr-subscribers-llvm-analysis Author: Enna1 (Enna1) ChangesThe Since computing the exit blocks of loop is expensive and the loop structure is not mutated in Branch Probability Analysis, we can cache the result and improve compile time. With this change, the overall compile time for a file containing a very large loop is dropped by around 82%. Full diff: https://github.com/llvm/llvm-project/pull/93451.diff 2 Files Affected:
diff --git a/llvm/include/llvm/Analysis/BranchProbabilityInfo.h b/llvm/include/llvm/Analysis/BranchProbabilityInfo.h
index 91e1872e9bd6f..9713629e1bf02 100644
--- a/llvm/include/llvm/Analysis/BranchProbabilityInfo.h
+++ b/llvm/include/llvm/Analysis/BranchProbabilityInfo.h
@@ -343,6 +343,9 @@ class BranchProbabilityInfo {
/// Keeps mapping of a loop to estimated weight to enter the loop.
SmallDenseMap<LoopData, uint32_t> EstimatedLoopWeight;
+ /// Keeps mapping of a Loop to its "exit" blocks.
+ SmallDenseMap<LoopData, SmallVector<BasicBlock *, 4>> LoopExitBlocks;
+
/// Helper to construct LoopBlock for \p BB.
LoopBlock getLoopBlock(const BasicBlock *BB) const {
return LoopBlock(BB, *LI, *SccI.get());
diff --git a/llvm/lib/Analysis/BranchProbabilityInfo.cpp b/llvm/lib/Analysis/BranchProbabilityInfo.cpp
index cd3e3a4991327..4191fb64b88ec 100644
--- a/llvm/lib/Analysis/BranchProbabilityInfo.cpp
+++ b/llvm/lib/Analysis/BranchProbabilityInfo.cpp
@@ -828,12 +828,13 @@ void BranchProbabilityInfo::computeEestimateBlockWeight(
do {
while (!LoopWorkList.empty()) {
const LoopBlock LoopBB = LoopWorkList.pop_back_val();
-
- if (EstimatedLoopWeight.count(LoopBB.getLoopData()))
+ const LoopData LD = LoopBB.getLoopData();
+ if (EstimatedLoopWeight.count(LD))
continue;
- SmallVector<BasicBlock *, 4> Exits;
- getLoopExitBlocks(LoopBB, Exits);
+ if (!LoopExitBlocks.count(LD))
+ getLoopExitBlocks(LoopBB, LoopExitBlocks[LD]);
+ const SmallVectorImpl<BasicBlock *> &Exits = LoopExitBlocks[LD];
auto LoopWeight = getMaxEstimatedEdgeWeight(
LoopBB, make_range(Exits.begin(), Exits.end()));
@@ -842,7 +843,7 @@ void BranchProbabilityInfo::computeEestimateBlockWeight(
if (LoopWeight <= static_cast<uint32_t>(BlockExecWeight::UNREACHABLE))
LoopWeight = static_cast<uint32_t>(BlockExecWeight::LOWEST_NON_ZERO);
- EstimatedLoopWeight.insert({LoopBB.getLoopData(), *LoopWeight});
+ EstimatedLoopWeight.insert({LD, *LoopWeight});
// Add all blocks entering the loop into working list.
getLoopEnterBlocks(LoopBB, BlockWorkList);
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. cc @apolloww
@@ -828,12 +829,14 @@ void BranchProbabilityInfo::computeEestimateBlockWeight( | |||
do { | |||
while (!LoopWorkList.empty()) { | |||
const LoopBlock LoopBB = LoopWorkList.pop_back_val(); | |||
|
|||
if (EstimatedLoopWeight.count(LoopBB.getLoopData())) | |||
const LoopData LD = LoopBB.getLoopData(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a const-reference?
The
LoopBlock
stored inLoopWorkList
consist of basic block and its loop data information. When iterateLoopWorkList
, if estimated weight of a loop is not stored inEstimatedLoopWeight
,getLoopExitBlocks()
is called to get all exit blocks of the loop. The estimated weight of a loop is calculated by iterating over edges leading from basic block to all exit blocks of the loop. If at least one edge has unknown estimated weight, the estimated weight of loop is unknown and will not be stored inEstimatedLoopWeight
.LoopWorkList
can contain different blocks in a same loop, so there is wasted work that callsgetLoopExitBlocks()
for same loop multiple times.Since computing the exit blocks of loop is expensive and the loop structure is not mutated in Branch Probability Analysis, we can cache the result and improve compile time.
With this change, the overall compile time for a file containing a very large loop is dropped by around 82%.