[LoopSink] Exit loop finding BBs to sink into early when possible (NFC) (#101115)

As noted in the comments, findBBsToSinkInto is
   O(UseBBs.size() * ColdLoopBBs.size())

A very large function with a huge loop was incurring a high compile time
in this code. The size of the ColdLoopBBs set was over 14K. There is a
limit on the size of the UseBBs set, but not the ColdLoopBBs (and adding
a limit for the latter actually slowed down some later passes).

This change exits the loop early once we detect that there is no further
refinement possible for the BBsToSinkInto set. This is possible because
the ColdLoopBBs set is sorted in ascending magnitude of frequency.

This cut down the LoopSinkPass time by around 33% (78s to just over
50s).
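The early exit described above can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions, not the LLVM implementation: blocks are plain integer ids, dominance is given as an explicit set per cold block, and `sumFreq` is an unadjusted sum standing in for `adjustedSumFreq`. The names `ColdBlock`, `SinkResult`, `sumFreq`, and `findBlocksToSinkInto` are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for a cold loop block (not an LLVM type): its id and
// the ids of the blocks it dominates.
struct ColdBlock {
  int Id;
  std::set<int> Dominates;
};

struct SinkResult {
  std::set<int> BBsToSinkInto;
  std::size_t Visited; // number of cold blocks examined before stopping
};

// Assumption: a plain sum of frequencies, indexed by block id.
std::uint64_t sumFreq(const std::set<int> &BBs,
                      const std::vector<std::uint64_t> &Freq) {
  std::uint64_t T = 0;
  for (int B : BBs)
    T += Freq[B];
  return T;
}

SinkResult findBlocksToSinkInto(const std::set<int> &UseBBs,
                                const std::vector<ColdBlock> &ColdSorted,
                                const std::vector<std::uint64_t> &Freq) {
  // Precondition for the early exit: cold blocks sorted by ascending frequency.
  assert(std::is_sorted(ColdSorted.begin(), ColdSorted.end(),
                        [&](const ColdBlock &A, const ColdBlock &B) {
                          return Freq[A.Id] < Freq[B.Id];
                        }));
  SinkResult R{UseBBs, 0};
  for (const ColdBlock &Cold : ColdSorted) {
    ++R.Visited;
    // Collect the current candidates that this cold block dominates.
    std::set<int> Dominated;
    for (int B : R.BBsToSinkInto)
      if (Cold.Dominates.count(B))
        Dominated.insert(B);
    if (Dominated.empty())
      continue;
    // Sinking into the cold block is cheaper than into the blocks it dominates.
    if (sumFreq(Dominated, Freq) > Freq[Cold.Id]) {
      for (int B : Dominated)
        R.BBsToSinkInto.erase(B);
      R.BBsToSinkInto.insert(Cold.Id);
      continue;
    }
    // Early exit: later cold blocks are at least this frequent, and no subset
    // of the candidate set can sum to more than the whole set, so the check
    // above cannot succeed again.
    if (sumFreq(R.BBsToSinkInto, Freq) <= Freq[Cold.Id])
      break;
  }
  return R;
}
```

For example, with frequencies `{10, 10, 5, 8, 9, 20}` for blocks 0 through 5, use blocks `{0, 1}`, and cold blocks 2 (dominating 0 and 1), 3 (dominating nothing), 4 and 5 (each dominating 2), the loop replaces `{0, 1}` with `{2}` on the first cold block and stops at block 4 without ever visiting block 5.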
teresajohnson authored Jul 30, 2024
1 parent 9843843 commit 245e607
Showing 1 changed file with 16 additions and 0 deletions.
16 changes: 16 additions & 0 deletions llvm/lib/Transforms/Scalar/LoopSink.cpp
@@ -144,7 +144,23 @@ findBBsToSinkInto(const Loop &L, const SmallPtrSetImpl<BasicBlock *> &UseBBs,
        BBsToSinkInto.erase(DominatedBB);
      }
      BBsToSinkInto.insert(ColdestBB);
      continue;
    }
    // Otherwise, see if we can stop the search through the cold BBs early.
    // Since the ColdLoopBBs list is sorted in increasing magnitude of
    // frequency the cold BB frequencies can only get larger. The
    // BBsToSinkInto set can only get smaller and have a smaller
    // adjustedSumFreq, due to the earlier checking. So once we find a cold BB
    // with a frequency at least as large as the adjustedSumFreq of the
    // current BBsToSinkInto set, the earlier frequency check can never be
    // true for a future iteration. Note we could check this more
    // aggressively earlier, but in practice this ended up being more
    // expensive overall (added checking to the critical path through the loop
    // that often ended up continuing early due to an empty
    // BBsDominatedByColdestBB set, and the frequency check there was false
    // most of the time anyway).
    if (adjustedSumFreq(BBsToSinkInto, BFI) <= BFI.getBlockFreq(ColdestBB))
      break;
  }

  // Can't sink into blocks that have no valid insertion point.