Skip to content

[DomTreeUpdater] Handle critical edge splitting #100856

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

paperchalice
Copy link
Contributor

@paperchalice paperchalice commented Jul 27, 2024

Unfortunately the performance of generic lazy updater is not so efficient for critical edge splitting (about +0.2%). This pull request move related codes from MachineDominatorTree to GenericDomTreeUpdater and let it support post dominator tree. This kind updates are not compatible with cfg update, so it need to submit all general updates.

Performance test: https://llvm-compile-time-tracker.com/compare.php?from=87c35d782795b54911b3e3a91a5b738d4d870e55&to=42b3e5623a9ab4c3648564dc0926b36f3b438a3a&stat=instructions%3Au

@paperchalice paperchalice force-pushed the dt-critical-edge branch 5 times, most recently from 2439bee to 5fa77d3 Compare July 29, 2024 09:06
@llvmbot
Copy link
Member

llvmbot commented Jul 29, 2024

@llvm/pr-subscribers-llvm-regalloc
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-webassembly

@llvm/pr-subscribers-backend-amdgpu

Author: None (paperchalice)

Changes

Unfortunately the performance of generic lazy updater is not so efficient for critical edge splitting. This pull request move related codes from MachineDominatorTree to GenericDomTreeUpdater and let it support post dominator tree. This kind updates are not compatible with cfg update, so it need to submit all general updates.

Performance test: https://llvm-compile-time-tracker.com/compare.php?from=87c35d782795b54911b3e3a91a5b738d4d870e55&to=42b3e5623a9ab4c3648564dc0926b36f3b438a3a&stat=instructions%3Au


Patch is 51.20 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/100856.diff

29 Files Affected:

  • (modified) llvm/include/llvm/Analysis/DomTreeUpdater.h (+3-3)
  • (modified) llvm/include/llvm/Analysis/GenericDomTreeUpdater.h (+53)
  • (modified) llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h (+141-2)
  • (modified) llvm/include/llvm/CodeGen/MachineBasicBlock.h (+16-8)
  • (modified) llvm/include/llvm/CodeGen/MachineDominators.h (+2-167)
  • (modified) llvm/include/llvm/CodeGen/MachineSSAContext.h (-6)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (+2-2)
  • (modified) llvm/lib/CodeGen/LazyMachineBlockFrequencyInfo.cpp (+2-2)
  • (modified) llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/LiveDebugValues/LiveDebugValues.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/MachineBasicBlock.cpp (+5-3)
  • (modified) llvm/lib/CodeGen/MachineDomTreeUpdater.cpp (+1)
  • (modified) llvm/lib/CodeGen/MachineDominanceFrontier.cpp (+1-2)
  • (modified) llvm/lib/CodeGen/MachineDominators.cpp (-74)
  • (modified) llvm/lib/CodeGen/MachineLICM.cpp (+17-12)
  • (modified) llvm/lib/CodeGen/MachineLoopInfo.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/MachineSink.cpp (+5-1)
  • (modified) llvm/lib/CodeGen/MachineUniformityAnalysis.cpp (+1-2)
  • (modified) llvm/lib/CodeGen/PHIElimination.cpp (+18-9)
  • (modified) llvm/lib/CodeGen/XRayInstrumentation.cpp (+2-2)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPURegBankSelect.cpp (+2-3)
  • (modified) llvm/lib/Target/AMDGPU/SILateBranchLowering.cpp (+3-3)
  • (modified) llvm/lib/Target/AMDGPU/SILowerI1Copies.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp (+1-1)
  • (modified) llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/X86/X86FlagsCopyLowering.cpp (+1-2)
  • (modified) llvm/tools/llvm-reduce/deltas/ReduceInstructionsMIR.cpp (+1-1)
  • (modified) llvm/unittests/Analysis/DomTreeUpdaterTest.cpp (+43)
  • (modified) llvm/unittests/Target/WebAssembly/WebAssemblyExceptionInfoTest.cpp (+4-4)
diff --git a/llvm/include/llvm/Analysis/DomTreeUpdater.h b/llvm/include/llvm/Analysis/DomTreeUpdater.h
index e07d2bf2b0df1..99ebe8d7923bf 100644
--- a/llvm/include/llvm/Analysis/DomTreeUpdater.h
+++ b/llvm/include/llvm/Analysis/DomTreeUpdater.h
@@ -83,6 +83,9 @@ class DomTreeUpdater
 
   ///@}
 
+  /// Debug method to help view the internal state of this class.
+  LLVM_DUMP_METHOD void dump() const;
+
 private:
   class CallBackOnDeletion final : public CallbackVH {
   public:
@@ -111,9 +114,6 @@ class DomTreeUpdater
 
   /// Returns true if at least one BasicBlock is deleted.
   bool forceFlushDeletedBB();
-
-  /// Debug method to help view the internal state of this class.
-  LLVM_DUMP_METHOD void dump() const;
 };
 
 extern template class GenericDomTreeUpdater<DomTreeUpdater, DominatorTree,
diff --git a/llvm/include/llvm/Analysis/GenericDomTreeUpdater.h b/llvm/include/llvm/Analysis/GenericDomTreeUpdater.h
index 84ed882c6de84..4fbeab80ec933 100644
--- a/llvm/include/llvm/Analysis/GenericDomTreeUpdater.h
+++ b/llvm/include/llvm/Analysis/GenericDomTreeUpdater.h
@@ -105,6 +105,12 @@ class GenericDomTreeUpdater {
     return PendUpdates.size() != PendPDTUpdateIndex;
   }
 
+  bool hasPendingCriticalEdges() const {
+    if (!PDT && !DT)
+      return false;
+    return !CriticalEdgesToSplit.empty();
+  }
+
   ///@{
   /// \name Mutation APIs
   ///
@@ -146,8 +152,25 @@ class GenericDomTreeUpdater {
   /// 2. It is illegal to submit any update that has already been submitted,
   /// i.e., you are supposed not to insert an existent edge or delete a
   /// nonexistent edge.
+  /// 3. This kind updates are incompatible with critical edge splitting
+  /// updates, call this method will apply all critical edge updates in
+  /// lazy mode. It is not recommended to interleave applyUpdates and
+  /// applyUpdatesForCriticalEdgeSplitting.
   void applyUpdates(ArrayRef<typename DomTreeT::UpdateType> Updates);
 
+  /// Apply updates that the critical edge (FromBB, ToBB) has been
+  /// split with NewBB.
+  ///
+  /// \note Do not use this method with regular edges.
+  ///
+  /// \note This kind updates are incompatible with generic updates,
+  /// call this method will submit all generic updates in lazy mode.
+  /// It is not recommended to interleave applyUpdates and
+  /// applyUpdatesForCriticalEdgeSplitting.
+  void applyUpdatesForCriticalEdgeSplitting(BasicBlockT *FromBB,
+                                            BasicBlockT *ToBB,
+                                            BasicBlockT *NewBB);
+
   /// Submit updates to all available trees. It will also
   /// 1. discard duplicated updates,
   /// 2. remove invalid updates. (Invalid updates means deletion of an edge that
@@ -197,6 +220,7 @@ class GenericDomTreeUpdater {
     applyDomTreeUpdates();
     applyPostDomTreeUpdates();
     dropOutOfDateUpdates();
+    applySplitCriticalEdges();
   }
 
   ///@}
@@ -243,6 +267,35 @@ class GenericDomTreeUpdater {
   /// Drop all updates applied by all available trees and delete BasicBlocks if
   /// all available trees are up-to-date.
   void dropOutOfDateUpdates();
+
+private:
+  /// Helper structure used to hold all the basic blocks
+  /// involved in the split of a critical edge.
+  struct CriticalEdge {
+    BasicBlockT *FromBB;
+    BasicBlockT *ToBB;
+    BasicBlockT *NewBB;
+  };
+
+  /// Pile up all the critical edges to be split.
+  /// The splitting of a critical edge is local and thus, it is possible
+  /// to apply several of those changes at the same time.
+  SmallVector<CriticalEdge, 32> CriticalEdgesToSplit;
+
+  /// Remember all the basic blocks that are inserted during
+  /// edge splitting.
+  /// Invariant: NewBBs == all the basic blocks contained in the NewBB
+  /// field of all the elements of CriticalEdgesToSplit.
+  /// I.e., forall elt in CriticalEdgesToSplit, it exists BB in NewBBs
+  /// such as BB == elt.NewBB.
+  SmallSet<BasicBlockT *, 32> NewBBs;
+
+  /// Apply all the recorded critical edges to the DT.
+  /// This updates the underlying DT information in a way that uses
+  /// the fast query path of DT as much as possible.
+  ///
+  /// \post CriticalEdgesToSplit.empty().
+  void applySplitCriticalEdges();
 };
 
 } // namespace llvm
diff --git a/llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h b/llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h
index 35c54a6062bfd..eafc7244160aa 100644
--- a/llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h
+++ b/llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h
@@ -16,6 +16,7 @@
 #ifndef LLVM_ANALYSIS_GENERICDOMTREEUPDATERIMPL_H
 #define LLVM_ANALYSIS_GENERICDOMTREEUPDATERIMPL_H
 
+#include "llvm/ADT/SmallBitVector.h"
 #include "llvm/Analysis/GenericDomTreeUpdater.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/raw_ostream.h"
@@ -61,6 +62,9 @@ void GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::applyUpdates(
     return;
 
   if (Strategy == UpdateStrategy::Lazy) {
+    if (!CriticalEdgesToSplit.empty())
+      applySplitCriticalEdges();
+
     PendUpdates.reserve(PendUpdates.size() + Updates.size());
     for (const auto &U : Updates)
       if (!isSelfDominance(U))
@@ -136,6 +140,7 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::getDomTree() {
   assert(DT && "Invalid acquisition of a null DomTree");
   applyDomTreeUpdates();
   dropOutOfDateUpdates();
+  applySplitCriticalEdges();
   return *DT;
 }
 
@@ -145,6 +150,7 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::getPostDomTree() {
   assert(PDT && "Invalid acquisition of a null PostDomTree");
   applyPostDomTreeUpdates();
   dropOutOfDateUpdates();
+  applySplitCriticalEdges();
   return *PDT;
 }
 
@@ -201,7 +207,7 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::dump() const {
           if (To) {
             auto S = To->getName();
             if (!To->hasName())
-              S = "(no_name)";
+              S = "(no name)";
             OS << S << "(" << To << ")\n";
           } else {
             OS << "(badref)\n";
@@ -229,6 +235,22 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::dump() const {
     printUpdates(I, PendUpdates.end());
   }
 
+  auto printCriticalEdges = [&](const CriticalEdge &E) {
+    auto FromName = E.FromBB->getName();
+    if (!E.FromBB->hasName())
+      FromName = "(no name)";
+    auto NewName = E.NewBB->getName();
+    if (!E.NewBB->hasName())
+      NewName = "(no name)";
+    auto ToName = E.ToBB->getName();
+    if (!E.ToBB->hasName())
+      ToName = "(no name)";
+    OS << "  " << FromName << ", " << NewName << ", " << ToName << '\n';
+  };
+  OS << "Critical edges to be split:\n";
+  for (const auto &E : CriticalEdgesToSplit)
+    printCriticalEdges(E);
+
   OS << "Pending DeletedBBs:\n";
   Index = 0;
   for (const auto *BB : DeletedBBs) {
@@ -237,7 +259,7 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::dump() const {
     if (BB->hasName())
       OS << BB->getName() << "(";
     else
-      OS << "(no_name)(";
+      OS << "(no name)(";
     OS << BB << ")\n";
   }
 #endif
@@ -348,6 +370,123 @@ void GenericDomTreeUpdater<DerivedT, DomTreeT,
   PendPDTUpdateIndex -= dropIndex;
 }
 
+template <typename DerivedT, typename DomTreeT, typename PostDomTreeT>
+void GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::
+    applyUpdatesForCriticalEdgeSplitting(BasicBlockT *FromBB, BasicBlockT *ToBB,
+                                         BasicBlockT *NewBB) {
+  if (!DT && !PDT)
+    return;
+  CriticalEdgesToSplit.push_back({FromBB, ToBB, NewBB});
+  bool Inserted = NewBBs.insert(NewBB).second;
+  (void)Inserted;
+  assert(Inserted &&
+         "A basic block inserted via edge splitting cannot appear twice");
+  if (Strategy == UpdateStrategy::Lazy) {
+    applyDomTreeUpdates();
+    applyPostDomTreeUpdates();
+  }
+  if (Strategy == UpdateStrategy::Eager)
+    applySplitCriticalEdges();
+}
+
+template <typename DerivedT, typename DomTreeT, typename PostDomTreeT>
+void GenericDomTreeUpdater<DerivedT, DomTreeT,
+                           PostDomTreeT>::applySplitCriticalEdges() {
+  // Bail out early if there is nothing to do.
+  if (CriticalEdgesToSplit.empty())
+    return;
+
+  // For each element in CriticalEdgesToSplit, remember whether or not element
+  // is the new immediate domminator of its successor. The mapping is done by
+  // index, i.e., the information for the ith element of CriticalEdgesToSplit is
+  // the ith element of IsNewIDom.
+  SmallBitVector IsNewIDom(CriticalEdgesToSplit.size(), true);
+  SmallBitVector IsNewIPDom(CriticalEdgesToSplit.size(), true);
+  size_t Idx = 0;
+
+  // Collect all the dominance properties info, before invalidating
+  // the underlying DT.
+  for (CriticalEdge &Edge : CriticalEdgesToSplit) {
+    // Update dominator information.
+    if (DT) {
+      BasicBlockT *Succ = Edge.ToBB;
+      auto *SuccDTNode = DT->getNode(Succ);
+
+      for (BasicBlockT *PredBB : predecessors(Succ)) {
+        if (PredBB == Edge.NewBB)
+          continue;
+        // If we are in this situation:
+        // FromBB1        FromBB2
+        //    +              +
+        //   + +            + +
+        //  +   +          +   +
+        // ...  Split1  Split2 ...
+        //           +   +
+        //            + +
+        //             +
+        //            Succ
+        // Instead of checking the domiance property with Split2, we check it
+        // with FromBB2 since Split2 is still unknown of the underlying DT
+        // structure.
+        if (NewBBs.count(PredBB)) {
+          assert(pred_size(PredBB) == 1 && "A basic block resulting from a "
+                                           "critical edge split has more "
+                                           "than one predecessor!");
+          PredBB = *pred_begin(PredBB);
+        }
+        if (!DT->dominates(SuccDTNode, DT->getNode(PredBB))) {
+          IsNewIDom[Idx] = false;
+          break;
+        }
+      }
+    }
+
+    // Same as DT version but from another direction.
+    if (PDT) {
+      BasicBlockT *Pred = Edge.FromBB;
+      auto *PredDTNode = PDT->getNode(Pred);
+      for (BasicBlockT *SuccBB : successors(Pred)) {
+        if (SuccBB == Edge.NewBB)
+          continue;
+        if (NewBBs.count(SuccBB)) {
+          assert(succ_size(SuccBB) == 1 && "A basic block resulting from a "
+                                           "critical edge split has more "
+                                           "than one predecessor!");
+          SuccBB = *succ_begin(SuccBB);
+        }
+        if (!PDT->dominates(PredDTNode, PDT->getNode(SuccBB))) {
+          IsNewIPDom[Idx] = false;
+          break;
+        }
+      }
+    }
+    ++Idx;
+  }
+
+  // Now, update DT with the collected dominance properties info.
+  Idx = 0;
+  for (CriticalEdge &Edge : CriticalEdgesToSplit) {
+    if (DT) {
+      // We know FromBB dominates NewBB.
+      auto *NewDTNode = DT->addNewBlock(Edge.NewBB, Edge.FromBB);
+
+      // If all the other predecessors of "Succ" are dominated by "Succ" itself
+      // then the new block is the new immediate dominator of "Succ". Otherwise,
+      // the new block doesn't dominate anything.
+      if (IsNewIDom[Idx])
+        DT->changeImmediateDominator(DT->getNode(Edge.ToBB), NewDTNode);
+    }
+    if (PDT) {
+      auto *NewPDTNode = PDT->addNewBlock(Edge.NewBB, Edge.ToBB);
+      if (IsNewIPDom[Idx])
+        PDT->changeImmediateDominator(PDT->getNode(Edge.FromBB), NewPDTNode);
+    }
+    ++Idx;
+  }
+  NewBBs.clear();
+  CriticalEdgesToSplit.clear();
+}
+
 } // namespace llvm
 
 #endif // LLVM_ANALYSIS_GENERICDOMTREEUPDATERIMPL_H
diff --git a/llvm/include/llvm/CodeGen/MachineBasicBlock.h b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
index b8153fd5d3fb7..ca47dbace8c0a 100644
--- a/llvm/include/llvm/CodeGen/MachineBasicBlock.h
+++ b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
@@ -32,6 +32,7 @@
 namespace llvm {
 
 class BasicBlock;
+class MachineDomTreeUpdater;
 class MachineFunction;
 class MCSymbol;
 class ModuleSlotTracker;
@@ -970,15 +971,17 @@ class MachineBasicBlock
   /// MachineLoopInfo, as applicable.
   MachineBasicBlock *
   SplitCriticalEdge(MachineBasicBlock *Succ, Pass &P,
-                    std::vector<SparseBitVector<>> *LiveInSets = nullptr) {
-    return SplitCriticalEdge(Succ, &P, nullptr, LiveInSets);
+                    std::vector<SparseBitVector<>> *LiveInSets = nullptr,
+                    MachineDomTreeUpdater *MDTU = nullptr) {
+    return SplitCriticalEdge(Succ, &P, nullptr, LiveInSets, MDTU);
   }
 
   MachineBasicBlock *
   SplitCriticalEdge(MachineBasicBlock *Succ,
                     MachineFunctionAnalysisManager &MFAM,
-                    std::vector<SparseBitVector<>> *LiveInSets = nullptr) {
-    return SplitCriticalEdge(Succ, nullptr, &MFAM, LiveInSets);
+                    std::vector<SparseBitVector<>> *LiveInSets = nullptr,
+                    MachineDomTreeUpdater *MDTU = nullptr) {
+    return SplitCriticalEdge(Succ, nullptr, &MFAM, LiveInSets, MDTU);
   }
 
   /// Check if the edge between this block and the given successor \p
@@ -1256,10 +1259,9 @@ class MachineBasicBlock
   void removePredecessor(MachineBasicBlock *Pred);
 
   // Helper method for new pass manager migration.
-  MachineBasicBlock *
-  SplitCriticalEdge(MachineBasicBlock *Succ, Pass *P,
-                    MachineFunctionAnalysisManager *MFAM,
-                    std::vector<SparseBitVector<>> *LiveInSets);
+  MachineBasicBlock *SplitCriticalEdge(
+      MachineBasicBlock *Succ, Pass *P, MachineFunctionAnalysisManager *MFAM,
+      std::vector<SparseBitVector<>> *LiveInSets, MachineDomTreeUpdater *MDTU);
 };
 
 raw_ostream& operator<<(raw_ostream &OS, const MachineBasicBlock &MBB);
@@ -1341,6 +1343,12 @@ inline auto successors(const MachineBasicBlock *BB) { return BB->successors(); }
 inline auto predecessors(const MachineBasicBlock *BB) {
   return BB->predecessors();
 }
+inline auto succ_size(const MachineBasicBlock *BB) { return BB->succ_size(); }
+inline auto pred_size(const MachineBasicBlock *BB) { return BB->pred_size(); }
+inline auto succ_begin(const MachineBasicBlock *BB) { return BB->succ_begin(); }
+inline auto pred_begin(const MachineBasicBlock *BB) { return BB->pred_begin(); }
+inline auto succ_end(const MachineBasicBlock *BB) { return BB->succ_end(); }
+inline auto pred_end(const MachineBasicBlock *BB) { return BB->pred_end(); }
 
 /// MachineInstrSpan provides an interface to get an iteration range
 /// containing the instruction it was initialized with, along with all
diff --git a/llvm/include/llvm/CodeGen/MachineDominators.h b/llvm/include/llvm/CodeGen/MachineDominators.h
index 74cf94398736d..61635ff64502d 100644
--- a/llvm/include/llvm/CodeGen/MachineDominators.h
+++ b/llvm/include/llvm/CodeGen/MachineDominators.h
@@ -73,86 +73,22 @@ extern template bool Verify<MBBDomTree>(const MBBDomTree &DT,
 /// compute a normal dominator tree.
 ///
 class MachineDominatorTree : public DomTreeBase<MachineBasicBlock> {
-  /// Helper structure used to hold all the basic blocks
-  /// involved in the split of a critical edge.
-  struct CriticalEdge {
-    MachineBasicBlock *FromBB;
-    MachineBasicBlock *ToBB;
-    MachineBasicBlock *NewBB;
-  };
-
-  /// Pile up all the critical edges to be split.
-  /// The splitting of a critical edge is local and thus, it is possible
-  /// to apply several of those changes at the same time.
-  mutable SmallVector<CriticalEdge, 32> CriticalEdgesToSplit;
-
-  /// Remember all the basic blocks that are inserted during
-  /// edge splitting.
-  /// Invariant: NewBBs == all the basic blocks contained in the NewBB
-  /// field of all the elements of CriticalEdgesToSplit.
-  /// I.e., forall elt in CriticalEdgesToSplit, it exists BB in NewBBs
-  /// such as BB == elt.NewBB.
-  mutable SmallSet<MachineBasicBlock *, 32> NewBBs;
-
-  /// Apply all the recorded critical edges to the DT.
-  /// This updates the underlying DT information in a way that uses
-  /// the fast query path of DT as much as possible.
-  /// FIXME: This method should not be a const member!
-  ///
-  /// \post CriticalEdgesToSplit.empty().
-  void applySplitCriticalEdges() const;
 
 public:
   using Base = DomTreeBase<MachineBasicBlock>;
 
   MachineDominatorTree() = default;
-  explicit MachineDominatorTree(MachineFunction &MF) { calculate(MF); }
+  explicit MachineDominatorTree(MachineFunction &MF) { recalculate(MF); }
 
   /// Handle invalidation explicitly.
   bool invalidate(MachineFunction &, const PreservedAnalyses &PA,
                   MachineFunctionAnalysisManager::Invalidator &);
 
-  // FIXME: If there is an updater for MachineDominatorTree,
-  // migrate to this updater and remove these wrappers.
-
-  MachineDominatorTree &getBase() {
-    applySplitCriticalEdges();
-    return *this;
-  }
-
-  MachineBasicBlock *getRoot() const {
-    applySplitCriticalEdges();
-    return Base::getRoot();
-  }
-
-  MachineDomTreeNode *getRootNode() const {
-    applySplitCriticalEdges();
-    return const_cast<MachineDomTreeNode *>(Base::getRootNode());
-  }
-
-  void calculate(MachineFunction &F);
-
-  bool dominates(const MachineDomTreeNode *A,
-                 const MachineDomTreeNode *B) const {
-    applySplitCriticalEdges();
-    return Base::dominates(A, B);
-  }
-
-  void getDescendants(MachineBasicBlock *A,
-                      SmallVectorImpl<MachineBasicBlock *> &Result) {
-    applySplitCriticalEdges();
-    Base::getDescendants(A, Result);
-  }
-
-  bool dominates(const MachineBasicBlock *A, const MachineBasicBlock *B) const {
-    applySplitCriticalEdges();
-    return Base::dominates(A, B);
-  }
+  using Base::dominates;
 
   // dominates - Return true if A dominates B. This performs the
   // special checks necessary if A and B are in the same basic block.
   bool dominates(const MachineInstr *A, const MachineInstr *B) const {
-    applySplitCriticalEdges();
     const MachineBasicBlock *BBA = A->getParent(), *BBB = B->getParent();
     if (BBA != BBB)
       return Base::dominates(BBA, BBB);
@@ -164,107 +100,6 @@ class MachineDominatorTree : public DomTreeBase<MachineBasicBlock> {
 
     return &*I == A;
   }
-
-  bool properlyDominates(const MachineDomTreeNode *A,
-                         const MachineDomTreeNode *B) const {
-    applySplitCriticalEdges();
-    return Base::properlyDominates(A, B);
-  }
-
-  bool properlyDominates(const MachineBasicBlock *A,
-                         const MachineBasicBlock *B) const {
-    applySplitCriticalEdges();
-    return Base::properlyDominates(A, B);
-  }
-
-  /// findNearestCommonDominator - Find nearest common dominator basic block
-  /// for basic block A and B. If there is no such block then return NULL.
-  MachineBasicBlock *findNearestCommonDominator(MachineBasicBlock *A,
-                                                MachineBasicBlock *B) {
-    applySplitCriticalEdges();
-    return Base::findNearestCommonDominator(A, B);
-  }
-
-  MachineDomTreeNode *operator[](MachineBasicBlock *BB) const {
-    applySplitCriticalEdges();
-    return Base::getNode(BB);
-  }
-
-  /// getNode - return the (Post)DominatorTree node for the specified basic
-  /// block.  This is the same as using operator[] on this class.
-  ///
-  MachineDomTreeNode *getNode(MachineBasicBlock *BB) const {
-    applySplitCriticalEdges();
-    return Base::getNode(BB);
-  }
-
-  /// addNewBlock - Add a new node to the dominator tree information.  This
-  /// creates a new node as a child of DomBB dominator node,linking it into
-  /// the children list of the immediate dominator.
-  MachineDomTreeNode *addNewBlock(MachineBasicBlock *BB,
-                                  MachineBasicBlock *DomBB) {
-    applySplitCriticalEdges();
-    return Base::addNewBlock(BB, DomBB);
-  }
-
-  /// changeImmediateDominator - This method is used to update the dominator
-  /// tree information when a node's immediate dominator changes.
-  ///
-  void changeImmediateDominator(MachineBasicBlock *N,
-                                MachineBasicBlock *NewIDom) {
-    applySplitCriticalEdges();
-    Base::changeImmediateDominator(N, NewIDom);
-  }
-
-  void changeImmediateDominator(MachineDomTreeNode *N,
-                                MachineDomTreeNode *NewIDom) {
-    applySplitCriticalEdges();
-    Base::changeImmediateDominator(N, NewIDom);
-  }
-
-  /// eraseNode - Removes a node from  the dominator tree. Block must not
-  /// dominate any other blocks. Re...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Jul 29, 2024

@llvm/pr-subscribers-backend-hexagon

Author: None (paperchalice)

Changes

Unfortunately the performance of generic lazy updater is not so efficient for critical edge splitting. This pull request move related codes from MachineDominatorTree to GenericDomTreeUpdater and let it support post dominator tree. This kind updates are not compatible with cfg update, so it need to submit all general updates.

Performance test: https://llvm-compile-time-tracker.com/compare.php?from=87c35d782795b54911b3e3a91a5b738d4d870e55&amp;to=42b3e5623a9ab4c3648564dc0926b36f3b438a3a&amp;stat=instructions%3Au


Patch is 51.20 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/100856.diff

29 Files Affected:

  • (modified) llvm/include/llvm/Analysis/DomTreeUpdater.h (+3-3)
  • (modified) llvm/include/llvm/Analysis/GenericDomTreeUpdater.h (+53)
  • (modified) llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h (+141-2)
  • (modified) llvm/include/llvm/CodeGen/MachineBasicBlock.h (+16-8)
  • (modified) llvm/include/llvm/CodeGen/MachineDominators.h (+2-167)
  • (modified) llvm/include/llvm/CodeGen/MachineSSAContext.h (-6)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (+2-2)
  • (modified) llvm/lib/CodeGen/LazyMachineBlockFrequencyInfo.cpp (+2-2)
  • (modified) llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/LiveDebugValues/LiveDebugValues.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/MachineBasicBlock.cpp (+5-3)
  • (modified) llvm/lib/CodeGen/MachineDomTreeUpdater.cpp (+1)
  • (modified) llvm/lib/CodeGen/MachineDominanceFrontier.cpp (+1-2)
  • (modified) llvm/lib/CodeGen/MachineDominators.cpp (-74)
  • (modified) llvm/lib/CodeGen/MachineLICM.cpp (+17-12)
  • (modified) llvm/lib/CodeGen/MachineLoopInfo.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/MachineSink.cpp (+5-1)
  • (modified) llvm/lib/CodeGen/MachineUniformityAnalysis.cpp (+1-2)
  • (modified) llvm/lib/CodeGen/PHIElimination.cpp (+18-9)
  • (modified) llvm/lib/CodeGen/XRayInstrumentation.cpp (+2-2)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPURegBankSelect.cpp (+2-3)
  • (modified) llvm/lib/Target/AMDGPU/SILateBranchLowering.cpp (+3-3)
  • (modified) llvm/lib/Target/AMDGPU/SILowerI1Copies.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp (+1-1)
  • (modified) llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/X86/X86FlagsCopyLowering.cpp (+1-2)
  • (modified) llvm/tools/llvm-reduce/deltas/ReduceInstructionsMIR.cpp (+1-1)
  • (modified) llvm/unittests/Analysis/DomTreeUpdaterTest.cpp (+43)
  • (modified) llvm/unittests/Target/WebAssembly/WebAssemblyExceptionInfoTest.cpp (+4-4)
diff --git a/llvm/include/llvm/Analysis/DomTreeUpdater.h b/llvm/include/llvm/Analysis/DomTreeUpdater.h
index e07d2bf2b0df1..99ebe8d7923bf 100644
--- a/llvm/include/llvm/Analysis/DomTreeUpdater.h
+++ b/llvm/include/llvm/Analysis/DomTreeUpdater.h
@@ -83,6 +83,9 @@ class DomTreeUpdater
 
   ///@}
 
+  /// Debug method to help view the internal state of this class.
+  LLVM_DUMP_METHOD void dump() const;
+
 private:
   class CallBackOnDeletion final : public CallbackVH {
   public:
@@ -111,9 +114,6 @@ class DomTreeUpdater
 
   /// Returns true if at least one BasicBlock is deleted.
   bool forceFlushDeletedBB();
-
-  /// Debug method to help view the internal state of this class.
-  LLVM_DUMP_METHOD void dump() const;
 };
 
 extern template class GenericDomTreeUpdater<DomTreeUpdater, DominatorTree,
diff --git a/llvm/include/llvm/Analysis/GenericDomTreeUpdater.h b/llvm/include/llvm/Analysis/GenericDomTreeUpdater.h
index 84ed882c6de84..4fbeab80ec933 100644
--- a/llvm/include/llvm/Analysis/GenericDomTreeUpdater.h
+++ b/llvm/include/llvm/Analysis/GenericDomTreeUpdater.h
@@ -105,6 +105,12 @@ class GenericDomTreeUpdater {
     return PendUpdates.size() != PendPDTUpdateIndex;
   }
 
+  bool hasPendingCriticalEdges() const {
+    if (!PDT && !DT)
+      return false;
+    return !CriticalEdgesToSplit.empty();
+  }
+
   ///@{
   /// \name Mutation APIs
   ///
@@ -146,8 +152,25 @@ class GenericDomTreeUpdater {
   /// 2. It is illegal to submit any update that has already been submitted,
   /// i.e., you are supposed not to insert an existent edge or delete a
   /// nonexistent edge.
+  /// 3. This kind updates are incompatible with critical edge splitting
+  /// updates, call this method will apply all critical edge updates in
+  /// lazy mode. It is not recommended to interleave applyUpdates and
+  /// applyUpdatesForCriticalEdgeSplitting.
   void applyUpdates(ArrayRef<typename DomTreeT::UpdateType> Updates);
 
+  /// Apply updates that the critical edge (FromBB, ToBB) has been
+  /// split with NewBB.
+  ///
+  /// \note Do not use this method with regular edges.
+  ///
+  /// \note This kind updates are incompatible with generic updates,
+  /// call this method will submit all generic updates in lazy mode.
+  /// It is not recommended to interleave applyUpdates and
+  /// applyUpdatesForCriticalEdgeSplitting.
+  void applyUpdatesForCriticalEdgeSplitting(BasicBlockT *FromBB,
+                                            BasicBlockT *ToBB,
+                                            BasicBlockT *NewBB);
+
   /// Submit updates to all available trees. It will also
   /// 1. discard duplicated updates,
   /// 2. remove invalid updates. (Invalid updates means deletion of an edge that
@@ -197,6 +220,7 @@ class GenericDomTreeUpdater {
     applyDomTreeUpdates();
     applyPostDomTreeUpdates();
     dropOutOfDateUpdates();
+    applySplitCriticalEdges();
   }
 
   ///@}
@@ -243,6 +267,35 @@ class GenericDomTreeUpdater {
   /// Drop all updates applied by all available trees and delete BasicBlocks if
   /// all available trees are up-to-date.
   void dropOutOfDateUpdates();
+
+private:
+  /// Helper structure used to hold all the basic blocks
+  /// involved in the split of a critical edge.
+  struct CriticalEdge {
+    BasicBlockT *FromBB;
+    BasicBlockT *ToBB;
+    BasicBlockT *NewBB;
+  };
+
+  /// Pile up all the critical edges to be split.
+  /// The splitting of a critical edge is local and thus, it is possible
+  /// to apply several of those changes at the same time.
+  SmallVector<CriticalEdge, 32> CriticalEdgesToSplit;
+
+  /// Remember all the basic blocks that are inserted during
+  /// edge splitting.
+  /// Invariant: NewBBs == all the basic blocks contained in the NewBB
+  /// field of all the elements of CriticalEdgesToSplit.
+  /// I.e., forall elt in CriticalEdgesToSplit, it exists BB in NewBBs
+  /// such as BB == elt.NewBB.
+  SmallSet<BasicBlockT *, 32> NewBBs;
+
+  /// Apply all the recorded critical edges to the DT.
+  /// This updates the underlying DT information in a way that uses
+  /// the fast query path of DT as much as possible.
+  ///
+  /// \post CriticalEdgesToSplit.empty().
+  void applySplitCriticalEdges();
 };
 
 } // namespace llvm
diff --git a/llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h b/llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h
index 35c54a6062bfd..eafc7244160aa 100644
--- a/llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h
+++ b/llvm/include/llvm/Analysis/GenericDomTreeUpdaterImpl.h
@@ -16,6 +16,7 @@
 #ifndef LLVM_ANALYSIS_GENERICDOMTREEUPDATERIMPL_H
 #define LLVM_ANALYSIS_GENERICDOMTREEUPDATERIMPL_H
 
+#include "llvm/ADT/SmallBitVector.h"
 #include "llvm/Analysis/GenericDomTreeUpdater.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/raw_ostream.h"
@@ -61,6 +62,9 @@ void GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::applyUpdates(
     return;
 
   if (Strategy == UpdateStrategy::Lazy) {
+    if (!CriticalEdgesToSplit.empty())
+      applySplitCriticalEdges();
+
     PendUpdates.reserve(PendUpdates.size() + Updates.size());
     for (const auto &U : Updates)
       if (!isSelfDominance(U))
@@ -136,6 +140,7 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::getDomTree() {
   assert(DT && "Invalid acquisition of a null DomTree");
   applyDomTreeUpdates();
   dropOutOfDateUpdates();
+  applySplitCriticalEdges();
   return *DT;
 }
 
@@ -145,6 +150,7 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::getPostDomTree() {
   assert(PDT && "Invalid acquisition of a null PostDomTree");
   applyPostDomTreeUpdates();
   dropOutOfDateUpdates();
+  applySplitCriticalEdges();
   return *PDT;
 }
 
@@ -201,7 +207,7 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::dump() const {
           if (To) {
             auto S = To->getName();
             if (!To->hasName())
-              S = "(no_name)";
+              S = "(no name)";
             OS << S << "(" << To << ")\n";
           } else {
             OS << "(badref)\n";
@@ -229,6 +235,22 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::dump() const {
     printUpdates(I, PendUpdates.end());
   }
 
+  auto printCriticalEdges = [&](const CriticalEdge &E) {
+    auto FromName = E.FromBB->getName();
+    if (!E.FromBB->hasName())
+      FromName = "(no name)";
+    auto NewName = E.NewBB->getName();
+    if (!E.NewBB->hasName())
+      NewName = "(no name)";
+    auto ToName = E.ToBB->getName();
+    if (!E.ToBB->hasName())
+      ToName = "(no name)";
+    OS << "  " << FromName << ", " << NewName << ", " << ToName << '\n';
+  };
+  OS << "Critical edges to be split:\n";
+  for (const auto &E : CriticalEdgesToSplit)
+    printCriticalEdges(E);
+
   OS << "Pending DeletedBBs:\n";
   Index = 0;
   for (const auto *BB : DeletedBBs) {
@@ -237,7 +259,7 @@ GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::dump() const {
     if (BB->hasName())
       OS << BB->getName() << "(";
     else
-      OS << "(no_name)(";
+      OS << "(no name)(";
     OS << BB << ")\n";
   }
 #endif
@@ -348,6 +370,123 @@ void GenericDomTreeUpdater<DerivedT, DomTreeT,
   PendPDTUpdateIndex -= dropIndex;
 }
 
+template <typename DerivedT, typename DomTreeT, typename PostDomTreeT>
+void GenericDomTreeUpdater<DerivedT, DomTreeT, PostDomTreeT>::
+    applyUpdatesForCriticalEdgeSplitting(BasicBlockT *FromBB, BasicBlockT *ToBB,
+                                         BasicBlockT *NewBB) {
+  if (!DT && !PDT)
+    return;
+  CriticalEdgesToSplit.push_back({FromBB, ToBB, NewBB});
+  bool Inserted = NewBBs.insert(NewBB).second;
+  (void)Inserted;
+  assert(Inserted &&
+         "A basic block inserted via edge splitting cannot appear twice");
+  if (Strategy == UpdateStrategy::Lazy) {
+    applyDomTreeUpdates();
+    applyPostDomTreeUpdates();
+  }
+  if (Strategy == UpdateStrategy::Eager)
+    applySplitCriticalEdges();
+}
+
+template <typename DerivedT, typename DomTreeT, typename PostDomTreeT>
+void GenericDomTreeUpdater<DerivedT, DomTreeT,
+                           PostDomTreeT>::applySplitCriticalEdges() {
+  // Bail out early if there is nothing to do.
+  if (CriticalEdgesToSplit.empty())
+    return;
+
+  // For each element in CriticalEdgesToSplit, remember whether or not element
+  // is the new immediate domminator of its successor. The mapping is done by
+  // index, i.e., the information for the ith element of CriticalEdgesToSplit is
+  // the ith element of IsNewIDom.
+  SmallBitVector IsNewIDom(CriticalEdgesToSplit.size(), true);
+  SmallBitVector IsNewIPDom(CriticalEdgesToSplit.size(), true);
+  size_t Idx = 0;
+
+  // Collect all the dominance properties info, before invalidating
+  // the underlying DT.
+  for (CriticalEdge &Edge : CriticalEdgesToSplit) {
+    // Update dominator information.
+    if (DT) {
+      BasicBlockT *Succ = Edge.ToBB;
+      auto *SuccDTNode = DT->getNode(Succ);
+
+      for (BasicBlockT *PredBB : predecessors(Succ)) {
+        if (PredBB == Edge.NewBB)
+          continue;
+        // If we are in this situation:
+        // FromBB1        FromBB2
+        //    +              +
+        //   + +            + +
+        //  +   +          +   +
+        // ...  Split1  Split2 ...
+        //           +   +
+        //            + +
+        //             +
+        //            Succ
+        // Instead of checking the domiance property with Split2, we check it
+        // with FromBB2 since Split2 is still unknown of the underlying DT
+        // structure.
+        if (NewBBs.count(PredBB)) {
+          assert(pred_size(PredBB) == 1 && "A basic block resulting from a "
+                                           "critical edge split has more "
+                                           "than one predecessor!");
+          PredBB = *pred_begin(PredBB);
+        }
+        if (!DT->dominates(SuccDTNode, DT->getNode(PredBB))) {
+          IsNewIDom[Idx] = false;
+          break;
+        }
+      }
+    }
+
+    // Same as DT version but from another direction.
+    if (PDT) {
+      BasicBlockT *Pred = Edge.FromBB;
+      auto *PredDTNode = PDT->getNode(Pred);
+      for (BasicBlockT *SuccBB : successors(Pred)) {
+        if (SuccBB == Edge.NewBB)
+          continue;
+        if (NewBBs.count(SuccBB)) {
+          assert(succ_size(SuccBB) == 1 && "A basic block resulting from a "
+                                           "critical edge split has more "
+                                           "than one predecessor!");
+          SuccBB = *succ_begin(SuccBB);
+        }
+        if (!PDT->dominates(PredDTNode, PDT->getNode(SuccBB))) {
+          IsNewIPDom[Idx] = false;
+          break;
+        }
+      }
+    }
+    ++Idx;
+  }
+
+  // Now, update DT with the collected dominance properties info.
+  Idx = 0;
+  for (CriticalEdge &Edge : CriticalEdgesToSplit) {
+    if (DT) {
+      // We know FromBB dominates NewBB.
+      auto *NewDTNode = DT->addNewBlock(Edge.NewBB, Edge.FromBB);
+
+      // If all the other predecessors of "Succ" are dominated by "Succ" itself
+      // then the new block is the new immediate dominator of "Succ". Otherwise,
+      // the new block doesn't dominate anything.
+      if (IsNewIDom[Idx])
+        DT->changeImmediateDominator(DT->getNode(Edge.ToBB), NewDTNode);
+    }
+    if (PDT) {
+      auto *NewPDTNode = PDT->addNewBlock(Edge.NewBB, Edge.ToBB);
+      if (IsNewIPDom[Idx])
+        PDT->changeImmediateDominator(PDT->getNode(Edge.FromBB), NewPDTNode);
+    }
+    ++Idx;
+  }
+  NewBBs.clear();
+  CriticalEdgesToSplit.clear();
+}
+
 } // namespace llvm
 
 #endif // LLVM_ANALYSIS_GENERICDOMTREEUPDATERIMPL_H
diff --git a/llvm/include/llvm/CodeGen/MachineBasicBlock.h b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
index b8153fd5d3fb7..ca47dbace8c0a 100644
--- a/llvm/include/llvm/CodeGen/MachineBasicBlock.h
+++ b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
@@ -32,6 +32,7 @@
 namespace llvm {
 
 class BasicBlock;
+class MachineDomTreeUpdater;
 class MachineFunction;
 class MCSymbol;
 class ModuleSlotTracker;
@@ -970,15 +971,17 @@ class MachineBasicBlock
   /// MachineLoopInfo, as applicable.
   MachineBasicBlock *
   SplitCriticalEdge(MachineBasicBlock *Succ, Pass &P,
-                    std::vector<SparseBitVector<>> *LiveInSets = nullptr) {
-    return SplitCriticalEdge(Succ, &P, nullptr, LiveInSets);
+                    std::vector<SparseBitVector<>> *LiveInSets = nullptr,
+                    MachineDomTreeUpdater *MDTU = nullptr) {
+    return SplitCriticalEdge(Succ, &P, nullptr, LiveInSets, MDTU);
   }
 
   MachineBasicBlock *
   SplitCriticalEdge(MachineBasicBlock *Succ,
                     MachineFunctionAnalysisManager &MFAM,
-                    std::vector<SparseBitVector<>> *LiveInSets = nullptr) {
-    return SplitCriticalEdge(Succ, nullptr, &MFAM, LiveInSets);
+                    std::vector<SparseBitVector<>> *LiveInSets = nullptr,
+                    MachineDomTreeUpdater *MDTU = nullptr) {
+    return SplitCriticalEdge(Succ, nullptr, &MFAM, LiveInSets, MDTU);
   }
 
   /// Check if the edge between this block and the given successor \p
@@ -1256,10 +1259,9 @@ class MachineBasicBlock
   void removePredecessor(MachineBasicBlock *Pred);
 
   // Helper method for new pass manager migration.
-  MachineBasicBlock *
-  SplitCriticalEdge(MachineBasicBlock *Succ, Pass *P,
-                    MachineFunctionAnalysisManager *MFAM,
-                    std::vector<SparseBitVector<>> *LiveInSets);
+  MachineBasicBlock *SplitCriticalEdge(
+      MachineBasicBlock *Succ, Pass *P, MachineFunctionAnalysisManager *MFAM,
+      std::vector<SparseBitVector<>> *LiveInSets, MachineDomTreeUpdater *MDTU);
 };
 
 raw_ostream& operator<<(raw_ostream &OS, const MachineBasicBlock &MBB);
@@ -1341,6 +1343,12 @@ inline auto successors(const MachineBasicBlock *BB) { return BB->successors(); }
 inline auto predecessors(const MachineBasicBlock *BB) {
   return BB->predecessors();
 }
+inline auto succ_size(const MachineBasicBlock *BB) { return BB->succ_size(); }
+inline auto pred_size(const MachineBasicBlock *BB) { return BB->pred_size(); }
+inline auto succ_begin(const MachineBasicBlock *BB) { return BB->succ_begin(); }
+inline auto pred_begin(const MachineBasicBlock *BB) { return BB->pred_begin(); }
+inline auto succ_end(const MachineBasicBlock *BB) { return BB->succ_end(); }
+inline auto pred_end(const MachineBasicBlock *BB) { return BB->pred_end(); }
 
 /// MachineInstrSpan provides an interface to get an iteration range
 /// containing the instruction it was initialized with, along with all
diff --git a/llvm/include/llvm/CodeGen/MachineDominators.h b/llvm/include/llvm/CodeGen/MachineDominators.h
index 74cf94398736d..61635ff64502d 100644
--- a/llvm/include/llvm/CodeGen/MachineDominators.h
+++ b/llvm/include/llvm/CodeGen/MachineDominators.h
@@ -73,86 +73,22 @@ extern template bool Verify<MBBDomTree>(const MBBDomTree &DT,
 /// compute a normal dominator tree.
 ///
 class MachineDominatorTree : public DomTreeBase<MachineBasicBlock> {
-  /// Helper structure used to hold all the basic blocks
-  /// involved in the split of a critical edge.
-  struct CriticalEdge {
-    MachineBasicBlock *FromBB;
-    MachineBasicBlock *ToBB;
-    MachineBasicBlock *NewBB;
-  };
-
-  /// Pile up all the critical edges to be split.
-  /// The splitting of a critical edge is local and thus, it is possible
-  /// to apply several of those changes at the same time.
-  mutable SmallVector<CriticalEdge, 32> CriticalEdgesToSplit;
-
-  /// Remember all the basic blocks that are inserted during
-  /// edge splitting.
-  /// Invariant: NewBBs == all the basic blocks contained in the NewBB
-  /// field of all the elements of CriticalEdgesToSplit.
-  /// I.e., forall elt in CriticalEdgesToSplit, it exists BB in NewBBs
-  /// such as BB == elt.NewBB.
-  mutable SmallSet<MachineBasicBlock *, 32> NewBBs;
-
-  /// Apply all the recorded critical edges to the DT.
-  /// This updates the underlying DT information in a way that uses
-  /// the fast query path of DT as much as possible.
-  /// FIXME: This method should not be a const member!
-  ///
-  /// \post CriticalEdgesToSplit.empty().
-  void applySplitCriticalEdges() const;
 
 public:
   using Base = DomTreeBase<MachineBasicBlock>;
 
   MachineDominatorTree() = default;
-  explicit MachineDominatorTree(MachineFunction &MF) { calculate(MF); }
+  explicit MachineDominatorTree(MachineFunction &MF) { recalculate(MF); }
 
   /// Handle invalidation explicitly.
   bool invalidate(MachineFunction &, const PreservedAnalyses &PA,
                   MachineFunctionAnalysisManager::Invalidator &);
 
-  // FIXME: If there is an updater for MachineDominatorTree,
-  // migrate to this updater and remove these wrappers.
-
-  MachineDominatorTree &getBase() {
-    applySplitCriticalEdges();
-    return *this;
-  }
-
-  MachineBasicBlock *getRoot() const {
-    applySplitCriticalEdges();
-    return Base::getRoot();
-  }
-
-  MachineDomTreeNode *getRootNode() const {
-    applySplitCriticalEdges();
-    return const_cast<MachineDomTreeNode *>(Base::getRootNode());
-  }
-
-  void calculate(MachineFunction &F);
-
-  bool dominates(const MachineDomTreeNode *A,
-                 const MachineDomTreeNode *B) const {
-    applySplitCriticalEdges();
-    return Base::dominates(A, B);
-  }
-
-  void getDescendants(MachineBasicBlock *A,
-                      SmallVectorImpl<MachineBasicBlock *> &Result) {
-    applySplitCriticalEdges();
-    Base::getDescendants(A, Result);
-  }
-
-  bool dominates(const MachineBasicBlock *A, const MachineBasicBlock *B) const {
-    applySplitCriticalEdges();
-    return Base::dominates(A, B);
-  }
+  using Base::dominates;
 
   // dominates - Return true if A dominates B. This performs the
   // special checks necessary if A and B are in the same basic block.
   bool dominates(const MachineInstr *A, const MachineInstr *B) const {
-    applySplitCriticalEdges();
     const MachineBasicBlock *BBA = A->getParent(), *BBB = B->getParent();
     if (BBA != BBB)
       return Base::dominates(BBA, BBB);
@@ -164,107 +100,6 @@ class MachineDominatorTree : public DomTreeBase<MachineBasicBlock> {
 
     return &*I == A;
   }
-
-  bool properlyDominates(const MachineDomTreeNode *A,
-                         const MachineDomTreeNode *B) const {
-    applySplitCriticalEdges();
-    return Base::properlyDominates(A, B);
-  }
-
-  bool properlyDominates(const MachineBasicBlock *A,
-                         const MachineBasicBlock *B) const {
-    applySplitCriticalEdges();
-    return Base::properlyDominates(A, B);
-  }
-
-  /// findNearestCommonDominator - Find nearest common dominator basic block
-  /// for basic block A and B. If there is no such block then return NULL.
-  MachineBasicBlock *findNearestCommonDominator(MachineBasicBlock *A,
-                                                MachineBasicBlock *B) {
-    applySplitCriticalEdges();
-    return Base::findNearestCommonDominator(A, B);
-  }
-
-  MachineDomTreeNode *operator[](MachineBasicBlock *BB) const {
-    applySplitCriticalEdges();
-    return Base::getNode(BB);
-  }
-
-  /// getNode - return the (Post)DominatorTree node for the specified basic
-  /// block.  This is the same as using operator[] on this class.
-  ///
-  MachineDomTreeNode *getNode(MachineBasicBlock *BB) const {
-    applySplitCriticalEdges();
-    return Base::getNode(BB);
-  }
-
-  /// addNewBlock - Add a new node to the dominator tree information.  This
-  /// creates a new node as a child of DomBB dominator node,linking it into
-  /// the children list of the immediate dominator.
-  MachineDomTreeNode *addNewBlock(MachineBasicBlock *BB,
-                                  MachineBasicBlock *DomBB) {
-    applySplitCriticalEdges();
-    return Base::addNewBlock(BB, DomBB);
-  }
-
-  /// changeImmediateDominator - This method is used to update the dominator
-  /// tree information when a node's immediate dominator changes.
-  ///
-  void changeImmediateDominator(MachineBasicBlock *N,
-                                MachineBasicBlock *NewIDom) {
-    applySplitCriticalEdges();
-    Base::changeImmediateDominator(N, NewIDom);
-  }
-
-  void changeImmediateDominator(MachineDomTreeNode *N,
-                                MachineDomTreeNode *NewIDom) {
-    applySplitCriticalEdges();
-    Base::changeImmediateDominator(N, NewIDom);
-  }
-
-  /// eraseNode - Removes a node from  the dominator tree. Block must not
-  /// dominate any other blocks. Re...
[truncated]

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you wait a few days for me to review this?

Unfortunately the performance of generic lazy updater is not so efficient for critical edge splitting (about +0.2%).

I remember this was the original reason for us leaving the 'manual' update mechanism around

@kuhar kuhar self-requested a review July 29, 2024 14:31
@paperchalice
Copy link
Contributor Author

Could you wait a few days for me to review this?

Sure. :)

Comment on lines +155 to +158
/// 3. This kind updates are incompatible with critical edge splitting
/// updates, call this method will apply all critical edge updates in
/// lazy mode. It is not recommended to interleave applyUpdates and
/// applyUpdatesForCriticalEdgeSplitting.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't parse this sentence. Could you explain it in a bit more detail?

Specifically, it is not clear to me how to reconcile 'incompatible' with 'not recommended'. What happens if we do add edge splitting updates in eager and lazy mode?

Copy link
Contributor Author

@paperchalice paperchalice Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for my poor expression.
In lazy mode

DomTreeUpdater DTU(...); // Assume in lazy mode.
DTU.applyUpdates({...}); // Several general updates.
DTU.applyUpdatesForCriticalEdgeSplitting(...); // Is equivalent to DTU.flush(); DTU.applyUpdatesForCriticalEdgeSplitting(...);
DTU.applyUpdatesForCriticalEdgeSplitting(...); // DTU will record this kind update until flush() or applyUpdates() is called.

Because cfg::Update doesn't model the update kind for critical edge splitting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about exposing a more general update type that captures both edge additions/deletions and critical edge splitting? Would that make sense? I'm trying to see if we can avoid adding new methods and, in turn, having to document intricacies like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From user perspective a unified update method is better, i.e. applyUpdates({{From1, To1, UpdateKind::Insert}, {From, To, New, UpdateKind::Split}...}), then cfg::Update<NodePtrT> is not enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then cfg::Update is not enough.

Could you explain this part? I'm not sure which API we are talking about and what the limitations are.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently all update operations are based on cfg::Update. If we extend cfg::Update directly e.g. add a new kind UpdateKind::Split, then it is duplicated with two insertions and deletion, and causes some compiler warnings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could update cfg::Update though and add one more node ref there. I still don't understand what the issue is here:

then it is duplicated with two insertions and deletion, and causes some compiler warnings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A critical splitting update can be expressed as insert two new edges and delete old critical edge.
Current CFG codes rely on the dichotomy of UpdateKind, a new update kind makes old match not exhaustive.
Some CFG related codes need to be updated, e.g. LegalizeUpdates which is not easy to update, because it assumes cfg::Update only has two pointer fields...

Comment on lines +280 to +283
/// Pile up all the critical edges to be split.
/// The splitting of a critical edge is local and thus, it is possible
/// to apply several of those changes at the same time.
SmallVector<CriticalEdge, 32> CriticalEdgesToSplit;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is local, what's the benefit of enqueuing critical edge updates vs. performing them eagerly?

Comment on lines +280 to +283
/// Pile up all the critical edges to be split.
/// The splitting of a critical edge is local and thus, it is possible
/// to apply several of those changes at the same time.
SmallVector<CriticalEdge, 32> CriticalEdgesToSplit;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you pick the small size of 32?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied from abea99f.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be able to compile a few representative programs (e.g., clang) and print the maximum and average size? Otherwise I'd stick to the default small size.

Copy link
Contributor Author

@paperchalice paperchalice Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with bzip2:

maximum: 92
average: 8.70909

Most will not exceed 7.

Copy link
Member

@kuhar kuhar Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the due diligence! Then I'd stick to the default value

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants