[SimplifyCFG] More accurate use legality check for sinking #94462
Conversation
@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

When sinking instructions, we have to make sure that the uses of that instruction are consistent: If used in a phi node in the sink target, then the phi operands have to match the sink candidates. This allows the phi to be removed when the instruction is sunk. This case is already handled accurately.

However, what the current code doesn't handle are uses in the same block. These are just unconditionally accepted, even though this needs the same consistency check for the phi node that sinking the using instruction would introduce.

Instead, the code has another check when actually performing the sinking, which repeats the phi check (just at a later time, where all the later instructions have already been sunk and any new phis introduced).

This is problematic, because it messes up the profitability heuristic. The code will think that certain instructions will get sunk, but they actually won't. This may result in more phi nodes being created than is considered profitable. See the changed test for a case where we no longer do this after this patch.

The new approach makes sure that the uses are consistent during the initial legality check. This is based on PhiOperands, which we already collect.

The primary motivation for this is to generalize sinking to support more than one use, and doing that generalization is hard with the current split checking approach.

Full diff: https://github.com/llvm/llvm-project/pull/94462.diff

2 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index fe6ec8819ff99..b7f60928e81d7 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -1930,7 +1930,7 @@ static bool replacingOperandWithVariableIsCheap(const Instruction *I,
// PHI node (because an operand varies in each input block), add to PHIOperands.
static bool canSinkInstructions(
ArrayRef<Instruction *> Insts,
- DenseMap<Instruction *, SmallVector<Value *, 4>> &PHIOperands) {
+ DenseMap<const Use *, SmallVector<Value *, 4>> &PHIOperands) {
// Prune out obviously bad instructions to move. Each instruction must have
// exactly zero or one use, and we check later that use is by a single, common
// PHI instruction in the successor.
@@ -1981,21 +1981,19 @@ static bool canSinkInstructions(
return false;
}
- // All instructions in Insts are known to be the same opcode. If they have a
- // use, check that the only user is a PHI or in the same block as the
- // instruction, because if a user is in the same block as an instruction we're
- // contemplating sinking, it must already be determined to be sinkable.
+ // Uses must be consistent: If I0 is used in a phi node in the sink target,
+ // then the other phi operands must match the instructions from Insts. This
+ // also has to hold true for any phi nodes that would be created as a result
+ // of sinking. Both of these cases are represented by PhiOperands.
if (HasUse) {
- auto *PNUse = dyn_cast<PHINode>(*I0->user_begin());
- auto *Succ = I0->getParent()->getTerminator()->getSuccessor(0);
- if (!all_of(Insts, [&PNUse,&Succ](const Instruction *I) -> bool {
- auto *U = cast<Instruction>(*I->user_begin());
- return (PNUse &&
- PNUse->getParent() == Succ &&
- PNUse->getIncomingValueForBlock(I->getParent()) == I) ||
- U->getParent() == I->getParent();
- }))
+ const Use &U = *I0->use_begin();
+ auto It = PHIOperands.find(&U);
+ if (It == PHIOperands.end())
+ // There may be uses in other blocks when sinking into a loop header.
return false;
+ for (auto [I1, I2] : zip(Insts, It->second))
+ if (I1 != I2)
+ return false;
}
// Because SROA can't handle speculating stores of selects, try not to sink
@@ -2061,8 +2059,9 @@ static bool canSinkInstructions(
!canReplaceOperandWithVariable(I0, OI))
// We can't create a PHI from this GEP.
return false;
+ auto &Ops = PHIOperands[&I0->getOperandUse(OI)];
for (auto *I : Insts)
- PHIOperands[I].push_back(I->getOperand(OI));
+ Ops.push_back(I->getOperand(OI));
}
}
return true;
@@ -2071,7 +2070,7 @@ static bool canSinkInstructions(
// Assuming canSinkInstructions(Blocks) has returned true, sink the last
// instruction of every block in Blocks to their common successor, commoning
// into one instruction.
-static bool sinkLastInstruction(ArrayRef<BasicBlock*> Blocks) {
+static void sinkLastInstruction(ArrayRef<BasicBlock*> Blocks) {
auto *BBEnd = Blocks[0]->getTerminator()->getSuccessor(0);
// canSinkInstructions returning true guarantees that every block has at
@@ -2086,23 +2085,10 @@ static bool sinkLastInstruction(ArrayRef<BasicBlock*> Blocks) {
Insts.push_back(I);
}
- // The only checking we need to do now is that all users of all instructions
- // are the same PHI node. canSinkInstructions should have checked this but
- // it is slightly over-aggressive - it gets confused by commutative
- // instructions so double-check it here.
- Instruction *I0 = Insts.front();
- if (!I0->user_empty()) {
- auto *PNUse = dyn_cast<PHINode>(*I0->user_begin());
- if (!all_of(Insts, [&PNUse](const Instruction *I) -> bool {
- auto *U = cast<Instruction>(*I->user_begin());
- return U == PNUse;
- }))
- return false;
- }
-
// We don't need to do any more checking here; canSinkInstructions should
// have done it all for us.
SmallVector<Value*, 4> NewOperands;
+ Instruction *I0 = Insts.front();
for (unsigned O = 0, E = I0->getNumOperands(); O != E; ++O) {
// This check is different to that in canSinkInstructions. There, we
// cared about the global view once simplifycfg (and instcombine) have
@@ -2170,8 +2156,6 @@ static bool sinkLastInstruction(ArrayRef<BasicBlock*> Blocks) {
I->replaceAllUsesWith(I0);
I->eraseFromParent();
}
-
- return true;
}
namespace {
@@ -2312,9 +2296,19 @@ static bool SinkCommonCodeFromPredecessors(BasicBlock *BB,
// carry on. If we can sink an instruction but need to PHI-merge some operands
// (because they're not identical in each instruction) we add these to
// PHIOperands.
+ // We prepopulate PHIOperands with the phis that already exist in BB.
+ DenseMap<const Use *, SmallVector<Value *, 4>> PHIOperands;
+ for (PHINode &PN : BB->phis()) {
+ SmallDenseMap<BasicBlock *, const Use *, 4> IncomingVals;
+ for (const Use &U : PN.incoming_values())
+ IncomingVals.insert({PN.getIncomingBlock(U), &U});
+ auto &Ops = PHIOperands[IncomingVals[UnconditionalPreds[0]]];
+ for (BasicBlock *Pred : UnconditionalPreds)
+ Ops.push_back(*IncomingVals[Pred]);
+ }
+
int ScanIdx = 0;
SmallPtrSet<Value*,4> InstructionsToSink;
- DenseMap<Instruction*, SmallVector<Value*,4>> PHIOperands;
LockstepReverseIterator LRI(UnconditionalPreds);
while (LRI.isValid() &&
canSinkInstructions(*LRI, PHIOperands)) {
@@ -2336,20 +2330,19 @@ static bool SinkCommonCodeFromPredecessors(BasicBlock *BB,
// actually sink before encountering instruction that is unprofitable to
// sink?
auto ProfitableToSinkInstruction = [&](LockstepReverseIterator &LRI) {
- unsigned NumPHIdValues = 0;
- for (auto *I : *LRI)
- for (auto *V : PHIOperands[I]) {
- if (!InstructionsToSink.contains(V))
- ++NumPHIdValues;
+ unsigned NumPHIInsts = 0;
+ for (Use &U : (*LRI)[0]->operands()) {
+ auto It = PHIOperands.find(&U);
+ if (It != PHIOperands.end() && !all_of(It->second, [&](Value *V) {
+ return InstructionsToSink.contains(V);
+ })) {
+ ++NumPHIInsts;
// FIXME: this check is overly optimistic. We may end up not sinking
// said instruction, due to the very same profitability check.
// See @creating_too_many_phis in sink-common-code.ll.
}
- LLVM_DEBUG(dbgs() << "SINK: #phid values: " << NumPHIdValues << "\n");
- unsigned NumPHIInsts = NumPHIdValues / UnconditionalPreds.size();
- if ((NumPHIdValues % UnconditionalPreds.size()) != 0)
- NumPHIInsts++;
-
+ }
+ LLVM_DEBUG(dbgs() << "SINK: #phi insts: " << NumPHIInsts << "\n");
return NumPHIInsts <= 1;
};
@@ -2474,13 +2467,7 @@ static bool SinkCommonCodeFromPredecessors(BasicBlock *BB,
// sink is always at index 0.
LRI.reset();
- if (!sinkLastInstruction(UnconditionalPreds)) {
- LLVM_DEBUG(
- dbgs()
- << "SINK: stopping here, failed to actually sink instruction!\n");
- break;
- }
-
+ sinkLastInstruction(UnconditionalPreds);
NumSinkCommonInstrs++;
Changed = true;
}
diff --git a/llvm/test/Transforms/SimplifyCFG/X86/sink-common-code.ll b/llvm/test/Transforms/SimplifyCFG/X86/sink-common-code.ll
index 118372164c1f9..b67ee63036848 100644
--- a/llvm/test/Transforms/SimplifyCFG/X86/sink-common-code.ll
+++ b/llvm/test/Transforms/SimplifyCFG/X86/sink-common-code.ll
@@ -568,16 +568,16 @@ define zeroext i1 @test_crash(i1 zeroext %flag, ptr %i4, ptr %m, ptr %n) {
; CHECK-NEXT: br i1 [[FLAG:%.*]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
; CHECK: if.then:
; CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[I4:%.*]], align 4
+; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[TMP1]], -1
; CHECK-NEXT: br label [[IF_END:%.*]]
; CHECK: if.else:
; CHECK-NEXT: [[TMP3:%.*]] = load i32, ptr [[M:%.*]], align 4
; CHECK-NEXT: [[TMP4:%.*]] = load i32, ptr [[N:%.*]], align 4
+; CHECK-NEXT: [[TMP5:%.*]] = add i32 [[TMP3]], [[TMP4]]
; CHECK-NEXT: br label [[IF_END]]
; CHECK: if.end:
-; CHECK-NEXT: [[TMP4_SINK:%.*]] = phi i32 [ [[TMP4]], [[IF_ELSE]] ], [ -1, [[IF_THEN]] ]
-; CHECK-NEXT: [[TMP3_SINK:%.*]] = phi i32 [ [[TMP3]], [[IF_ELSE]] ], [ [[TMP1]], [[IF_THEN]] ]
-; CHECK-NEXT: [[TMP5:%.*]] = add i32 [[TMP3_SINK]], [[TMP4_SINK]]
-; CHECK-NEXT: store i32 [[TMP5]], ptr [[I4]], align 4
+; CHECK-NEXT: [[TMP5_SINK:%.*]] = phi i32 [ [[TMP5]], [[IF_ELSE]] ], [ [[TMP2]], [[IF_THEN]] ]
+; CHECK-NEXT: store i32 [[TMP5_SINK]], ptr [[I4]], align 4
; CHECK-NEXT: ret i1 true
;
entry:
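The body of @test_crash is elided above. As a hedged stand-in (the function and value names below are made up and this is not the exact test input), a similarly shaped function can be fed to SimplifyCFG with sinking enabled, e.g. with something like opt -passes='simplifycfg<sink-common-insts>' -S, to compare how many phis sinking creates. The CHECK diff above shows that for @test_crash the pass used to sink the add and the store (two phis) and now sinks only the store (one phi).

; Hedged sketch only: mirrors the shape of the CHECK lines above,
; not the real @test_crash body.
define void @sink_shape(i1 %flag, ptr %i4, ptr %m, ptr %n) {
entry:
  br i1 %flag, label %if.then, label %if.else

if.then:
  %a = load i32, ptr %i4, align 4
  %add.then = add i32 %a, -1
  store i32 %add.then, ptr %i4, align 4
  br label %if.end

if.else:
  %b = load i32, ptr %m, align 4
  %c = load i32, ptr %n, align 4
  %add.else = add i32 %b, %c
  store i32 %add.else, ptr %i4, align 4
  br label %if.end

if.end:
  ret void
}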
You can test this locally with the following command:

git-clang-format --diff db08b0999d1b42391e34ac0c469b92de98e9f550 62ec82bfb14449c13338b39ffb3bade540236cc3 -- llvm/lib/Transforms/Utils/SimplifyCFG.cpp

View the diff from clang-format here:

diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 9c7f90b061..ac9d454fd6 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -2071,7 +2071,7 @@ static bool canSinkInstructions(
// Assuming canSinkInstructions(Blocks) has returned true, sink the last
// instruction of every block in Blocks to their common successor, commoning
// into one instruction.
-static void sinkLastInstruction(ArrayRef<BasicBlock*> Blocks) {
+static void sinkLastInstruction(ArrayRef<BasicBlock *> Blocks) {
auto *BBEnd = Blocks[0]->getTerminator()->getSuccessor(0);
// canSinkInstructions returning true guarantees that every block has at
@@ -2292,8 +2292,8 @@ static bool SinkCommonCodeFromPredecessors(BasicBlock *BB,
return false;
// We take a two-step approach to tail sinking. First we scan from the end of
- // each block upwards in lockstep. If the n'th instruction from the end of each
- // block can be sunk, those instructions are added to ValuesToSink and we
+ // each block upwards in lockstep. If the n'th instruction from the end of
+ // each block can be sunk, those instructions are added to ValuesToSink and we
// carry on. If we can sink an instruction but need to PHI-merge some operands
// (because they're not identical in each instruction) we add these to
// PHIOperands.
@@ -2309,7 +2309,7 @@ static bool SinkCommonCodeFromPredecessors(BasicBlock *BB,
}
int ScanIdx = 0;
- SmallPtrSet<Value*,4> InstructionsToSink;
+ SmallPtrSet<Value *, 4> InstructionsToSink;
LockstepReverseIterator LRI(UnconditionalPreds);
while (LRI.isValid() &&
canSinkInstructions(*LRI, PHIOperands)) {
I find that this patch causes some functions to no longer be inlined. Can you provide some performance data?
Alright, this patch improves both code size and performance on llvm-test-suite. |
ping |
LGTM
Force-pushed from 8249669 to 62ec82b.
…#95521) Sinking currently only supports instructions that have zero or one uses. Extend this to handle instructions with any number of uses, as long as all uses are consistent (i.e. the "same" for all sinking candidates). After #94462 this is basically just a matter of looping over all uses instead of checking the first one only.
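As a hedged sketch of what the follow-up enables (all names below are invented for illustration, not taken from #95521's tests): each predecessor's add has two uses, a same-block multiply and a phi in the common successor, and the corresponding uses line up across both predecessors. That is exactly the consistency condition this patch checks; #95521 then checks it for every use instead of only the first one.

; Sketch only. %s1/%s2 are sink candidates with two uses each:
; the same-block multiplies %t1/%t2 (themselves sink candidates) and
; the phi %q, whose incoming values are exactly %s1 and %s2.
; Under the old one-use restriction the adds could not be considered;
; with consistent uses, adds and muls can both be commoned into %merge.
define i32 @two_consistent_uses(i1 %c, i32 %x, i32 %y) {
entry:
  br i1 %c, label %a, label %b

a:
  %s1 = add i32 %x, 1
  %t1 = mul i32 %s1, 3
  br label %merge

b:
  %s2 = add i32 %y, 1
  %t2 = mul i32 %s2, 3
  br label %merge

merge:
  %p = phi i32 [ %t1, %a ], [ %t2, %b ]
  %q = phi i32 [ %s1, %a ], [ %s2, %b ]
  %r = add i32 %p, %q
  ret i32 %r
}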
When sinking instructions, we have to make sure that the uses of that instruction are consistent: If used in a phi node in the sink target, then the phi operands have to match the sink candidates. This allows the phi to be removed when the instruction is sunk. This case is already handled accurately.
However, what the current code doesn't handle are uses in the same block. These are just unconditionally accepted, even though this needs the same consistency check for the phi node that sinking the using instruction would introduce.
Instead, the code has another check when actually performing the sinking, which repeats the phi check (just at a later time, where all the later instructions have already been sunk and any new phis introduced).
This is problematic, because it messes up the profitability heuristic. The code will think that certain instructions will get sunk, but they actually won't. This may result in more phi nodes being created than is considered profitable. See the changed test for a case where we no longer do this after this patch.
The new approach makes sure that the uses are consistent during the initial legality check. This is based on PhiOperands, which we already collect.
The primary motivation for this is to generalize sinking to support more than one use, and doing that generalization is hard with the current split checking approach.
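To make the "uses must be consistent" condition concrete, here is a minimal hedged sketch (names invented, not taken from the patch or its tests). The sink candidates' only use is a phi in the sink target, and the phi's incoming value from each predecessor is exactly that predecessor's candidate, so the phi can go away once the loads are commoned. If either incoming value were something else, the candidates could not be merged without keeping that phi.

; Sketch only. %a.v and %b.v are the sink candidates; their sole use is
; the phi %m, whose incoming value from %a is %a.v and from %b is %b.v.
; Sinking the loads into %if.end would let %m be removed (a new phi for
; the differing pointer operands %p/%q may be introduced instead).
define i32 @consistent_phi_use(i1 %c, ptr %p, ptr %q) {
entry:
  br i1 %c, label %a, label %b

a:
  %a.v = load i32, ptr %p, align 4
  br label %if.end

b:
  %b.v = load i32, ptr %q, align 4
  br label %if.end

if.end:
  %m = phi i32 [ %a.v, %a ], [ %b.v, %b ]
  ret i32 %m
}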