Skip to content

[DFAJumpThreading] Handle select unfolding when user phi is not a dir… #109511

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 24, 2024

Conversation

UsmanNadeem
Copy link
Contributor

…ect successor

Previously the code assumed that the select instruction is defined in a block that is a direct predecessor of the block where the PHINode uses it. So, we were hitting an assertion when we tried to access the def block as an incoming block for the user phi node.

This patch handles that case by using the correct end block and creating a new phi node that aggregates both the values of the select in that end block, and then using that new unfolded phi to overwrite the original user phi node.

Fixes #106083

Change-Id: Ie471994cca232318f74a6e6438efa21e561c2dc0

@llvmbot
Copy link
Member

llvmbot commented Sep 21, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Usman Nadeem (UsmanNadeem)

Changes

…ect successor

Previously the code assumed that the select instruction is defined in a block that is a direct predecessor of the block where the PHINode uses it. So, we were hitting an assertion when we tried to access the def block as an incoming block for the user phi node.

This patch handles that case by using the correct end block and creating a new phi node that aggregates both the values of the select in that end block, and then using that new unfolded phi to overwrite the original user phi node.

Fixes #106083

Change-Id: Ie471994cca232318f74a6e6438efa21e561c2dc0


Full diff: https://github.com/llvm/llvm-project/pull/109511.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp (+28-16)
  • (modified) llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll (+234)
diff --git a/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp b/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
index ef9c264482a640..c80753baf29081 100644
--- a/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
+++ b/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
@@ -194,7 +194,6 @@ void unfold(DomTreeUpdater *DTU, LoopInfo *LI, SelectInstToUnfold SIToUnfold,
   SelectInst *SI = SIToUnfold.getInst();
   PHINode *SIUse = SIToUnfold.getUse();
   BasicBlock *StartBlock = SI->getParent();
-  BasicBlock *EndBlock = SIUse->getParent();
   BranchInst *StartBlockTerm =
       dyn_cast<BranchInst>(StartBlock->getTerminator());
 
@@ -202,6 +201,7 @@ void unfold(DomTreeUpdater *DTU, LoopInfo *LI, SelectInstToUnfold SIToUnfold,
   assert(SI->hasOneUse());
 
   if (StartBlockTerm->isUnconditional()) {
+    BasicBlock *EndBlock = StartBlock->getUniqueSuccessor();
     // Arbitrarily choose the 'false' side for a new input value to the PHI.
     BasicBlock *NewBlock = BasicBlock::Create(
         SI->getContext(), Twine(SI->getName(), ".si.unfold.false"),
@@ -223,32 +223,44 @@ void unfold(DomTreeUpdater *DTU, LoopInfo *LI, SelectInstToUnfold SIToUnfold,
                                       NewBlock->getFirstInsertionPt());
     NewPhi->addIncoming(SIOp2, StartBlock);
 
-    if (auto *OpSi = dyn_cast<SelectInst>(SIOp1))
-      NewSIsToUnfold->push_back(SelectInstToUnfold(OpSi, SIUse));
-    if (auto *OpSi = dyn_cast<SelectInst>(SIOp2))
-      NewSIsToUnfold->push_back(SelectInstToUnfold(OpSi, NewPhi));
+    // Update any other PHI nodes in EndBlock.
+    for (PHINode &Phi : EndBlock->phis()) {
+      if (SIUse == &Phi)
+        continue;
+      Phi.addIncoming(Phi.getIncomingValueForBlock(StartBlock), NewBlock);
+    }
 
     // Update the phi node of SI.
-    for (unsigned Idx = 0; Idx < SIUse->getNumIncomingValues(); ++Idx) {
-      if (SIUse->getIncomingBlock(Idx) == StartBlock)
-        SIUse->setIncomingValue(Idx, SIOp1);
-    }
-    SIUse->addIncoming(NewPhi, NewBlock);
+    if (EndBlock == SIUse->getParent()) {
+      SIUse->setIncomingValueForBlock(StartBlock, SIOp1);
+      SIUse->addIncoming(NewPhi, NewBlock);
+    } else {
+      PHINode *EndPhi = PHINode::Create(SIUse->getType(), pred_size(EndBlock),
+                                        Twine(SI->getName(), ".si.unfold.phi"),
+                                        EndBlock->getFirstInsertionPt());
+      for (BasicBlock *Pred : predecessors(EndBlock)) {
+        if (Pred != StartBlock && Pred != NewBlock)
+          EndPhi->addIncoming(EndPhi, Pred);
+      }
 
-    // Update any other PHI nodes in EndBlock.
-    for (auto II = EndBlock->begin(); PHINode *Phi = dyn_cast<PHINode>(II);
-         ++II) {
-      if (Phi != SIUse)
-        Phi->addIncoming(Phi->getIncomingValueForBlock(StartBlock), NewBlock);
+      EndPhi->addIncoming(SIOp1, StartBlock);
+      EndPhi->addIncoming(NewPhi, NewBlock);
+      SIUse->replaceUsesOfWith(SI, EndPhi);
+      SIUse = EndPhi;
     }
 
-    StartBlockTerm->eraseFromParent();
+    if (auto *OpSi = dyn_cast<SelectInst>(SIOp1))
+      NewSIsToUnfold->push_back(SelectInstToUnfold(OpSi, SIUse));
+    if (auto *OpSi = dyn_cast<SelectInst>(SIOp2))
+      NewSIsToUnfold->push_back(SelectInstToUnfold(OpSi, NewPhi));
 
     // Insert the real conditional branch based on the original condition.
+    StartBlockTerm->eraseFromParent();
     BranchInst::Create(EndBlock, NewBlock, SI->getCondition(), StartBlock);
     DTU->applyUpdates({{DominatorTree::Insert, StartBlock, EndBlock},
                        {DominatorTree::Insert, StartBlock, NewBlock}});
   } else {
+    BasicBlock *EndBlock = SIUse->getParent();
     BasicBlock *NewBlockT = BasicBlock::Create(
         SI->getContext(), Twine(SI->getName(), ".si.unfold.true"),
         EndBlock->getParent(), EndBlock);
diff --git a/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll b/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
index c38f81d0f046ef..b48e10e8d2a96c 100644
--- a/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
+++ b/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
@@ -300,3 +300,237 @@ define void @self-reference() {
 end:
   ret void
 }
+
+@a = external dso_local global ptr, align 8
+@b = external dso_local global i32, align 4
+@c = external dso_local global i64, align 8
+@d = external dso_local global i16, align 2
+@e = external dso_local global i64, align 8
+@f = external dso_local global i32, align 4
+@g = external dso_local global i64, align 8
+@h = external dso_local global i32, align 4
+@i = external dso_local global ptr, align 8
+
+define void @pr106083_invalidBBarg_fold() {
+; CHECK-LABEL: @pr106083_invalidBBarg_fold(
+; CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr @h, align 4
+; CHECK-NEXT:    [[DOTNOT2:%.*]] = icmp eq i32 [[TMP1]], 0
+; CHECK-NEXT:    [[D_PROMOTED3:%.*]] = load i16, ptr @d, align 1
+; CHECK-NEXT:    br i1 [[DOTNOT2]], label [[BB0:%.*]], label [[DOT_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       BB0.loopexit:
+; CHECK-NEXT:    [[D_PROMOTED41:%.*]] = phi i16 [ [[D_PROMOTED4_JT2:%.*]], [[BB7_JT2:%.*]] ], [ [[D_PROMOTED4:%.*]], [[BB7:%.*]] ]
+; CHECK-NEXT:    br label [[BB0]]
+; CHECK:       ..si.unfold.false:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI:%.*]] = phi i32 [ 6, [[TMP0:%.*]] ]
+; CHECK-NEXT:    br label [[BB0]]
+; CHECK:       BB0:
+; CHECK-NEXT:    [[D_PROMOTED6:%.*]] = phi i16 [ [[D_PROMOTED3]], [[TMP0]] ], [ [[D_PROMOTED41]], [[BB0_LOOPEXIT:%.*]] ], [ [[D_PROMOTED3]], [[DOT_SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    [[DOT_SI_UNFOLD_PHI:%.*]] = phi i32 [ [[DOT_SI_UNFOLD_PHI]], [[BB0_LOOPEXIT]] ], [ 0, [[TMP0]] ], [ [[DOTSI_UNFOLD_PHI]], [[DOT_SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    br label [[BB1_JT2:%.*]]
+; CHECK:       BB1:
+; CHECK-NEXT:    [[D_PROMOTED5:%.*]] = phi i16 [ [[D_PROMOTED4]], [[BB1_BACKEDGE:%.*]] ]
+; CHECK-NEXT:    [[TMP2:%.*]] = phi i16 [ [[TMP10:%.*]], [[BB1_BACKEDGE]] ]
+; CHECK-NEXT:    [[DOT1:%.*]] = phi i32 [ [[DOT3:%.*]], [[BB1_BACKEDGE]] ]
+; CHECK-NEXT:    [[TMP3:%.*]] = load volatile i32, ptr @f, align 4
+; CHECK-NEXT:    [[DOTNOT:%.*]] = icmp eq i32 [[TMP3]], 0
+; CHECK-NEXT:    br i1 [[DOTNOT]], label [[BB7]], label [[BB2:%.*]]
+; CHECK:       BB1.jt2:
+; CHECK-NEXT:    [[D_PROMOTED5_JT2:%.*]] = phi i16 [ [[D_PROMOTED6]], [[BB0]] ]
+; CHECK-NEXT:    [[TMP4:%.*]] = phi i16 [ [[D_PROMOTED6]], [[BB0]] ]
+; CHECK-NEXT:    [[DOT1_JT2:%.*]] = phi i32 [ 2, [[BB0]] ]
+; CHECK-NEXT:    [[TMP5:%.*]] = load volatile i32, ptr @f, align 4
+; CHECK-NEXT:    [[DOTNOT_JT2:%.*]] = icmp eq i32 [[TMP5]], 0
+; CHECK-NEXT:    br i1 [[DOTNOT_JT2]], label [[BB7_JT2]], label [[BB2_JT2:%.*]]
+; CHECK:       BB2:
+; CHECK-NEXT:    [[TMP6:%.*]] = add i16 [[TMP2]], 1
+; CHECK-NEXT:    store i16 [[TMP6]], ptr @d, align 2
+; CHECK-NEXT:    [[TMP7:%.*]] = load volatile i64, ptr @g, align 8
+; CHECK-NEXT:    [[DOTNOT1:%.*]] = icmp eq i64 [[TMP7]], 0
+; CHECK-NEXT:    br i1 [[DOTNOT1]], label [[BB7]], label [[SPEC_SELECT_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       BB2.jt2:
+; CHECK-NEXT:    [[TMP8:%.*]] = add i16 [[TMP4]], 1
+; CHECK-NEXT:    store i16 [[TMP8]], ptr @d, align 2
+; CHECK-NEXT:    [[TMP9:%.*]] = load volatile i64, ptr @g, align 8
+; CHECK-NEXT:    [[DOTNOT1_JT2:%.*]] = icmp eq i64 [[TMP9]], 0
+; CHECK-NEXT:    br i1 [[DOTNOT1_JT2]], label [[BB7]], label [[SPEC_SELECT_SI_UNFOLD_FALSE_JT2:%.*]]
+; CHECK:       spec.select.si.unfold.false:
+; CHECK-NEXT:    [[DOT1_SI_UNFOLD_PHI:%.*]] = phi i32 [ [[DOT1]], [[BB2]] ]
+; CHECK-NEXT:    br label [[BB7]]
+; CHECK:       spec.select.si.unfold.false.jt2:
+; CHECK-NEXT:    [[DOT1_SI_UNFOLD_PHI_JT2:%.*]] = phi i32 [ [[DOT1_JT2]], [[BB2_JT2]] ]
+; CHECK-NEXT:    br label [[BB7_JT2]]
+; CHECK:       BB7:
+; CHECK-NEXT:    [[D_PROMOTED4]] = phi i16 [ [[D_PROMOTED5]], [[BB1:%.*]] ], [ [[TMP6]], [[BB2]] ], [ [[TMP6]], [[SPEC_SELECT_SI_UNFOLD_FALSE]] ], [ [[TMP8]], [[BB2_JT2]] ]
+; CHECK-NEXT:    [[TMP10]] = phi i16 [ [[TMP2]], [[BB1]] ], [ [[TMP6]], [[BB2]] ], [ [[TMP6]], [[SPEC_SELECT_SI_UNFOLD_FALSE]] ], [ [[TMP8]], [[BB2_JT2]] ]
+; CHECK-NEXT:    [[DOT3]] = phi i32 [ [[DOT1]], [[BB1]] ], [ [[DOT_SI_UNFOLD_PHI]], [[BB2]] ], [ [[DOT1_SI_UNFOLD_PHI]], [[SPEC_SELECT_SI_UNFOLD_FALSE]] ], [ [[DOT_SI_UNFOLD_PHI]], [[BB2_JT2]] ]
+; CHECK-NEXT:    switch i32 [[DOT3]], label [[BB9:%.*]] [
+; CHECK-NEXT:      i32 0, label [[BB1_BACKEDGE]]
+; CHECK-NEXT:      i32 7, label [[BB1_BACKEDGE]]
+; CHECK-NEXT:      i32 6, label [[BB8:%.*]]
+; CHECK-NEXT:      i32 2, label [[BB0_LOOPEXIT]]
+; CHECK-NEXT:    ]
+; CHECK:       BB7.jt2:
+; CHECK-NEXT:    [[D_PROMOTED4_JT2]] = phi i16 [ [[D_PROMOTED5_JT2]], [[BB1_JT2]] ], [ [[TMP8]], [[SPEC_SELECT_SI_UNFOLD_FALSE_JT2]] ]
+; CHECK-NEXT:    [[TMP11:%.*]] = phi i16 [ [[TMP4]], [[BB1_JT2]] ], [ [[TMP8]], [[SPEC_SELECT_SI_UNFOLD_FALSE_JT2]] ]
+; CHECK-NEXT:    [[DOT3_JT2:%.*]] = phi i32 [ [[DOT1_JT2]], [[BB1_JT2]] ], [ [[DOT1_SI_UNFOLD_PHI_JT2]], [[SPEC_SELECT_SI_UNFOLD_FALSE_JT2]] ]
+; CHECK-NEXT:    br label [[BB0_LOOPEXIT]]
+; CHECK:       BB1.backedge:
+; CHECK-NEXT:    br label [[BB1]]
+; CHECK:       BB8:
+; CHECK-NEXT:    ret void
+; CHECK:       BB9:
+; CHECK-NEXT:    unreachable
+;
+  %1 = load i32, ptr @h, align 4
+  %.not2 = icmp eq i32 %1, 0
+  %. = select i1 %.not2, i32 0, i32 6
+  %d.promoted3 = load i16, ptr @d, align 1
+  br label %BB0
+
+BB0.loopexit:                                     ; preds = %BB7
+  br label %BB0
+
+BB0:                                              ; preds = %BB0.loopexit, %0
+  %d.promoted6 = phi i16 [ %d.promoted3, %0 ], [ %d.promoted4, %BB0.loopexit ]
+  br label %BB1
+
+BB1:                                              ; preds = %BB1.backedge, %BB0
+  %d.promoted5 = phi i16 [ %d.promoted6, %BB0 ], [ %d.promoted4, %BB1.backedge ]
+  %2 = phi i16 [ %d.promoted6, %BB0 ], [ %6, %BB1.backedge ]
+  %.1 = phi i32 [ 2, %BB0 ], [ %.3, %BB1.backedge ]
+  %3 = load volatile i32, ptr @f, align 4
+  %.not = icmp eq i32 %3, 0
+  br i1 %.not, label %BB7, label %BB2
+
+BB2:                                              ; preds = %BB1
+  %4 = add i16 %2, 1
+  store i16 %4, ptr @d, align 2
+  %5 = load volatile i64, ptr @g, align 8
+  %.not1 = icmp eq i64 %5, 0
+  %spec.select = select i1 %.not1, i32 %., i32 %.1
+  br label %BB7
+
+BB7:                                              ; preds = %BB2, %BB1
+  %d.promoted4 = phi i16 [ %d.promoted5, %BB1 ], [ %4, %BB2 ]
+  %6 = phi i16 [ %2, %BB1 ], [ %4, %BB2 ]
+  %.3 = phi i32 [ %.1, %BB1 ], [ %spec.select, %BB2 ]
+  switch i32 %.3, label %BB9 [
+  i32 0, label %BB1.backedge
+  i32 7, label %BB1.backedge
+  i32 6, label %BB8
+  i32 2, label %BB0.loopexit
+  ]
+
+BB1.backedge:                                     ; preds = %BB7, %BB7
+  br label %BB1
+
+BB8:                                              ; preds = %BB7
+  ret void
+
+BB9:                                              ; preds = %BB7
+  unreachable
+}
+
+
+define void @pr106083_select_dead_uses() {
+; CHECK-LABEL: @pr106083_select_dead_uses(
+; CHECK-NEXT:    [[TMP1:%.*]] = load ptr, ptr @a, align 8
+; CHECK-NEXT:    [[TMP2:%.*]] = load ptr, ptr @i, align 8
+; CHECK-NEXT:    [[TMP3:%.*]] = load i64, ptr @c, align 8
+; CHECK-NEXT:    [[DOTNOT3:%.*]] = icmp eq i64 [[TMP3]], 0
+; CHECK-NEXT:    br i1 [[DOTNOT3]], label [[DOTLOOPEXIT6:%.*]], label [[SPEC_SELECT_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       .loopexit6.loopexit:
+; CHECK-NEXT:    br label [[DOTLOOPEXIT6]]
+; CHECK:       spec.select.si.unfold.false:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI:%.*]] = phi i32 [ 2, [[TMP0:%.*]] ]
+; CHECK-NEXT:    br label [[DOTLOOPEXIT6]]
+; CHECK:       .loopexit6:
+; CHECK-NEXT:    [[SPEC_SELECT_SI_UNFOLD_PHI:%.*]] = phi i32 [ [[SPEC_SELECT_SI_UNFOLD_PHI]], [[DOTLOOPEXIT6_LOOPEXIT:%.*]] ], [ 0, [[TMP0]] ], [ [[DOTSI_UNFOLD_PHI]], [[SPEC_SELECT_SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    br label [[TMP6:%.*]]
+; CHECK:       4:
+; CHECK-NEXT:    [[DOT1:%.*]] = phi i32 [ [[DOT21:%.*]], [[DOTBACKEDGE:%.*]] ]
+; CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[TMP1]], align 4
+; CHECK-NEXT:    [[DOTNOT:%.*]] = icmp eq i32 [[TMP5]], 0
+; CHECK-NEXT:    br i1 [[DOTNOT]], label [[SELECT_UNFOLD_JT0:%.*]], label [[TMP8:%.*]]
+; CHECK:       6:
+; CHECK-NEXT:    [[DOT1_JT2:%.*]] = phi i32 [ 2, [[DOTLOOPEXIT6]] ]
+; CHECK-NEXT:    [[TMP7:%.*]] = load i32, ptr [[TMP1]], align 4
+; CHECK-NEXT:    [[DOTNOT_JT2:%.*]] = icmp eq i32 [[TMP7]], 0
+; CHECK-NEXT:    br i1 [[DOTNOT_JT2]], label [[SELECT_UNFOLD_JT0]], label [[TMP10:%.*]]
+; CHECK:       8:
+; CHECK-NEXT:    [[TMP9:%.*]] = load i32, ptr [[TMP2]], align 4
+; CHECK-NEXT:    [[DOTNOT2:%.*]] = icmp eq i32 [[TMP9]], 0
+; CHECK-NEXT:    br i1 [[DOTNOT2]], label [[SELECT_UNFOLD:%.*]], label [[SPEC_SELECT7_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       10:
+; CHECK-NEXT:    [[TMP11:%.*]] = load i32, ptr [[TMP2]], align 4
+; CHECK-NEXT:    [[DOTNOT2_JT2:%.*]] = icmp eq i32 [[TMP11]], 0
+; CHECK-NEXT:    br i1 [[DOTNOT2_JT2]], label [[SELECT_UNFOLD]], label [[SPEC_SELECT7_SI_UNFOLD_FALSE_JT2:%.*]]
+; CHECK:       spec.select7.si.unfold.false:
+; CHECK-NEXT:    [[DOT1_SI_UNFOLD_PHI:%.*]] = phi i32 [ [[DOT1]], [[TMP8]] ]
+; CHECK-NEXT:    br label [[SELECT_UNFOLD]]
+; CHECK:       spec.select7.si.unfold.false.jt2:
+; CHECK-NEXT:    [[DOT1_SI_UNFOLD_PHI_JT2:%.*]] = phi i32 [ [[DOT1_JT2]], [[TMP10]] ]
+; CHECK-NEXT:    br label [[SELECT_UNFOLD_JT2:%.*]]
+; CHECK:       select.unfold:
+; CHECK-NEXT:    [[DOT2:%.*]] = phi i32 [ [[SPEC_SELECT_SI_UNFOLD_PHI]], [[TMP8]] ], [ [[DOT1_SI_UNFOLD_PHI]], [[SPEC_SELECT7_SI_UNFOLD_FALSE]] ], [ [[SPEC_SELECT_SI_UNFOLD_PHI]], [[TMP10]] ]
+; CHECK-NEXT:    switch i32 [[DOT2]], label [[TMP12:%.*]] [
+; CHECK-NEXT:      i32 0, label [[DOTPREHEADER_PREHEADER:%.*]]
+; CHECK-NEXT:      i32 2, label [[DOTLOOPEXIT6_LOOPEXIT]]
+; CHECK-NEXT:    ]
+; CHECK:       select.unfold.jt2:
+; CHECK-NEXT:    [[DOT2_JT2:%.*]] = phi i32 [ [[DOT1_SI_UNFOLD_PHI_JT2]], [[SPEC_SELECT7_SI_UNFOLD_FALSE_JT2]] ]
+; CHECK-NEXT:    br label [[DOTLOOPEXIT6_LOOPEXIT]]
+; CHECK:       select.unfold.jt0:
+; CHECK-NEXT:    [[DOT2_JT0:%.*]] = phi i32 [ 0, [[TMP4:%.*]] ], [ 0, [[TMP6]] ]
+; CHECK-NEXT:    br label [[DOTPREHEADER_PREHEADER]]
+; CHECK:       .preheader.preheader:
+; CHECK-NEXT:    [[DOT21]] = phi i32 [ [[DOT2_JT0]], [[SELECT_UNFOLD_JT0]] ], [ [[DOT2]], [[SELECT_UNFOLD]] ]
+; CHECK-NEXT:    store i32 0, ptr @b, align 4
+; CHECK-NEXT:    br label [[DOTBACKEDGE]]
+; CHECK:       .backedge:
+; CHECK-NEXT:    br label [[TMP4]]
+; CHECK:       12:
+; CHECK-NEXT:    unreachable
+;
+  %1 = load ptr, ptr @a, align 8
+  %2 = load ptr, ptr @i, align 8
+  %3 = load i64, ptr @c, align 8
+  %.not3 = icmp eq i64 %3, 0
+  %spec.select = select i1 %.not3, i32 0, i32 2
+  br label %.loopexit6
+
+.loopexit6.loopexit:                              ; preds = %select.unfold
+  br label %.loopexit6
+
+.loopexit6:                                       ; preds = %.loopexit6.loopexit, %0
+  br label %4
+
+4:                                                ; preds = %.backedge, %.loopexit6
+  %.1 = phi i32 [ 2, %.loopexit6 ], [ %.2, %.backedge ]
+  %5 = load i32, ptr %1, align 4
+  %.not = icmp eq i32 %5, 0
+  br i1 %.not, label %select.unfold, label %6
+
+6:                                                ; preds = %4
+  %7 = load i32, ptr %2, align 4
+  %.not2 = icmp eq i32 %7, 0
+  %spec.select7 = select i1 %.not2, i32 %spec.select, i32 %.1
+  br label %select.unfold
+
+select.unfold:                                    ; preds = %6, %4
+  %.2 = phi i32 [ 0, %4 ], [ %spec.select7, %6 ]
+  switch i32 %.2, label %8 [
+  i32 0, label %.preheader.preheader
+  i32 2, label %.loopexit6.loopexit
+  ]
+
+.preheader.preheader:                             ; preds = %select.unfold
+  store i32 0, ptr @b, align 4
+  br label %.backedge
+
+.backedge:                                        ; preds = %.preheader.preheader
+  br label %4
+
+8:                                                ; preds = %select.unfold
+  unreachable
+}

@JonPsson1
Copy link
Contributor

@XChy @ollef Maybe you could take a look and review as you seem to have some familiarity with this?

@UsmanNadeem UsmanNadeem requested a review from XChy September 22, 2024 20:57
@mikaelholmen
Copy link
Collaborator

I've confirmed that this patch makes the crash I reported in
#96127 (comment)
go away.

The one I reported in
#96127 (comment)
still crashes with the patch.

…ect successor

Previously the code assumed that the select instruction is defined
in a block that is a direct predecessor of the block where the PHINode
uses it. So, we were hitting an assertion when we tried to access the
def block as an incoming block for the user phi node.

This patch handles that case by using the correct end block
and creating a new phi node that aggregates both the values of the
select in that end block, and then using that new unfolded phi to
overwrite the original user phi node.

Fixes llvm#106083

Change-Id: Ie471994cca232318f74a6e6438efa21e561c2dc0
@UsmanNadeem
Copy link
Contributor Author

@mikaelholmen should be fixed now. Can you please verify?
Also, which applications did you compile to uncover these bugs?

Change-Id: I847df2d1e22aebee0a79ffc8f7d57271a0028f65
@UsmanNadeem UsmanNadeem requested a review from XChy September 23, 2024 23:31
@mikaelholmen
Copy link
Collaborator

@mikaelholmen should be fixed now. Can you please verify? Also, which applications did you compile to uncover these bugs?

Hi!
Yes, now the patch fixes both crashes I saw.
The original C programs that exposed the crashes were generated with Csmith.

Copy link
Member

@XChy XChy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@UsmanNadeem UsmanNadeem merged commit d4a38c8 into llvm:main Sep 24, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DFAJumpThreading] crash since b167ada
5 participants