
[ValueTracking] Make the MaxAnalysisRecursionDepth overridable #137721


Open
wants to merge 3 commits into main

Conversation

jrbyrnes
Contributor

@jrbyrnes jrbyrnes commented Apr 28, 2025

In some cases, the recursion depth limit blocks optimizations. In the general case we may be willing to accept slightly less optimal code in order to achieve reasonable compile times, but in some individual cases the degradation in the quality of the generated code may be unacceptable. I therefore think it makes sense to make this depth limit a bit more flexible. I have opened a corresponding RFC to gather feedback: https://discourse.llvm.org/t/rfc-computeknownbits-recursion-depth/85962

This PR adds a variant of the computeKnownBits method that does exhaustive recursion (with tracking to avoid infinite recursion); clients can use it when they feel the penalty of a missed optimization outweighs the cost of the additional search.
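The kind of precision loss a fixed cap causes can be seen in a small self-contained toy. Everything below is invented for illustration (it is not the LLVM API); the limit of 6 mirrors `MaxAnalysisRecursionDepth`:

```cpp
#include <cassert>
#include <vector>

// Toy IR node: each inner node models `shl %op, 1`; a leaf is an unknown value.
struct Node {
  const Node *Op = nullptr; // nullptr marks a leaf
};

// Known trailing-zero count with a recursion cap: once Depth reaches Limit we
// stop looking through operands, so chains deeper than the cap lose bits.
unsigned trailingZerosCapped(const Node *N, unsigned Depth, unsigned Limit) {
  if (!N->Op || Depth == Limit)
    return 0; // unknown below this point
  return 1 + trailingZerosCapped(N->Op, Depth + 1, Limit);
}

// Exhaustive variant: full precision on acyclic chains, but unbounded work.
unsigned trailingZerosExhaustive(const Node *N) {
  return N->Op ? 1 + trailingZerosExhaustive(N->Op) : 0;
}
```

On a 10-deep chain of shifts, the capped walk with limit 6 proves only 6 trailing zero bits while the exhaustive walk proves all 10, which is the kind of gap this PR is about.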

@llvmbot
Member

llvmbot commented Apr 28, 2025

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-globalisel

Author: Jeffrey Byrnes (jrbyrnes)

Changes


This particular PR just makes the limit overridable via a flag. However, I think it makes sense, as a follow-up PR, to add a method that does exhaustive recursion, which clients can use if they feel the penalty of a missed optimization outweighs the additional search.

I'm still looking into creating tests for the different paths to getAnalysisRecursionDepthLimit, but I thought I'd open this PR in case there is any strong pushback.


Patch is 27.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/137721.diff

10 Files Affected:

  • (modified) llvm/include/llvm/Analysis/ValueTracking.h (+2)
  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+40-30)
  • (modified) llvm/lib/CodeGen/GlobalISel/Utils.cpp (+1-1)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+1-1)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp (+1-1)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+3-3)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp (+6-5)
  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+5-4)
  • (modified) llvm/lib/Transforms/Scalar/ConstraintElimination.cpp (+6-3)
  • (added) llvm/test/Transforms/InstCombine/simplifydemanded-depth.ll (+80)
diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index f927838c843ac..a6f756d370c06 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -44,6 +44,8 @@ template <typename T> class ArrayRef;
 
 constexpr unsigned MaxAnalysisRecursionDepth = 6;
 
+unsigned getAnalysisRecursionDepthLimit();
+
 /// Determine which bits of V are known to be either zero or one and return
 /// them in the KnownZero/KnownOne bit sets.
 ///
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 1d3f8b7207a63..f1bd1ec5376df 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -88,6 +88,8 @@ using namespace llvm::PatternMatch;
 static cl::opt<unsigned> DomConditionsMaxUses("dom-conditions-max-uses",
                                               cl::Hidden, cl::init(20));
 
+static cl::opt<bool> ExhaustiveRecursion("exhaustive-analysis-recursion",
+                                         cl::Hidden);
 
 /// Returns the bitwidth of the given scalar or pointer type. For vector types,
 /// returns the element type's bitwidth.
@@ -129,6 +131,12 @@ static bool getShuffleDemandedElts(const ShuffleVectorInst *Shuf,
                                       DemandedElts, DemandedLHS, DemandedRHS);
 }
 
+unsigned llvm::getAnalysisRecursionDepthLimit() {
+  if (!ExhaustiveRecursion.getNumOccurrences() || !ExhaustiveRecursion)
+    return MaxAnalysisRecursionDepth;
+  return -1;
+}
+
 static void computeKnownBits(const Value *V, const APInt &DemandedElts,
                              KnownBits &Known, unsigned Depth,
                              const SimplifyQuery &Q);
@@ -793,7 +801,7 @@ static void computeKnownBitsFromCond(const Value *V, Value *Cond,
                                      KnownBits &Known, unsigned Depth,
                                      const SimplifyQuery &SQ, bool Invert) {
   Value *A, *B;
-  if (Depth < MaxAnalysisRecursionDepth &&
+  if (Depth < getAnalysisRecursionDepthLimit() &&
       match(Cond, m_LogicalOp(m_Value(A), m_Value(B)))) {
     KnownBits Known2(Known.getBitWidth());
     KnownBits Known3(Known.getBitWidth());
@@ -828,7 +836,8 @@ static void computeKnownBitsFromCond(const Value *V, Value *Cond,
     return;
   }
 
-  if (Depth < MaxAnalysisRecursionDepth && match(Cond, m_Not(m_Value(A))))
+  if (Depth < getAnalysisRecursionDepthLimit() &&
+      match(Cond, m_Not(m_Value(A))))
     computeKnownBitsFromCond(V, A, Known, Depth + 1, SQ, !Invert);
 }
 
@@ -922,7 +931,7 @@ void llvm::computeKnownBitsFromContext(const Value *V, KnownBits &Known,
     }
 
     // The remaining tests are all recursive, so bail out if we hit the limit.
-    if (Depth == MaxAnalysisRecursionDepth)
+    if (Depth == getAnalysisRecursionDepthLimit())
       continue;
 
     ICmpInst *Cmp = dyn_cast<ICmpInst>(Arg);
@@ -1690,7 +1699,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
 
     // Otherwise take the unions of the known bit sets of the operands,
     // taking conservative care to avoid excessive recursion.
-    if (Depth < MaxAnalysisRecursionDepth - 1 && Known.isUnknown()) {
+    if (Depth < getAnalysisRecursionDepthLimit() - 1 && Known.isUnknown()) {
       // Skip if every incoming value references to ourself.
       if (isa_and_nonnull<UndefValue>(P->hasConstantValue()))
         break;
@@ -1719,7 +1728,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
         // TODO: See if we can base recursion limiter on number of incoming phi
         // edges so we don't overly clamp analysis.
         computeKnownBits(IncValue, DemandedElts, Known2,
-                         MaxAnalysisRecursionDepth - 1, RecQ);
+                         getAnalysisRecursionDepthLimit() - 1, RecQ);
 
         // See if we can further use a conditional branch into the phi
         // to help us determine the range of the value.
@@ -2188,7 +2197,7 @@ void computeKnownBits(const Value *V, const APInt &DemandedElts,
   }
 
   assert(V && "No Value?");
-  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  assert(Depth <= getAnalysisRecursionDepthLimit() && "Limit Search Depth");
 
 #ifndef NDEBUG
   Type *Ty = V->getType();
@@ -2287,7 +2296,7 @@ void computeKnownBits(const Value *V, const APInt &DemandedElts,
       Known = Range->toKnownBits();
 
   // All recursive calls that increase depth must come after this.
-  if (Depth == MaxAnalysisRecursionDepth)
+  if (Depth == getAnalysisRecursionDepthLimit())
     return;
 
   // A weak GlobalAlias is totally unknown. A non-weak GlobalAlias has
@@ -2400,7 +2409,7 @@ static bool isImpliedToBeAPowerOfTwoFromCond(const Value *V, bool OrZero,
 /// types and vectors of integers.
 bool llvm::isKnownToBeAPowerOfTwo(const Value *V, bool OrZero, unsigned Depth,
                                   const SimplifyQuery &Q) {
-  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  assert(Depth <= getAnalysisRecursionDepthLimit() && "Limit Search Depth");
 
   if (isa<Constant>(V))
     return OrZero ? match(V, m_Power2OrZero()) : match(V, m_Power2());
@@ -2462,7 +2471,7 @@ bool llvm::isKnownToBeAPowerOfTwo(const Value *V, bool OrZero, unsigned Depth,
     return true;
 
   // The remaining tests are all recursive, so bail out if we hit the limit.
-  if (Depth++ == MaxAnalysisRecursionDepth)
+  if (Depth++ == getAnalysisRecursionDepthLimit())
     return false;
 
   switch (I->getOpcode()) {
@@ -2550,7 +2559,7 @@ bool llvm::isKnownToBeAPowerOfTwo(const Value *V, bool OrZero, unsigned Depth,
 
     // Recursively check all incoming values. Limit recursion to 2 levels, so
     // that search complexity is limited to number of operands^2.
-    unsigned NewDepth = std::max(Depth, MaxAnalysisRecursionDepth - 1);
+    unsigned NewDepth = std::max(Depth, getAnalysisRecursionDepthLimit() - 1);
     return llvm::all_of(PN->operands(), [&](const Use &U) {
       // Value is power of 2 if it is coming from PHI node itself by induction.
       if (U.get() == PN)
@@ -2654,7 +2663,7 @@ static bool isGEPKnownNonNull(const GEPOperator *GEP, unsigned Depth,
     // to recurse 10k times just because we have 10k GEP operands. We don't
     // bail completely out because we want to handle constant GEPs regardless
     // of depth.
-    if (Depth++ >= MaxAnalysisRecursionDepth)
+    if (Depth++ >= getAnalysisRecursionDepthLimit())
       continue;
 
     if (isKnownNonZero(GTI.getOperand(), Q, Depth))
@@ -3158,7 +3167,7 @@ static bool isKnownNonZeroFromOperator(const Operator *I,
 
     // Check if all incoming values are non-zero using recursion.
     SimplifyQuery RecQ = Q.getWithoutCondContext();
-    unsigned NewDepth = std::max(Depth, MaxAnalysisRecursionDepth - 1);
+    unsigned NewDepth = std::max(Depth, getAnalysisRecursionDepthLimit() - 1);
     return llvm::all_of(PN->operands(), [&](const Use &U) {
       if (U.get() == PN)
         return true;
@@ -3424,7 +3433,7 @@ bool isKnownNonZero(const Value *V, const APInt &DemandedElts,
   Type *Ty = V->getType();
 
 #ifndef NDEBUG
-  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  assert(Depth <= getAnalysisRecursionDepthLimit() && "Limit Search Depth");
 
   if (auto *FVTy = dyn_cast<FixedVectorType>(Ty)) {
     assert(
@@ -3487,7 +3496,7 @@ bool isKnownNonZero(const Value *V, const APInt &DemandedElts,
     return true;
 
   // Some of the tests below are recursive, so bail out if we hit the limit.
-  if (Depth++ >= MaxAnalysisRecursionDepth)
+  if (Depth++ >= getAnalysisRecursionDepthLimit())
     return false;
 
   // Check for pointer simplifications.
@@ -3871,7 +3880,7 @@ static bool isKnownNonEqual(const Value *V1, const Value *V2,
     // We can't look through casts yet.
     return false;
 
-  if (Depth >= MaxAnalysisRecursionDepth)
+  if (Depth >= getAnalysisRecursionDepthLimit())
     return false;
 
   // See if we can recurse through (exactly one of) our operands.  This
@@ -3988,7 +3997,7 @@ static unsigned ComputeNumSignBitsImpl(const Value *V,
                                        unsigned Depth, const SimplifyQuery &Q) {
   Type *Ty = V->getType();
 #ifndef NDEBUG
-  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  assert(Depth <= getAnalysisRecursionDepthLimit() && "Limit Search Depth");
 
   if (auto *FVTy = dyn_cast<FixedVectorType>(Ty)) {
     assert(
@@ -4015,7 +4024,7 @@ static unsigned ComputeNumSignBitsImpl(const Value *V,
   // Note that ConstantInt is handled by the general computeKnownBits case
   // below.
 
-  if (Depth == MaxAnalysisRecursionDepth)
+  if (Depth == getAnalysisRecursionDepthLimit())
     return 1;
 
   if (auto *U = dyn_cast<Operator>(V)) {
@@ -4970,7 +4979,7 @@ static void computeKnownFPClassFromCond(const Value *V, Value *Cond,
                                         const Instruction *CxtI,
                                         KnownFPClass &KnownFromContext) {
   Value *A, *B;
-  if (Depth < MaxAnalysisRecursionDepth &&
+  if (Depth < getAnalysisRecursionDepthLimit() &&
       (CondIsTrue ? match(Cond, m_LogicalAnd(m_Value(A), m_Value(B)))
                   : match(Cond, m_LogicalOr(m_Value(A), m_Value(B))))) {
     computeKnownFPClassFromCond(V, A, Depth + 1, CondIsTrue, CxtI,
@@ -4979,7 +4988,8 @@ static void computeKnownFPClassFromCond(const Value *V, Value *Cond,
                                 KnownFromContext);
     return;
   }
-  if (Depth < MaxAnalysisRecursionDepth && match(Cond, m_Not(m_Value(A)))) {
+  if (Depth < getAnalysisRecursionDepthLimit() &&
+      match(Cond, m_Not(m_Value(A)))) {
     computeKnownFPClassFromCond(V, A, Depth + 1, !CondIsTrue, CxtI,
                                 KnownFromContext);
     return;
@@ -5106,7 +5116,7 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
     return;
   }
 
-  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  assert(Depth <= getAnalysisRecursionDepthLimit() && "Limit Search Depth");
 
   if (auto *CFP = dyn_cast<ConstantFP>(V)) {
     Known.KnownFPClasses = CFP->getValueAPF().classify();
@@ -5200,7 +5210,7 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
     return;
 
   // All recursive calls that increase depth must come after this.
-  if (Depth == MaxAnalysisRecursionDepth)
+  if (Depth == getAnalysisRecursionDepthLimit())
     return;
 
   const unsigned Opc = Op->getOpcode();
@@ -6144,7 +6154,7 @@ void computeKnownFPClass(const Value *V, const APInt &DemandedElts,
 
     // Otherwise take the unions of the known bit sets of the operands,
     // taking conservative care to avoid excessive recursion.
-    const unsigned PhiRecursionLimit = MaxAnalysisRecursionDepth - 2;
+    const unsigned PhiRecursionLimit = getAnalysisRecursionDepthLimit() - 2;
 
     if (Depth < PhiRecursionLimit) {
       // Skip if every incoming value references to ourself.
@@ -7857,7 +7867,7 @@ static bool programUndefinedIfUndefOrPoison(const Value *V, bool PoisonOnly);
 static bool isGuaranteedNotToBeUndefOrPoison(
     const Value *V, AssumptionCache *AC, const Instruction *CtxI,
     const DominatorTree *DT, unsigned Depth, UndefPoisonKind Kind) {
-  if (Depth >= MaxAnalysisRecursionDepth)
+  if (Depth >= getAnalysisRecursionDepthLimit())
     return false;
 
   if (isa<MetadataAsValue>(V))
@@ -9193,7 +9203,7 @@ static Value *lookThroughCast(CmpInst *CmpI, Value *V1, Value *V2,
 SelectPatternResult llvm::matchSelectPattern(Value *V, Value *&LHS, Value *&RHS,
                                              Instruction::CastOps *CastOp,
                                              unsigned Depth) {
-  if (Depth >= MaxAnalysisRecursionDepth)
+  if (Depth >= getAnalysisRecursionDepthLimit())
     return {SPF_UNKNOWN, SPNB_NA, false};
 
   SelectInst *SI = dyn_cast<SelectInst>(V);
@@ -9613,10 +9623,10 @@ isImpliedCondICmps(const ICmpInst *LHS, CmpPredicate RPred, const Value *R0,
     // C1` (see discussion: D58633).
     ConstantRange LCR = computeConstantRange(
         L1, ICmpInst::isSigned(LPred), /* UseInstrInfo=*/true, /*AC=*/nullptr,
-        /*CxtI=*/nullptr, /*DT=*/nullptr, MaxAnalysisRecursionDepth - 1);
+        /*CxtI=*/nullptr, /*DT=*/nullptr, getAnalysisRecursionDepthLimit() - 1);
     ConstantRange RCR = computeConstantRange(
         R1, ICmpInst::isSigned(RPred), /* UseInstrInfo=*/true, /*AC=*/nullptr,
-        /*CxtI=*/nullptr, /*DT=*/nullptr, MaxAnalysisRecursionDepth - 1);
+        /*CxtI=*/nullptr, /*DT=*/nullptr, getAnalysisRecursionDepthLimit() - 1);
     // Even if L1/R1 are not both constant, we can still sometimes deduce
     // relationship from a single constant. For example X u> Y implies X != 0.
     if (auto R = isImpliedCondCommonOperandWithCR(LPred, LCR, RPred, RCR))
@@ -9681,7 +9691,7 @@ isImpliedCondAndOr(const Instruction *LHS, CmpPredicate RHSPred,
           LHS->getOpcode() == Instruction::Select) &&
          "Expected LHS to be 'and', 'or', or 'select'.");
 
-  assert(Depth <= MaxAnalysisRecursionDepth && "Hit recursion limit");
+  assert(Depth <= getAnalysisRecursionDepthLimit() && "Hit recursion limit");
 
   // If the result of an 'or' is false, then we know both legs of the 'or' are
   // false.  Similarly, if the result of an 'and' is true, then we know both
@@ -9706,7 +9716,7 @@ llvm::isImpliedCondition(const Value *LHS, CmpPredicate RHSPred,
                          const Value *RHSOp0, const Value *RHSOp1,
                          const DataLayout &DL, bool LHSIsTrue, unsigned Depth) {
   // Bail out when we hit the limit.
-  if (Depth == MaxAnalysisRecursionDepth)
+  if (Depth == getAnalysisRecursionDepthLimit())
     return std::nullopt;
 
   // A mismatch occurs when we compare a scalar cmp to a vector cmp, for
@@ -9762,7 +9772,7 @@ std::optional<bool> llvm::isImpliedCondition(const Value *LHS, const Value *RHS,
     return std::nullopt;
   }
 
-  if (Depth == MaxAnalysisRecursionDepth)
+  if (Depth == getAnalysisRecursionDepthLimit())
     return std::nullopt;
 
   // LHS ==> (RHS1 || RHS2) if LHS ==> RHS1 or LHS ==> RHS2
@@ -10224,7 +10234,7 @@ ConstantRange llvm::computeConstantRange(const Value *V, bool ForSigned,
                                          unsigned Depth) {
   assert(V->getType()->isIntOrIntVectorTy() && "Expected integer instruction");
 
-  if (Depth == MaxAnalysisRecursionDepth)
+  if (Depth == getAnalysisRecursionDepthLimit())
     return ConstantRange::getFull(V->getType()->getScalarSizeInBits());
 
   if (auto *C = dyn_cast<Constant>(V))
diff --git a/llvm/lib/CodeGen/GlobalISel/Utils.cpp b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
index d8cc86b34a819..5bca26886b6e4 100644
--- a/llvm/lib/CodeGen/GlobalISel/Utils.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
@@ -1907,7 +1907,7 @@ static bool isGuaranteedNotToBeUndefOrPoison(Register Reg,
                                              const MachineRegisterInfo &MRI,
                                              unsigned Depth,
                                              UndefPoisonKind Kind) {
-  if (Depth >= MaxAnalysisRecursionDepth)
+  if (Depth >= getAnalysisRecursionDepthLimit())
     return false;
 
   MachineInstr *RegDef = MRI.getVRegDef(Reg);
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index b7b0bb7361359..a377440742650 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -4371,7 +4371,7 @@ static bool isMaskOrZero(const Value *V, bool Not, const SimplifyQuery &Q,
     return true;
   if (V->getType()->getScalarSizeInBits() == 1)
     return true;
-  if (Depth++ >= MaxAnalysisRecursionDepth)
+  if (Depth++ >= getAnalysisRecursionDepthLimit())
     return false;
   Value *X;
   const Instruction *I = dyn_cast<Instruction>(V);
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
index c7023eb79b04e..333ecff80d64b 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
@@ -1534,7 +1534,7 @@ Value *InstCombinerImpl::takeLog2(Value *Op, unsigned Depth, bool AssumeNonZero,
     });
 
   // The remaining tests are all recursive, so bail out if we hit the limit.
-  if (Depth++ == MaxAnalysisRecursionDepth)
+  if (Depth++ == getAnalysisRecursionDepthLimit())
     return nullptr;
 
   // log2(zext X) -> zext log2(X)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 4bba2f406b4c1..b47106042c2a8 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -3629,7 +3629,7 @@ static bool matchFMulByZeroIfResultEqZero(InstCombinerImpl &IC, Value *Cmp0,
 /// select condition.
 static bool hasAffectedValue(Value *V, SmallPtrSetImpl<Value *> &Affected,
                              unsigned Depth) {
-  if (Depth == MaxAnalysisRecursionDepth)
+  if (Depth == getAnalysisRecursionDepthLimit())
     return false;
 
   // Ignore the case where the select arm itself is affected. These cases
@@ -3639,9 +3639,9 @@ static bool hasAffectedValue(Value *V, SmallPtrSetImpl<Value *> &Affected,
 
   if (auto *I = dyn_cast<Instruction>(V)) {
     if (isa<PHINode>(I)) {
-      if (Depth == MaxAnalysisRecursionDepth - 1)
+      if (Depth == getAnalysisRecursionDepthLimit() - 1)
         return false;
-      Depth = MaxAnalysisRecursionDepth - 2;
+      Depth = getAnalysisRecursionDepthLimit() - 2;
     }
     return any_of(I->operands(), [&](Value *Op) {
       return Op->getType()->isIntOrIntVectorTy() &&
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
index 2c8939b5a0514..f1b65aaf241f9 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
@@ -116,7 +116,7 @@ bool InstCombinerImpl::SimplifyDemandedBits(Instruction *I, unsigned OpNo,
     return false;
   }
 
-  if (Depth == MaxAnalysisRecursionDepth)
+  if (Depth == getAnalysisRecursionDepthLimit())
     return false;
 
   Value *NewVal;
@@ -166,7 +166,7 @@ Value *InstCombinerImpl::SimplifyDemandedUseBits(Instruction *I,
                                                  unsigned Depth,
                                                  const SimplifyQuery &Q) {
   assert(I != nullptr && "Null pointer of Value???");
-  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  assert(Depth <= getAnalysisRecursionDepthLimit() && "Limit Search Depth");
   uint32_t BitWidth = DemandedMask.getBitWidth();
   Type *VTy = I->getType();
   assert(
@@ -1450,7 +1450,8 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V,
   }
 
   // Limit search depth.
-  if (Depth == SimplifyDemandedVectorEltsDepthLimit)
+  if (Depth == SimplifyDemandedVectorEltsDepthLimit &&
+      Depth >= getAnalysisRecursionDepthLimit())
     return nullptr;
 
   if (!AllowMultipleUsers) {
@@ -1962,7 +1963,7 @@ static Constant *getFPClassConstant(Type *Ty, FPClassTest Mask) {
 Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
     Value *V, const FPClassTest DemandedMask, KnownFPClass &Known,
     unsigned Depth, Instruction *CxtI) {
-  assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");
+  assert(Depth <= getAnalysisRecursionDepthLimit() && "Limit Search Depth");
   Type *VTy = V->getType();
 
   assert(Known == KnownFPClass() && "expected uninitialized state");
@@ -1970,7 +1971,7 @@ Value *InstCombinerImpl::SimplifyDemandedUseFPClass(
   if (DemandedMask == fcNone)
     return isa<UndefValue>(V) ? nullptr : PoisonValue::get(VTy);
 
-  if (Depth == MaxAnalysisRecursionDepth)
+  if (Depth == getAnalysisRecursionDepthLimit())
     return nullptr;
 
   Instruction *I = dyn_cast<Instruction>(V);
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index f807f5f4519fc..b01124f5a649c 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -2587,7 +2587,7 @@ Value *InstCombiner::getFreelyInvertedImpl(Value *V, bool WillInvertAllUses,
   if (match(V, m_ImmConstant(C)))
...
[truncated]

Contributor

@nikic nikic left a comment


Without commenting on anything else, it's not possible to make the limit "unlimited", because (even though this is not its primary purpose) the depth limit is also load bearing to avoid infinite recursion for unreachable IR that contains cycles.
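This hazard can be made concrete with hand-written IR (this fragment is illustrative, not taken from the PR). The verifier relaxes SSA dominance checks inside unreachable blocks, so a value there can transitively depend on itself, and a recursive operand walk with no depth cap would never terminate on it:

```llvm
define i32 @f() {
entry:
  ret i32 0

dead:                ; unreachable, so %a may use %b before %b is defined
  %a = add i32 %b, 1
  %b = add i32 %a, 1
  br label %dead
}
```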

@@ -793,7 +801,7 @@ static void computeKnownBitsFromCond(const Value *V, Value *Cond,
KnownBits &Known, unsigned Depth,
const SimplifyQuery &SQ, bool Invert) {
Value *A, *B;
if (Depth < MaxAnalysisRecursionDepth &&
if (Depth < getAnalysisRecursionDepthLimit() &&
Collaborator


Avoid calling this helper function multiple times inside a function. Compute it once and then reuse the result. This PR contains many such instances.

Contributor Author


Oops -- still need to address this.

@jrbyrnes
Contributor Author

jrbyrnes commented May 5, 2025

To support the request:

should be a local pass parameter if configurable

I am thinking of changing the computeKnownBits methods to provide an override flag.

By doing so, I think we can also address

the depth limit is also load bearing to avoid infinite recursion for unreachable IR that contains cycles

I think the infinite recursion can only be triggered through PHI nodes / calls, so we can just pessimistically never override the depth limit in those cases.

Contributor

@artagnon artagnon left a comment


I think it would be preferable if we could count Depth downwards instead of upwards, as many of the APIs already have a depth parameter. Then we can pass the max depth and count down to 0, instead of passing 0 and relying on a hard-coded maximum. I'll work on a patch for this shortly. This would also match the way InstSimplify counts wrt its Depth parameter.
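The down-counting convention being proposed can be sketched as follows (the struct and signature are hypothetical, not LLVM's; only the calling convention is the point):

```cpp
#include <cassert>

// Down-counting sketch: the caller passes the remaining budget, and hitting 0
// means "stop recursing", so the walk itself never consults a global limit.
struct Expr {
  const Expr *Op = nullptr; // single operand; nullptr marks a leaf
};

// Counts how many operand links the analysis may still look through.
unsigned analyzableDepth(const Expr *E, unsigned DepthRemaining) {
  if (!E->Op || DepthRemaining == 0)
    return 0;
  return 1 + analyzableDepth(E->Op, DepthRemaining - 1);
}
```

Only the top-level call site would pass the maximum, e.g. `analyzableDepth(Root, 6)`, instead of every recursive step comparing against a hard-coded constant.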

@jrbyrnes
Contributor Author

jrbyrnes commented Jun 2, 2025

I think it would be preferable if we could count Depth downwards instead of upwards, as many of the APIs already have a depth parameter. Then we can pass the max depth and count down to 0, instead of passing 0 and relying on a hard-coded maximum. I'll work on a patch for this shortly. This would also match the way InstSimplify counts wrt its Depth parameter.

Sounds like you're planning on changing the implementation such that clients of computeKnownBits should specify the maximum recursion depth. Wouldn't this create an explosion of tunable parameters? Is #142384 the start of this effort?

I'm trying to gauge whether or not I should abandon this PR.

@artagnon
Contributor

artagnon commented Jun 2, 2025

Sounds like you're planning on changing the implementation such that clients of computeKnownBits should specify the maximum recursion depth. Wouldn't this create an explosion of tunable parameters? Is #142384 the start of this effort?

Actually, no: the plan is to strip the parameter altogether, by fixing the underlying issue with a fresh KnownBitsAnalysis with an invalidation strategy. It's going to be a major investment, but I think it will solve all our issues once and for all. #142384 is the start of this effort. I thought about the problem some more after my previous comment.

I'm trying to gauge whether or not I should abandon this PR.

I think so. The Depth parameter is a hack, and we're crippled by its limitations. Rather than an ad-hoc solution, my proposal is to fix it once and for all.

@jrbyrnes
Contributor Author

jrbyrnes commented Jun 2, 2025

Okay great -- glad to see some momentum on this. FYI, please see: https://discourse.llvm.org/t/rfc-computeknownbits-recursion-depth/85962 . The KnownBits caching + invalidation has been discussed there.

It's going to be a major investment

For this reason, I don't think we can abandon this PR, at least for our purposes. I assume this implies a timeline of 6-12 months, and I think we cannot live with exposure to this issue for that long. So I will continue working on a solution to provide some, potentially temporary, relief.

jrbyrnes added 2 commits June 3, 2025 13:59
Change-Id: Id715df21631213e3ac77bf6d24a41375dab194b9
Change-Id: Icf0dac8c87812aa8edff53a1d5cdd664ab6d6d12
@jrbyrnes
Contributor Author

jrbyrnes commented Jun 4, 2025

Reworked the API such that the exhaustive analysis is now a separate utility rather than a cl::opt. I've also added a tracking mechanism to detect infinite loops and exit. This is done via a DenseMap of instructions, which has some overhead; however, the underlying principle of the exhaustive analysis is to pay more computation for an increased chance of optimization.

I've included the use case and a test as well.

I'm still working on testing, but I am sort of settled on the implementation and am curious if there are any concerns.
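The tracking mechanism described above can be sketched like this, with `std::unordered_set` standing in for `llvm::DenseMap`; all names here are invented for illustration and this is not the patch's actual code:

```cpp
#include <cassert>
#include <unordered_set>

// Toy instruction: a single operand; nullptr marks a leaf.
struct Inst {
  const Inst *Op = nullptr;
};

// Exhaustive walk guarded by a visited set instead of a fixed depth:
// revisiting a node means we are in a cycle, so we bail out conservatively
// on that path rather than recurse forever.
unsigned knownDepthTracked(const Inst *I,
                           std::unordered_set<const Inst *> &Visited) {
  if (!I->Op)
    return 0;
  if (!Visited.insert(I).second)
    return 0; // cycle detected: give up on this path
  return 1 + knownDepthTracked(I->Op, Visited);
}
```

The set insertion is the overhead being traded away: every visited node pays a hash-table probe, but the walk terminates even on the cyclic unreachable-IR case the depth cap used to guard against.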

Change-Id: I26df29185e96cf81769107c759513dfbb2d33995
const Instruction *CxtI,
const DominatorTree *DT,
bool UseInstrInfo) {
DepthLimit::setOverrideDepthLimit();
Contributor


This shouldn't be some side-effecting state

@@ -927,7 +938,7 @@ void llvm::computeKnownBitsFromContext(const Value *V, KnownBits &Known,
}

// The remaining tests are all recursive, so bail out if we hit the limit.
if (Depth == MaxAnalysisRecursionDepth)
if (Depth == DepthLimit::getMaxRecursionDepth())
Contributor


I thought this used to start at the max depth and count down to 0. Can you do that and avoid needing to track some global configuration?

6 participants