
[analyzer] Workaround for slowdown spikes (unintended scope increase) #136720


Merged (7 commits) on May 12, 2025

Conversation

@NagyDonat (Contributor) commented Apr 22, 2025

Recently some users reported that they observed large increases of runtime (up to +600% on some translation units) when they upgraded to a more recent (slightly patched, internal) clang version. Bisection revealed that the bulk of this increase was probably caused by my earlier commit bb27d5e ("Don't assume third iteration in loops").

As I evaluated that earlier commit on several open source projects, it turned out that on average it is runtime-neutral (or slightly helpful: it reduced the total analysis time by 1.5%), but it can cause runtime spikes on some code: in particular, it more than doubled the time to analyze tmux (one of the smaller test projects).

Further profiling and investigation proved that these spikes were caused by an increase of analysis scope because there was a heuristic that placed functions on a "don't inline this" blacklist if they reached the -analyzer-max-loop limit (anywhere, on any one execution path) -- which became significantly rarer when my commit ensured the analyzer no longer "just assumes" four iterations. (With more inlining, significantly more entry points use up their allocated budgets, which leads to the increased runtime.)

I feel that this heuristic for the "don't inline" blacklist is unjustified and arbitrary, because reaching the "retry without inlining" limit on one path does not imply that inlining the function won't be valuable on other paths -- so I hope that we can eventually replace it with more "natural" limits of the analysis scope.

However, the runtime increases are annoying for the users whose project is affected, so I created this quick workaround commit that approximates the "don't inline" blacklist effects of ambiguous loops (where the analyzer doesn't understand the loop condition) without fully reverting the "Don't assume third iteration" commit (to avoid reintroducing the false positives that were eliminated by it).
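
To make "ambiguous loop" concrete, here is a simplified sketch of the kind of code this change targets (modeled on the cases in the new legacy-inlining-prevention.c test file; getNum() just stands for any call whose result the analyzer cannot reason about):

int getNum(void); // Get an opaque number.

int inner_opaque_loop(void) {
  int i;
  // The loop bound is opaque, so the analyzer cannot evaluate the condition.
  // Before the "Don't assume third iteration" commit it kept assuming further
  // iterations until it hit the -analyzer-max-loop limit (4 by default) and
  // then put this function on the "don't inline" list; after that commit it
  // stops after two assumed iterations, the function stays inlinable, and more
  // entry points burn through their budget. This workaround restores the
  // "don't inline" mark in exactly this situation.
  for (i = 0; i < getNum(); i++)
    ;
  return i;
}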

Investigating this issue was a team effort: I'm grateful to Endre Fülöp (gamesh411) who did the bisection and shared his time measurement setup, and Gábor Tóthvári (tigbr) who helped me in profiling.

Recently some users reported that the runtime of the static analyzer
drastically increased on their codebase when they upgraded to a more
recent (slightly patched, internal) clang version: some of their
translation units got more than +600% runtime (yes, with two zeroes) and
IIUC the average increase was also high.

Bisection revealed that the bulk of the increase was caused by my
earlier commit bb27d5e ("Don't assume
third iteration in loops") and I verified that the slowdown affects the
downstream clang. (I used tmux as an arbitrary easy-to-analyze open
source project and measured that reverting this commit reduces the
total runtime by more than 50% compared to a recent main revision.)

Further profiling and investigation proved that this increase is
caused by an _increase of analysis scope_ because there was an arbitrary
heuristic that placed functions on a "don't inline this" blacklist if
they reached the `-analyzer-max-loop` limit (anywhere, on any one
execution path) -- which became significantly rarer when my commit
ensured the analyzer no longer "just assumes" four iterations. (With
more inlining significantly more entry points use up their allocated
budgets, which leads to the increased runtime.)

I feel that this heuristic for the "don't inline" blacklist is
unjustified, because reaching the "retry without inlining" limit on one
path does not imply that inlining the function won't be valuable on
other paths -- so I hope that we can eventually replace it with more
"natural" limits of the analysis scope.

However, the brutal jump of analysis runtime significantly degrades the
user experience, so I created this quick workaround commit that
approximates the "don't inline" blacklist effects of opaque loops (where
the analyzer doesn't understand the loop condition) without fully
reverting the "Don't assume third iteration" commit (to avoid
reintroducing the false positives that were eliminated by it).
@llvmbot added the clang (Clang issues not falling into any other category) and clang:static analyzer labels on Apr 22, 2025
@llvmbot (Member) commented Apr 22, 2025

@llvm/pr-subscribers-clang

Author: Donát Nagy (NagyDonat)

Changes

Recently some users reported that the runtime of the static analyzer drastically increased on their codebase when they upgraded to a more recent (slightly patched, internal) clang version: some of their translation units got more than +600% runtime (yes, with two zeroes) and IIUC the average increase was also high.

Bisection revealed that the bulk of the increase was caused by my earlier commit bb27d5e ("Don't assume third iteration in loops") and I verified that the slowdown affects the downstream clang. (I used tmux as an arbitrary easy-to-analyze open source project and measured that reverting this commit reduces the total runtime by more than 50% compared to a recent main revision.)

Further profiling and investigation proved that this increase is caused by an increase of analysis scope because there was an arbitrary heuristic that placed functions on a "don't inline this" blacklist if they reached the -analyzer-max-loop limit (anywhere, on any one execution path) -- which became significantly rarer when my commit ensured the analyzer no longer "just assumes" four iterations. (With more inlining significantly more entry points use up their allocated budgets, which leads to the increased runtime.)

I feel that this heuristic for the "don't inline" blacklist is unjustified, because reaching the "retry without inlining" limit on one path does not imply that inlining the function won't be valuable on other paths -- so I hope that we can eventually replace it with more "natural" limits of the analysis scope.

However, the brutal jump of analysis runtime significantly degrades the user experience, so I created this quick workaround commit that approximates the "don't inline" blacklist effects of opaque loops (where the analyzer doesn't understand the loop condition) without fully reverting the "Don't assume third iteration" commit (to avoid reintroducing the false positives that were eliminated by it).


Investigating this issue was a team effort: I'm grateful to Endre Fülöp (gamesh411) who did the bisection and shared his time measurement setup, and Gábor Tóthvári who helped me in profiling.

My preliminary measurements confirm that this commit does improve the runtime: I checked the analysis of window-copy.c from tmux where the analysis takes roughly:

  • 3 minutes on a recent main revision (the parent of this commit);
  • 50 seconds if the "Don't assume third iteration" commit is reverted;
  • 40 seconds if this commit is applied.
In the upcoming days I'll do more accurate time measurements and check the impact of this change on the quality and quantity of the results.

Full diff: https://github.com/llvm/llvm-project/pull/136720.diff

6 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def (+14)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h (-4)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+49-11)
  • (modified) clang/test/Analysis/analyzer-config.c (+1)
  • (added) clang/test/Analysis/legacy-inlining-prevention.c (+198)
  • (modified) clang/test/Analysis/loop-unrolling.cpp (+23-7)
diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
index f9f22a9ced650..0ac85b14147e0 100644
--- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
+++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -414,6 +414,20 @@ ANALYZER_OPTION(
     "any target.",
     true)
 
+ANALYZER_OPTION(
+    bool, LegacyInliningPrevention, "legacy-inlining-prevention",
+    "If enabled, the analyzer puts functions on a \"do not inline this\" list "
+    "if it finds an execution path within that function that may potentially "
+    "perform 'analyzer-max-loop' (= 4 by default) iterations in a loop. "
+    "Note that functions that _definitely_ reach the loop limit on some "
+    "execution path are currently marked as \"do not inline\" even if this "
+    "option is disabled (but this may change in future versions). This option "
+    "is a dumb and arbitrary restriction on inlining, but disabling it would "
+    "significantly increase the analysis workload (and the time taken) "
+    "compared to older clang versions because more top-level functions can "
+    "use up their 'max-nodes' limit if inlining is not restricted.",
+    true)
+
 //===----------------------------------------------------------------------===//
 // Unsigned analyzer options.
 //===----------------------------------------------------------------------===//
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h
index 3ee0d229cfc29..761395260a0cf 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h
@@ -81,10 +81,6 @@ class FunctionSummariesTy {
     I->second.MayInline = 0;
   }
 
-  void markReachedMaxBlockCount(const Decl *D) {
-    markShouldNotInline(D);
-  }
-
   std::optional<bool> mayInline(const Decl *D) {
     MapTy::const_iterator I = Map.find(D);
     if (I != Map.end() && I->second.InlineChecked)
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index 86e2e8f634bfd..1ff36eb83dbe7 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -2523,6 +2523,20 @@ bool ExprEngine::replayWithoutInlining(ExplodedNode *N,
   return true;
 }
 
+/// Return the innermost location context which is inlined at `Node`, unless
+/// it's the top-level (entry point) location context.
+static const LocationContext *getInlinedLocationContext(ExplodedNode *Node,
+                                                        ExplodedGraph &G) {
+  const LocationContext *CalleeLC = Node->getLocation().getLocationContext();
+  const LocationContext *RootLC =
+      (*G.roots_begin())->getLocation().getLocationContext();
+
+  if (CalleeLC->getStackFrame() == RootLC->getStackFrame())
+    return nullptr;
+
+  return CalleeLC;
+}
+
 /// Block entrance.  (Update counters).
 void ExprEngine::processCFGBlockEntrance(const BlockEdge &L,
                                          NodeBuilderWithSinks &nodeBuilder,
@@ -2570,21 +2584,24 @@ void ExprEngine::processCFGBlockEntrance(const BlockEdge &L,
     const ExplodedNode *Sink =
                    nodeBuilder.generateSink(Pred->getState(), Pred, &tag);
 
-    // Check if we stopped at the top level function or not.
-    // Root node should have the location context of the top most function.
-    const LocationContext *CalleeLC = Pred->getLocation().getLocationContext();
-    const LocationContext *CalleeSF = CalleeLC->getStackFrame();
-    const LocationContext *RootLC =
-                        (*G.roots_begin())->getLocation().getLocationContext();
-    if (RootLC->getStackFrame() != CalleeSF) {
-      Engine.FunctionSummaries->markReachedMaxBlockCount(CalleeSF->getDecl());
+    if (const LocationContext *LC = getInlinedLocationContext(Pred, G)) {
+      // FIXME: This will unconditionally prevent inlining this function (even
+      // from other entrypoints), which is not a reasonable heuristic: even if
+      // we reached max block count on this particular execution path, there
+      // may be other execution paths (especially with other parametrizations)
+      // where the analyzer can reach the end of the function (so there is no
+      // natural reason to avoid inlining it). However, disabling this would
+      // significantly increase the analysis time (because more entrypoints
+      // would exhaust their allocated budget), so it must be compensated by a
+      // different (more reasonable) reduction of analysis scope.
+      Engine.FunctionSummaries->markShouldNotInline(
+          LC->getStackFrame()->getDecl());
 
       // Re-run the call evaluation without inlining it, by storing the
       // no-inlining policy in the state and enqueuing the new work item on
       // the list. Replay should almost never fail. Use the stats to catch it
       // if it does.
-      if ((!AMgr.options.NoRetryExhausted &&
-           replayWithoutInlining(Pred, CalleeLC)))
+      if ((!AMgr.options.NoRetryExhausted && replayWithoutInlining(Pred, LC)))
         return;
       NumMaxBlockCountReachedInInlined++;
     } else
@@ -2856,8 +2873,29 @@ void ExprEngine::processBranch(
       // conflicts with the widen-loop analysis option (which is off by
       // default). If we intend to support and stabilize the loop widening,
       // we must ensure that it 'plays nicely' with this logic.
-      if (!SkipTrueBranch || AMgr.options.ShouldWidenLoops)
+      if (!SkipTrueBranch || AMgr.options.ShouldWidenLoops) {
         Builder.generateNode(StTrue, true, PredN);
+      } else if (AMgr.options.LegacyInliningPrevention) {
+        // FIXME: There is an ancient and very arbitrary heuristic in
+        // `ExprEngine::processCFGBlockEntrance` which prevents all further
+        // inlining of a function if it finds an execution path within that
+        // function which reaches the `MaxBlockVisitOnPath` limit (a/k/a
+        // `analyzer-max-loop`, by default four iterations in a loop). Adding
+        // this "don't assume third iteration" logic significantly increased
+        // the analysis runtime on some inputs because less functions were
+        // arbitrarily excluded from being inlined, so more entrypoints used
+        // up their full allocated budget. As a hacky compensation for this,
+        // here we apply the "should not inline" mark in cases when the loop
+        // could potentially reach the `MaxBlockVisitOnPath` limit without the
+        // "don't assume third iteration" logic. This slightly overcompensates
+        // (activates if the third iteration can be entered, and will not
+        // recognize cases where the fourth iteration wouldn't be completed), but
+        // should be good enough for practical purposes.
+        if (const LocationContext *LC = getInlinedLocationContext(Pred, G)) {
+          Engine.FunctionSummaries->markShouldNotInline(
+              LC->getStackFrame()->getDecl());
+        }
+      }
     }
 
     if (StFalse) {
diff --git a/clang/test/Analysis/analyzer-config.c b/clang/test/Analysis/analyzer-config.c
index 80cad54b039f4..42c622f2cd51b 100644
--- a/clang/test/Analysis/analyzer-config.c
+++ b/clang/test/Analysis/analyzer-config.c
@@ -92,6 +92,7 @@
 // CHECK-NEXT: inline-lambdas = true
 // CHECK-NEXT: ipa = dynamic-bifurcate
 // CHECK-NEXT: ipa-always-inline-size = 3
+// CHECK-NEXT: legacy-inlining-prevention = true
 // CHECK-NEXT: max-inlinable-size = 100
 // CHECK-NEXT: max-nodes = 225000
 // CHECK-NEXT: max-symbol-complexity = 35
diff --git a/clang/test/Analysis/legacy-inlining-prevention.c b/clang/test/Analysis/legacy-inlining-prevention.c
new file mode 100644
index 0000000000000..976e2ba6ddb5d
--- /dev/null
+++ b/clang/test/Analysis/legacy-inlining-prevention.c
@@ -0,0 +1,198 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -verify=expected,default %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config legacy-inlining-prevention=false -verify=expected,disabled %s
+
+int getNum(void); // Get an opaque number.
+
+void clang_analyzer_numTimesReached(void);
+void clang_analyzer_dump(int arg);
+
+//-----------------------------------------------------------------------------
+// Simple case: inlined function never reaches `analyzer-max-loop`.
+
+int inner_simple(void) {
+  clang_analyzer_numTimesReached(); // expected-warning {{2}}
+  return 42;
+}
+
+int outer_simple(void) {
+  int x = inner_simple();
+  int y = inner_simple();
+  return 53 / (x - y); // expected-warning {{Division by zero}}
+}
+
+//-----------------------------------------------------------------------------
+// Inlined function always reaches `analyzer-max-loop`.
+
+int inner_fixed_loop_1(void) {
+  int i;
+  clang_analyzer_numTimesReached(); // expected-warning {{1}}
+  for (i = 0; i < 10; i++);
+  clang_analyzer_numTimesReached(); // no-warning
+  return 42;
+}
+
+int outer_fixed_loop_1(void) {
+  int x = inner_fixed_loop_1();
+  int y = inner_fixed_loop_1();
+  return 53 / (x - y); // no-warning
+}
+
+//-----------------------------------------------------------------------------
+// Inlined function always reaches `analyzer-max-loop`; inlining is prevented
+// even for different entry points.
+// This test uses `clang_analyzer_dump` and distinct `arg` values because
+// `clang_analyzer_numTimesReached` only counts the paths reaching that node
+// during the analysis of one particular entry point, so it cannot distinguish
+// "two entry points reached this, both with one path" (where the two reports
+// are unified as duplicates by the generic report postprocessing) and "one
+// entry point reached this with one path" (where naturally nothing shows that
+// the second entry point _tried_ to reach it).
+
+int inner_fixed_loop_2(int arg) {
+  // Identical copy of inner_fixed_loop_1
+  int i;
+  clang_analyzer_dump(arg); // expected-warning {{2}}
+  for (i = 0; i < 10; i++);
+  clang_analyzer_dump(arg); // no-warning
+  return 42;
+}
+
+int outer_1_fixed_loop_2(void) {
+  return inner_fixed_loop_2(1);
+}
+
+int outer_2_fixed_loop_2(void) {
+  return inner_fixed_loop_2(2);
+}
+
+//-----------------------------------------------------------------------------
+// Inlined function reaches `analyzer-max-loop` only in its second call. The
+// function is inlined twice but the second call doesn't finish and ends up
+// being conservatively evaluated.
+
+int inner_parametrized_loop_1(int count) {
+  int i;
+  clang_analyzer_numTimesReached(); // expected-warning {{2}}
+  for (i = 0; i < count; i++);
+  clang_analyzer_numTimesReached(); // expected-warning {{1}}
+  return 42;
+}
+
+int outer_parametrized_loop_1(void) {
+  int x = inner_parametrized_loop_1(2);
+  int y = inner_parametrized_loop_1(10);
+  return 53 / (x - y); // no-warning
+}
+
+//-----------------------------------------------------------------------------
+// Inlined function reaches `analyzer-max-loop` on its first call, so the
+// second call isn't inlined (although it could be fully evaluated).
+
+int inner_parametrized_loop_2(int count) {
+  int i;
+  clang_analyzer_numTimesReached(); // expected-warning {{1}}
+  for (i = 0; i < count; i++);
+  clang_analyzer_numTimesReached(); // no-warning
+  return 42;
+}
+
+int outer_parametrized_loop_2(void) {
+  int y = inner_parametrized_loop_2(10);
+  int x = inner_parametrized_loop_2(2);
+  return 53 / (x - y); // no-warning
+}
+
+//-----------------------------------------------------------------------------
+// Inlined function may or may not reach `analyzer-max-loop` depending on an
+// opaque check before the loop. This is very similar to the "fixed loop"
+// cases: the function is placed on the "don't inline" list when any execution
+// path reaches `analyzer-max-loop` (even if other execution paths reach the
+// end of the function).
+
+int inner_conditional_loop(void) {
+  int i;
+  clang_analyzer_numTimesReached(); // expected-warning {{1}}
+  if (getNum() == 777) {
+    for (i = 0; i < 10; i++);
+  }
+  clang_analyzer_numTimesReached(); // expected-warning {{1}}
+  return 42;
+}
+
+int outer_1_conditional_loop(void) {
+  return inner_conditional_loop();
+}
+
+int outer_2_conditional_loop(void) {
+  return inner_conditional_loop();
+}
+
+//-----------------------------------------------------------------------------
+// Inlined function executes an opaque loop that may or may not reach
+// `analyzer-max-loop`. Historically, before the "don't assume third iteration"
+// commit (bb27d5e5c6b194a1440b8ac4e5ace68d0ee2a849) this worked like the
+// `conditional_loop` cases: the analyzer was able to find a path reaching
+// `analyzer-max-loop` so inlining was disabled. After that commit the analyzer
+// does not _assume_ a third (or later) iteration (i.e. does not enter those
+// iterations if the loop condition is an unknown value), so e.g. this test
+// function does not reach `analyzer-max-loop` iterations and the inlining is
+// not disabled.
+// Unfortunately this change significantly increased the workload and
+// runtime of the analyzer (more entry points used up their budget), so the
+// option `legacy-inlining-prevention` was introduced and enabled by default to
+// suppress the inlining in situations where the "don't assume third iteration"
+// logic activates.
+// This testcase demonstrates that the inlining is prevented with the default
+// `legacy-inlining-prevention=true` config, but is not prevented when this
+// option is disabled (set to false).
+
+int inner_opaque_loop_1(void) {
+  int i;
+  clang_analyzer_numTimesReached(); // default-warning {{1}} disabled-warning {{2}}
+  for (i = 0; i < getNum(); i++);
+  return i;
+}
+
+int outer_opaque_loop_1(void) {
+  int iterCount = inner_opaque_loop_1();
+
+  // The first call to `inner_opaque_loop_1()` splits into three execution paths
+  // that differ in the number of performed iterations (0, 1 or 2). The function
+  // `inner_opaque_loop_1` is added to the "do not inline this" list when the
+  // path that performed two iterations tries to enter the third iteration (and
+  // the "don't assume third iteration" logic prevents this) -- but the other
+  // two paths (which performed 0 and 1 iterations) would reach and inline the
+  // second `inner_opaque_loop_1()` before this would happen (because the
+  // default traversal is a complex heuristic that happens to prefer this). The
+  // following `if` will discard these "early exit" paths to highlight the
+  // difference between the default and disabled state:
+  if (iterCount < 2)
+    return 0;
+
+  return inner_opaque_loop_1();
+}
+
+//-----------------------------------------------------------------------------
+// Another less contrived testcase that demonstrates the difference between the
+// enabled (default) and disabled state of `legacy-inlining-prevention`.
+// Here the two calls to `inner_opaque_loop_2()` are in different entry points
+// so the first call is fully analyzed (and can put the function on the "do
+// not inline" list) before reaching the second call.
+// This test uses `clang_analyzer_dump` because (as explained in an earlier
+// comment block) `clang_analyzer_numTimesReached` is not suitable for counting
+// visits from separate entry points.
+
+int inner_opaque_loop_2(int arg) {
+  int i;
+  clang_analyzer_dump(arg); // default-warning {{2}}
+                            // disabled-warning@-1 {{1}} disabled-warning@-1 {{2}}
+  for (i = 0; i < getNum(); i++);
+  return i;
+}
+
+int outer_1_opaque_loop_2(void) {
+  return inner_opaque_loop_2(1);
+}
+int outer_2_opaque_loop(void) {
+  return inner_opaque_loop_2(2);
+}
diff --git a/clang/test/Analysis/loop-unrolling.cpp b/clang/test/Analysis/loop-unrolling.cpp
index bf05a7739ce48..ebae81e000c7a 100644
--- a/clang/test/Analysis/loop-unrolling.cpp
+++ b/clang/test/Analysis/loop-unrolling.cpp
@@ -1,5 +1,5 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true -verify -std=c++14 -analyzer-config exploration_strategy=unexplored_first_queue %s
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true,exploration_strategy=dfs -verify -std=c++14 -DDFS=1 %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true -verify=expected,default -std=c++14 -analyzer-config exploration_strategy=unexplored_first_queue %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true,exploration_strategy=dfs -verify=expected,dfs -std=c++14 %s
 
 void clang_analyzer_numTimesReached();
 void clang_analyzer_warnIfReached();
@@ -337,6 +337,7 @@ int nested_both_unrolled() {
 }
 
 int simple_known_bound_loop() {
+  // Iteration count visible: can be unrolled and fully executed.
   for (int i = 2; i < 12; i++) {
     // This function is inlined in nested_inlined_unroll1()
     clang_analyzer_numTimesReached(); // expected-warning {{90}}
@@ -345,27 +346,42 @@ int simple_known_bound_loop() {
 }
 
 int simple_unknown_bound_loop() {
+  // Iteration count unknown: unrolling won't happen and the execution will be
+  // split two times:
+  // (1) split between skipped loop (immediate exit) and entering the loop
+  // (2) split between exit after 1 iteration and entering the second iteration
+  // After these there is no third state split because the "don't assume third
+  // iteration" logic in `ExprEngine::processBranch` prevents it; but the
+  // `legacy-inlining-prevention` logic will put this function onto the list of
+  // functions that may not be inlined in the future.
+  // The exploration strategy apparently influences the number of times this
+  // function can be inlined before it's placed on the "don't inline" list.
   for (int i = 2; i < getNum(); i++) {
-    clang_analyzer_numTimesReached(); // expected-warning {{8}}
+    clang_analyzer_numTimesReached(); // default-warning {{4}} dfs-warning {{8}}
   }
   return 0;
 }
 
 int nested_inlined_unroll1() {
+  // Here the analyzer can unroll and fully execute both the outer loop and the
+  // inner loop within simple_known_bound_loop().
   int k;
   for (int i = 0; i < 9; i++) {
     clang_analyzer_numTimesReached(); // expected-warning {{9}}
-    k = simple_known_bound_loop();    // no reevaluation without inlining
+    k = simple_known_bound_loop();
   }
   int a = 22 / k; // expected-warning {{Division by zero}}
   return 0;
 }
 
 int nested_inlined_no_unroll1() {
+  // Here no unrolling happens and we only run `analyzer-max-loop` (= 4)
+  // iterations of the loop within this function, but some state splits happen
+  // in `simple_unknown_bound_loop()` calls.
   int k;
-  for (int i = 0; i < 9; i++) {
-    clang_analyzer_numTimesReached(); // expected-warning {{10}}
-    k = simple_unknown_bound_loop();  // reevaluation without inlining, splits the state as well
+  for (int i = 0; i < 40; i++) {
+    clang_analyzer_numTimesReached(); // default-warning {{9}} dfs-warning {{12}}
+    k = simple_unknown_bound_loop(); 
   }
   int a = 22 / k; // no-warning
   return 0;

@llvmbot (Member) commented Apr 22, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

(Same author, summary of changes, and full diff as in the previous @llvmbot comment.)

-// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true -verify -std=c++14 -analyzer-config exploration_strategy=unexplored_first_queue %s
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true,exploration_strategy=dfs -verify -std=c++14 -DDFS=1 %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true -verify=expected,default -std=c++14 -analyzer-config exploration_strategy=unexplored_first_queue %s
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -analyzer-config unroll-loops=true,cfg-loopexit=true,exploration_strategy=dfs -verify=expected,dfs -std=c++14 %s

@NagyDonat (Contributor, Author) commented:
I'm leaning towards eliminating the alternative RUN: line that uses dfs because the only difference between dfs and the default exploration algorithm is a "control group" testcase where unroll-loops does not activate.

I'll probably do so in a separate follow-up commit.

Contributor:
As you wish.

@balazs-benics-sonarsource (Contributor) left a comment:

Thank you for investigating this. At Sonar, we have not yet started the upgrade to clang-20. I suppose you already did, and thus found this regression on trunk.

Maybe we should also reflect on the quality control of our submissions of core changes, but let's leave that until after the PR is discussed and we have put out the fire.

My impression is that we shouldn't have a new flag for this; we should unconditionally apply this for now. Did you think about this?

I left a few other comments about testing but overall I'm all for this change.
Thanks again!

Comment on lines 2878 to 2898
      } else if (AMgr.options.LegacyInliningPrevention) {
        // FIXME: There is an ancient and very arbitrary heuristic in
        // `ExprEngine::processCFGBlockEntrance` which prevents all further
        // inlining of a function if it finds an execution path within that
        // function which reaches the `MaxBlockVisitOnPath` limit (a/k/a
        // `analyzer-max-loop`, by default four iterations in a loop). Adding
        // this "don't assume third iteration" logic significantly increased
        // the analysis runtime on some inputs because fewer functions were
        // arbitrarily excluded from being inlined, so more entrypoints used
        // up their full allocated budget. As a hacky compensation for this,
        // here we apply the "should not inline" mark in cases when the loop
        // could potentially reach the `MaxBlockVisitOnPath` limit without the
        // "don't assume third iteration" logic. This slightly overcompensates
        // (activates if the third iteration can be entered, and will not
        // recognize cases where the fourth iteration wouldn't be completed), but
        // should be good enough for practical purposes.
        if (const LocationContext *LC = getInlinedLocationContext(Pred, G)) {
          Engine.FunctionSummaries->markShouldNotInline(
              LC->getStackFrame()->getDecl());
        }
      }
Contributor:
How about applying this chunk unconditionally? I don't think anyone would want to regress 6x voluntarily.

@NagyDonat (Contributor, Author):
I put this behind a flag (which restores the original behavior by default) because it is not just "wasted" runtime, but offers a tradeoff between runtime and the depth/accuracy of the analysis. There are several other options like max-nodes or the high-level mode setting that offer analogous tradeoffs. However, I will refine this judgement when I check the results produced with and without this patch -- and if the difference is negligible, I'm more open to applying this chunk unconditionally.

In the long term we will probably need a heuristic to "avoid inlining large functions", but I strongly suspect that this loop-based hacky condition is not the "right" (natural, straightforward, robust etc.) way of formalizing this idea. I feel that the current bounds of the analysis scope are almost "accidental" (or at least I don't know about any measurements that justify e.g. the default max-nodes budget value or the various "don't inline if ..." conditions) and I hope that at some point I can spend some time on understanding and improving this area.

Right now the outline of my plan roughly looks like:

  • (1) Publish this workaround ASAP to restore the "old normal" runtimes and minimize the disruption.
  • (2) Investigate the (runtime) costs and benefits of various inlining or exploration strategies, and publish alternative heuristics that express "large functions" in different, more natural terms (e.g. block count, total # of nodes created etc. instead of "did it reach four iterations in a loop").
  • (3) Adjust the default config: disable this legacy-inlining-prevention by default and instead enable other heuristics which on average guarantee similar total runtime, but perhaps provide better coverage / more useful results / better robustness, ...
    • Even then the users would be able to return to this legacy config if their project is an outlier where this particular heuristic was especially aggressive and they liked it that way.
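
For illustration only (the driver spelling below is just the usual way of forwarding analyzer configs, not something introduced by this patch, and the max-nodes value is an arbitrary example; the current default is 225000):

  clang --analyze -Xclang -analyzer-config -Xclang legacy-inlining-prevention=false file.c
  clang --analyze -Xclang -analyzer-config -Xclang max-nodes=150000 file.c

The first command opts out of this workaround heuristic; the second tunes one of the analogous budget knobs mentioned above.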

Contributor:
My point is, if Ericsson will ship this as true, and we are also going to ship this as true, then who would want to disable it? Do you expect users to disable this? Would anyone complain if we just landed this fix without an option for opting out?
Speaking of arbitrary cut heuristics: maybe they are arbitrary, but that should only make us more careful about changing them, because continuity itself is valuable - but I'm not opposed, just cautious.

Collaborator:
Also I am not sure we should have legacy in the name of something that is enabled by default.

@NagyDonat (Contributor, Author) commented Apr 24, 2025:

My point is, if Ericsson will ship this as true, and we are also going to ship this as true, then who would want to disable it? Do you expect users to disable this?

Good point.

Would anyone complain if we just landed this fix without an option for opting out?

I felt that the logic introduced by this commit is so ugly that it should not be an unconditional part of the analyzer behavior (especially if it gets git blamed to me...). By introducing this option I wanted to express that "this is not a stable, permanent part of the analyzer, please don't rely on its existence" -- and I felt that yet another "FIXME" comment is not enough for this hack.

However, I can accept removing this option (and making the hack unconditional) if you strongly prefer that.

Also I am not sure we should have legacy in the name of something that is enabled by default.

When I wrote this change, I hoped that this config option could be switched off and deprecated in the foreseeable future (in fact it was disabled by default in the first draft of the commit).

Also, this heuristic is so crazy that I couldn't give a descriptive option name (do-not-inline-function-if-it-would-assume-third-iteration-in-a-loop is too much even compared to the other option names), so I named it "legacy" to say that it's connected to some old awkward heuristic.

int i;
clang_analyzer_numTimesReached(); // expected-warning {{1}}
for (i = 0; i < 10; i++);
clang_analyzer_numTimesReached(); // no-warning
Contributor

Suggested change
clang_analyzer_numTimesReached(); // no-warning
clang_analyzer_numTimesReached(); // FIXME: It should be reachable

Contributor Author

@NagyDonat Apr 23, 2025

I would be careful with this FIXME -- if somebody follows it and fixes this issue (e.g. by enabling unroll-loops by default), we could get yet another large slowdown when fewer loops hit the analyzer-max-loop limit, fewer functions end up on the "don't inline" list, and more entry points spend their budget.

Obviously in an ideal world an empty loop like this shouldn't stop the analyzer – but we must replace this awkward loop-based inlining restriction with a more robust heuristic before we can do that. (I don't want to spread "older dumber analyzer would give up here, don't inline this function" hacks in the codebase as we improve the loop handling.)

Contributor

I didn't get the argument for why having a FIXME here would be a risk. It's the reviewer's and maintainer's responsibility to ensure that the changes align with the long-term directions of the project.
If someone just overrode analyzer-max-loop for this test "to fix the FIXME", that should be rejected.
However, if we found a way to enable it by default, or had a smarter engine, then it's fine - but that would also touch files beyond this single test file.

Contributor Author


IMO the situation is complex because:

  • from the POV of the users this "analysis is stopped by long loops" behavior is obviously a bug that needs to be fixed;
  • but from the POV of the inlining prevention heuristic (which is tested here) this is a known feature (or at least "property") of the analyzer engine which is important for the correct delineation of the scope.

You're right that these deserve a FIXME, but I'll probably write a longer FIXME block that explains the full situation instead of short inline FIXME notes.

Contributor

Sounds good.

Comment on lines 26 to 28
int inner_fixed_loop_1(void) {
int i;
clang_analyzer_numTimesReached(); // expected-warning {{1}}
Contributor

You could pass a const char* from a string literal to differentiate which call is inlined, observing its value with a clang_analyzer_dump() at the beginning of the inlined function. That should make it clear that the first time it's inlined, but the second time it's not.

Contributor Author

Ok, I can switch all testcases to using clang_analyzer_dump(arg) instead of numTimesReached to highlight which calls were inlined (and not just their count). There are already a few testcases where I use that solution (because I need it for multiple entry points), but you're right that it would be helpful in the other cases as well.
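
For concreteness, a minimal sketch of the test shape being discussed; the RUN line, the inner/outer names, and the exact dump text ("1 S32b") are assumptions based on the usual debug.ExprInspection conventions rather than the final test from this PR:

// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -verify %s

void clang_analyzer_dump(int);

int inner(int arg) {
  // Fires only during the first, inlined call; the dumped value identifies
  // which call site was inlined.
  clang_analyzer_dump(arg); // expected-warning {{1 S32b}}
  // This loop reaches the -analyzer-max-loop limit, so the function is
  // placed on the "don't inline" list for later call sites.
  for (int i = 0; i < 10; i++)
    ;
  return arg;
}

void outer(void) {
  inner(1); // inlined at first, then retried without inlining after the
            // loop limit is hit; the dump above fires with the value 1
  inner(2); // not inlined at all (the function is already marked), so the
            // dump does not fire a second time
}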

Comment on lines 35 to 36
int x = inner_fixed_loop_1();
int y = inner_fixed_loop_1();
Contributor

We should have comments here saying that the first call is inlined and the second is not, because of the given heuristic.

int outer_fixed_loop_1(void) {
int x = inner_fixed_loop_1();
int y = inner_fixed_loop_1();
return 53 / (x - y); // no-warning
Contributor

My problem with these no-warnings in general in this PR is that they document what the test currently does, while they should document what the tests should/could expect.
In this case, in an ideal world we should actually get a diagnostic, so the desired outcome is not a no-warning.
Consequently, I think a FIXME would be more appropriate.

int i;
clang_analyzer_dump(arg); // expected-warning {{2}}
for (i = 0; i < 10; i++);
clang_analyzer_dump(arg); // no-warning
Contributor

FIXME.

@balazs-benics-sonarsource
Contributor

One other note: we should backport this fix to clang-20 once it lands on main.

Note that we need to expect e.g. {{2 S32}} because the pattern {{2}}
would match any other 32-bit integer as well.

The last two testcases in the file were testing the same thing in
different ways, so this commit only keeps the one that used
`clang_analyzer_dump` (which was previously an exception).
@NagyDonat
Contributor Author

NagyDonat commented Apr 24, 2025

I tested the impact of this commit on various open source projects and the changes in the result counts are surprisingly small:

Project New Reports Resolved Reports
memcached 0 new reports 0 resolved reports
tmux 2 new reports 6 resolved reports
curl 0 new reports 0 resolved reports
twin 3 new reports 2 resolved reports
vim 11 new reports 18 resolved reports
openssl 0 new reports 1 resolved reports
sqlite 28 new reports 8 resolved reports
ffmpeg 22 new reports 23 resolved reports
postgres 3 new reports 20 resolved reports
tinyxml2 0 new reports 0 resolved reports
libwebm 1 new reports 0 resolved reports
xerces 3 new reports 5 resolved reports
bitcoin 0 new reports 0 resolved reports
protobuf 2 new reports 3 resolved reports
qtbase 27 new reports 23 resolved reports
openrct2 0 new reports 7 resolved reports

Unfortunately there was some trouble with our publicly available "demo" CodeChecker server (where we usually present the analysis results), so I had to store these results on an internal server and they are not available externally. (If somebody is interested, I can give priority to finding a way to publish the full report set.)

@balazs-benics-sonarsource
Contributor

Do you have data about the analysis times per file and per analysis entry point?
How do they compare against the current llvm main, and would this workaround restore the original running times from before bb27d5e?

"Note that functions that _definitely_ reach the loop limit on some "
"execution path are currently marked as \"do not inline\" even if this "
"option is disabled (but this may change in future versions). This option "
"is a dumb and arbitrary restriction on inlining, but disabling it would "
Collaborator

Nit: I'd remove "dumb" from the user facing string.

Contributor Author

@NagyDonat Apr 24, 2025

Good point, I was annoyed by the situation when I wrote this description, but I was already planning to rewrite these before finalizing the commit.

EDIT: done.

Engine.FunctionSummaries->markReachedMaxBlockCount(CalleeSF->getDecl());
if (const LocationContext *LC = getInlinedLocationContext(Pred, G)) {
// FIXME: This will unconditionally prevent inlining this function (even
// from other entrypoints), which is not a reasonable heuristic: even if
Collaborator

Nit: I think entry points are two distinct words.

Contributor Author

@NagyDonat Apr 24, 2025

Thanks for spotting this! For some reason I tend to automatically write "entrypoint"; before uploading the PR I fixed it in many locations, but apparently I failed to catch this one...

EDIT: fixed this and two other occurrences.

Comment on lines 2878 to 2898
} else if (AMgr.options.LegacyInliningPrevention) {
// FIXME: There is an ancient and very arbitrary heuristic in
// `ExprEngine::processCFGBlockEntrance` which prevents all further
// inlining of a function if it finds an execution path within that
// function which reaches the `MaxBlockVisitOnPath` limit (a/k/a
// `analyzer-max-loop`, by default four iterations in a loop). Adding
// this "don't assume third iteration" logic significantly increased
// the analysis runtime on some inputs because fewer functions were
// arbitrarily excluded from being inlined, so more entry points used
// up their full allocated budget. As a hacky compensation for this,
// here we apply the "should not inline" mark in cases when the loop
// could potentially reach the `MaxBlockVisitOnPath` limit without the
// "don't assume third iteration" logic. This slightly overcompensates
// (activates if the third iteration can be entered, and will not
// recognize cases where the fourth iteration wouldn't be completed), but
// should be good enough for practical purposes.
if (const LocationContext *LC = getInlinedLocationContext(Pred, G)) {
Engine.FunctionSummaries->markShouldNotInline(
LC->getStackFrame()->getDecl());
}
}
Collaborator

Also I am not sure we should have legacy in the name of something that is enabled by default.

@NagyDonat
Contributor Author

NagyDonat commented Apr 24, 2025

@balazs-benics-sonarsource Unfortunately our current CI (which runs the static analyzer via CodeChecker) cannot provide file-level or entrypoint-level timing information, but I do have runtime measurements of the full analysis (on our Jenkins node with 16 parallel threads):

Project legacy-inlining-prevention=false legacy-inlining-prevention=true speedup
memcached 22.92 22.97 0%
tmux 170.30 80.04 53%
curl 133.18 123.04 8%
twin 48.47 45.71 6%
vim 168.95 155.50 8%
openssl 272.56 254.20 7%
sqlite 1046.26 842.43 19%
ffmpeg 1210.35 1178.02 3%
postgres 391.50 323.61 17%
tinyxml2 44.60 29.61 34%
libwebm 127.05 65.92 48%
xerces 230.21 172.86 25%
bitcoin 1303.82 935.25 28%
protobuf 2180.02 1950.36 11%
qtbase 3014.14 2655.59 12%
openrct2 835.05 763.86 9%

Both measurements were performed on the commit under review, with different configs: legacy-inlining-prevention=false (which is practically equivalent to the recent main revision a84a6f7) and legacy-inlining-prevention=true (the default in this patch). The second and third columns contain the time in seconds, while the "speedup" column measures the effect of enabling this commit (e.g. for tmux, 1 - 80.04/170.30 ≈ 53%).

If needed, I can measure timing differences compared to other commits as well. I think the most natural comparison would be between

  • the commit under review with its default legacy-inlining-prevention=true mode

and

  • reverting bb27d5e ("Don't assume third iteration") atop a84a6f7 (the parent of the commit under review)

because that way the comparison wouldn't be influenced by other changes that were merged in the last few months. EDIT: I started a CI job to compare these two revisions; I'll post the results tomorrow.

@NagyDonat
Contributor Author

NagyDonat commented Apr 24, 2025

I pushed two commits (including d57946b which is hidden among the earlier review comments based on its creation time), and these in theory address most review comments, with the exception of two areas:

  • Writing a detailed FIXME note about the "analysis is interrupted if it reaches four iterations" question.
  • Changing this hack from "enabled by default config" to "unconditionally enabled" -- or just renaming the config.

I'll try to address these tomorrow.

@NagyDonat
Contributor Author

NagyDonat commented Apr 25, 2025

Execution time comparisons between the commit under review and just reverting bb27d5e ("Don't assume third iteration") atop a84a6f7 (the parent of the commit under review):

Project "just revert" "commit under review" Ratio
memcached 27.50 25.68 93.4%
tmux 79.49 73.16 92.0%
curl 133.26 117.82 88.4%
twin 54.28 48.40 89.2%
vim 221.86 165.80 74.7%
openssl 304.98 256.52 84.1%
sqlite 1039.42 870.52 83.8%
ffmpeg 1272.77 1190.92 93.6%
postgres 402.34 321.95 80.0%
tinyxml2 49.82 28.65 57.5%
libwebm 83.14 56.02 67.4%
xerces 202.53 175.92 86.9%
bitcoin 162.05 155.26 95.8%
protobuf 2367.55 1969.73 83.2%
qtbase 3022.01 2729.55 90.3%
openrct2 889.14 782.47 88.0%

🤔 The version with the commit under review is surprisingly fast and I don't exactly know why. My most plausible theory is that bb27d5e ("Don't assume third iteration") has two effects on the analysis runtime:

  • it inherently speeds up the analysis of loops (because some iterations are skipped);
  • it slows down the analysis because it affects the inlining heuristic and prevents functions from being placed on the inlining blacklist.

This would explain why just reverting that commit (which undoes both effects) produces slower analysis than applying the commit under review (which undoes the slowdown and keeps the speedup).

@balazs-benics-sonarsource
Contributor

🤔 The version with the commit under review is surprisingly fast and I don't exactly know why. My most plausible theory is that bb27d5e ("Don't assume third iteration") has two effects on the analysis runtime:

  • it inherently speeds up the analysis of loops (because some iterations are skipped);
  • it slows down the analysis because it affects the inlining heuristic and prevents functions from being placed on the inlining blacklist.

This would explain why just reverting that commit (which undoes both effects) produces slower analysis than applying the commit under review (which undoes the slowdown and keeps the speedup).

This makes sense, and it's what I was expecting even back in the day. But I was shocked that sometimes intuition fails, and we didn't check the RT for that patch. Now that you did the work, it leaves me in a much more relaxed situation. Thanks!

@NagyDonat
Contributor Author

But I was shocked that sometimes intuition fails, and we didn't check the RT for that patch.

This is a legacy project with many skeletons hidden in various closets... The main moral of this story is probably that we should be more paranoid about measuring runtime (I'm thinking about whether we can set up some regular or semi-regular runtime measurements in our CI), but this also shows why we should work on reducing the technical debt.

@NagyDonat
Contributor Author

NagyDonat commented Apr 25, 2025

With the two recent commits I think I addressed all review comments except for the "on-by-default config option" vs "unconditionally active logic" question.

In this area (which was discussed in inline comments) I'm still leaning towards the currently implemented legacy-inlining-prevention config option (which is enabled by default), because this would provide very minor advantages in several areas:

  • The average slowdown caused by disabling this workaround seems to be around 10-15% on larger projects, which is painful, but not completely prohibitive (and I guess that it could be offset if the user e.g. reduces max-nodes by 20%).
  • If somebody happens to run into false negatives (or invalidation-based false positives) which are caused by this inlining prevention, we can suggest legacy-inlining-prevention=false as a workaround.
  • I hope that I will be able to develop other heuristics that prevent the inlining of complex functions based on more "natural" and robust conditions -- when those are available, some users may want to disable this and enable those instead. (Or, if they are proven to be better and we change the defaults, this option should remain as a disabled-by-default fallback for those who might want to return to the earlier behavior.)
  • During the development and testing of those alternative heuristics it would be nice to have this config option to enable/disable this workaround.
  • Last, but perhaps not least, having a named config option will make it easier to speak about this code fragment (instead of "that hack where we disable inlining of a function if we would assume entering the third iteration in a loop").

If we keep the config option, we can also think about renaming it. I agree that legacy is a bit unfortunate in the name of an option that is on by default (at least for now), but I don't have a better idea. (Feel free to suggest names if you have any!)

@balazs-benics-sonarsource @Xazax-hun WDYT about this area? Do you have any other question/remark?

@balazs-benics-sonarsource
Contributor

I can see your points. I think they indeed move the needle slightly towards having this (or a similar) flag, but barely.
I'd need to think about a better flag name, but right now I'm too busy for that.

@NagyDonat
Contributor Author

I did yet another measurement where I measured the runtime of bb27d5e ("Don't assume third iteration") compared to its parent commit (i.e. the measurement that we should've done before merging it) and it turns out that on average it does not increase the runtime and tmux (where Endre observed a significant slowdown) is an extreme outlier:

Project without bb27d5e with bb27d5e Ratio
memcached 23.25 22.26 95.7%
tmux 67.47 146.70 217.4%
curl 116.58 111.74 95.8%
twin 42.81 40.33 94.2%
vim 164.54 154.02 93.6%
openssl 274.13 258.72 94.4%
sqlite 902.19 943.20 104.5%
ffmpeg 1161.76 1130.22 97.3%
postgres 367.47 335.01 91.2%
tinyxml2 36.11 45.79 126.8%
libwebm 75.85 64.33 84.8%
xerces 191.27 212.58 111.1%
bitcoin 159.05 159.09 100.0%
protobuf 2249.66 2108.13 93.7%
qtbase 2796.47 2812.94 100.6%
openrct2 946.48 880.49 93.0%
sum of all projects 9575.09 9425.55 98.4%

By the way, Endre Fülöp picked tmux arbitrarily as a "random small open source project that is easy to build" when he tried to reproduce the slowdown reported by the user, and as it immediately reproduced a significant slowdown, we assumed that this is a universal phenomenon -- but it turns out that this was just a coincidence.

These results show that most users won't experience a brutal slowdown (especially if they have a large varied codebase where the law of large numbers guarantees average results), which reduces the urgency of this PR a bit. (And I will rewrite the commit message, clarifying that the runtime increases are mostly concentrated on individual TUs and aren't too large on average.)

@NagyDonat NagyDonat changed the title [analyzer] Workaround for unintended slowdown (scope increase) [analyzer] Workaround for slowdown spikes (unintended scope increase) Apr 28, 2025
@NagyDonat
Contributor Author

@balazs-benics-sonarsource Gentle ping

Contributor

@balazs-benics-sonarsource left a comment

Looks good. There were two points unaddressed:

  • Finding a name for the flag without the legacy- prefix
  • Find out if we can ever have multiple root nodes in an exploded graph.

@NagyDonat
Contributor Author

NagyDonat commented May 6, 2025

I discussed the name with Endre (@gamesh411) in person and our best idea is inline-functions-with-ambiguous-loops which would be the opposite of the currently implemented legacy-inlining-prevention (i.e. the default would be inline-functions-with-ambiguous-loops=false which is equivalent to legacy-inlining-prevention=true).

(Alternatively, inline-functions-containing-ambiguous-loops or inline-functions-containing-loop-with-ambiguous-condition would be more accurate but even longer variants of this name.)
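
For illustration, assuming the option lands under the proposed name and polarity, it would be toggled through the usual -analyzer-config mechanism; the RUN line below is a hypothetical sketch, not a line taken from this PR:

// Hypothetical lit RUN line opting back into inlining such functions:
// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection \
// RUN:   -analyzer-config inline-functions-with-ambiguous-loops=true \
// RUN:   -verify %s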

@balazs-benics-sonarsource @ anybody else
What do you think about this name? Would this be acceptable for you?

@balazs-benics-sonarsource
Contributor

balazs-benics-sonarsource commented May 6, 2025

@Xazax-hun WDYT of the proposed alternative flag name? (as you challenged the currently proposed name)

@Xazax-hun
Collaborator

inline-functions-with-ambiguous-loops sounds good to me. I don't think it makes sense to have legacy in the name of an on-by-default setting.

@balazs-benics-sonarsource
Contributor

LGTM, given we use the new flag name.

Contributor

@balazs-benics-sonarsource left a comment

Thank you!

@NagyDonat NagyDonat merged commit 9600a12 into llvm:main May 12, 2025
11 checks passed
@balazs-benics-sonarsource
Contributor

Thanks for the fix, I'll proceed with the backport if you still believe it's worth it. @NagyDonat

@NagyDonat
Contributor Author

This is not an automatic "every project is affected" slowdown, but on smaller outlier projects like tmux the runtime may double, so I'd guess that it's reasonable to backport this commit (although I'm not familiar with the exact conditions under which a commit should be backported). Thanks for doing the backport (if you decide to do it).

steakhal pushed a commit to steakhal/llvm-project that referenced this pull request May 12, 2025
@steakhal
Contributor

Backport requested in #139597

swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request May 13, 2025
Labels
clang:static analyzer, clang (Clang issues not falling into any other category)