Skip to content

Reland "[analyzer] Harden safeguards for Z3 query times" #97298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 1, 2024
Merged

Reland "[analyzer] Harden safeguards for Z3 query times" #97298

merged 1 commit into from
Jul 1, 2024

Conversation

steakhal
Copy link
Contributor

@steakhal steakhal commented Jul 1, 2024

This is exactly as originally landed in #95129,
but now the minimal Z3 version was increased to meet this change in #96682.

https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4


This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be bound by 300ms. Every report equivalence class will be processed in at most 1 second.

The heuristic should have only really marginal observable impact - except for the cases when we had big report eqclasses with long-running (15s) Z3 queries, where previously CSA effectively halted. After this patch, CSA will tackle such extreme cases as well.

(cherry picked from commit eacc3b3)

This is exactly as originally landed in #95129,
but now the minimal Z3 version was increased to meet this change in #96682.

https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4

---

This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be
bound by 300ms. Every report equivalence class will be processed in
at most 1 second.

The heuristic should have only really marginal observable impact -
except for the cases when we had big report eqclasses with long-running
(15s) Z3 queries, where previously CSA effectively halted.
After this patch, CSA will tackle such extreme cases as well.

(cherry picked from commit eacc3b3)
@steakhal steakhal added clang Clang issues not falling into any other category clang:static analyzer labels Jul 1, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 1, 2024

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Balazs Benics (steakhal)

Changes

This is exactly as originally landed in #95129,
but now the minimal Z3 version was increased to meet this change in #96682.

https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4


This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be bound by 300ms. Every report equivalence class will be processed in at most 1 second.

The heuristic should have only really marginal observable impact - except for the cases when we had big report eqclasses with long-running (15s) Z3 queries, where previously CSA effectively halted. After this patch, CSA will tackle such extreme cases as well.

(cherry picked from commit eacc3b3)


Full diff: https://github.com/llvm/llvm-project/pull/97298.diff

6 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def (+20)
  • (modified) clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h (+34-8)
  • (modified) clang/lib/StaticAnalyzer/Core/BugReporter.cpp (+10-2)
  • (modified) clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp (+54-12)
  • (modified) clang/test/Analysis/analyzer-config.c (+3)
  • (modified) clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp (+100-16)
diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
index f008c9c581d95..29aa6a3b8a16e 100644
--- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
+++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -184,6 +184,26 @@ ANALYZER_OPTION(bool, ShouldCrosscheckWithZ3, "crosscheck-with-z3",
                 "constraint manager backend.",
                 false)
 
+ANALYZER_OPTION(
+    unsigned, Z3CrosscheckEQClassTimeoutThreshold,
+    "crosscheck-with-z3-eqclass-timeout-threshold",
+    "Set a timeout for bug report equivalence classes in milliseconds. "
+    "If we exhaust this threshold, we will drop the bug report eqclass "
+    "instead of doing more Z3 queries. Set 0 for no timeout.", 700)
+
+ANALYZER_OPTION(
+    unsigned, Z3CrosscheckTimeoutThreshold,
+    "crosscheck-with-z3-timeout-threshold",
+    "Set a timeout for individual Z3 queries in milliseconds. "
+    "Set 0 for no timeout.", 300)
+
+ANALYZER_OPTION(
+    unsigned, Z3CrosscheckRLimitThreshold,
+    "crosscheck-with-z3-rlimit-threshold",
+    "Set the Z3 resource limit threshold. This sets a deterministic cutoff "
+    "point for Z3 queries, as longer queries usually consume more resources. "
+    "Set 0 for unlimited.", 400'000)
+
 ANALYZER_OPTION(bool, ShouldReportIssuesInMainSourceFile,
                 "report-in-main-source-file",
                 "Whether or not the diagnostic report should be always "
diff --git a/clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h b/clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h
index 9413fd739f607..439f37fa8604f 100644
--- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h
+++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h
@@ -25,8 +25,11 @@ class Z3CrosscheckVisitor final : public BugReporterVisitor {
 public:
   struct Z3Result {
     std::optional<bool> IsSAT = std::nullopt;
+    unsigned Z3QueryTimeMilliseconds = 0;
+    unsigned UsedRLimit = 0;
   };
-  explicit Z3CrosscheckVisitor(Z3CrosscheckVisitor::Z3Result &Result);
+  Z3CrosscheckVisitor(Z3CrosscheckVisitor::Z3Result &Result,
+                      const AnalyzerOptions &Opts);
 
   void Profile(llvm::FoldingSetNodeID &ID) const override;
 
@@ -44,21 +47,44 @@ class Z3CrosscheckVisitor final : public BugReporterVisitor {
   /// Holds the constraints in a given path.
   ConstraintMap Constraints;
   Z3Result &Result;
+  const AnalyzerOptions &Opts;
 };
 
 /// The oracle will decide if a report should be accepted or rejected based on
-/// the results of the Z3 solver.
+/// the results of the Z3 solver and the statistics of the queries of a report
+/// equivalenece class.
 class Z3CrosscheckOracle {
 public:
+  explicit Z3CrosscheckOracle(const AnalyzerOptions &Opts) : Opts(Opts) {}
+
   enum Z3Decision {
-    AcceptReport, // The report was SAT.
-    RejectReport, // The report was UNSAT or UNDEF.
+    AcceptReport,  // The report was SAT.
+    RejectReport,  // The report was UNSAT or UNDEF.
+    RejectEQClass, // The heuristic suggests to skip the current eqclass.
   };
 
-  /// Makes a decision for accepting or rejecting the report based on the
-  /// result of the corresponding Z3 query.
-  static Z3Decision
-  interpretQueryResult(const Z3CrosscheckVisitor::Z3Result &Query);
+  /// Updates the internal state with the new Z3Result and makes a decision how
+  /// to proceed:
+  /// - Accept the report if the Z3Result was SAT.
+  /// - Suggest dropping the report equvalence class based on the accumulated
+  ///   statistics.
+  /// - Otherwise, reject the report if the Z3Result was UNSAT or UNDEF.
+  ///
+  /// Conditions for dropping the equivalence class:
+  /// - Accumulative time spent in Z3 checks is more than 700ms in the eqclass.
+  /// - Hit the 300ms query timeout in the report eqclass.
+  /// - Hit the 400'000 rlimit in the report eqclass.
+  ///
+  /// All these thresholds are configurable via the analyzer options.
+  ///
+  /// Refer to
+  /// https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520 to
+  /// see why this heuristic was chosen.
+  Z3Decision interpretQueryResult(const Z3CrosscheckVisitor::Z3Result &Meta);
+
+private:
+  const AnalyzerOptions &Opts;
+  unsigned AccumulatedZ3QueryTimeInEqClass = 0; // ms
 };
 
 } // namespace clang::ento
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
index c9a7fd0e035c2..e2002bfbe594a 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
@@ -89,6 +89,9 @@ STATISTIC(MaxValidBugClassSize,
 
 STATISTIC(NumTimesReportPassesZ3, "Number of reports passed Z3");
 STATISTIC(NumTimesReportRefuted, "Number of reports refuted by Z3");
+STATISTIC(NumTimesReportEQClassAborted,
+          "Number of times a report equivalence class was aborted by the Z3 "
+          "oracle heuristic");
 STATISTIC(NumTimesReportEQClassWasExhausted,
           "Number of times all reports of an equivalence class was refuted");
 
@@ -2840,6 +2843,7 @@ generateVisitorsDiagnostics(PathSensitiveBugReport *R,
 std::optional<PathDiagnosticBuilder> PathDiagnosticBuilder::findValidReport(
     ArrayRef<PathSensitiveBugReport *> &bugReports,
     PathSensitiveBugReporter &Reporter) {
+  Z3CrosscheckOracle Z3Oracle(Reporter.getAnalyzerOptions());
 
   BugPathGetter BugGraph(&Reporter.getGraph(), bugReports);
 
@@ -2871,16 +2875,20 @@ std::optional<PathDiagnosticBuilder> PathDiagnosticBuilder::findValidReport(
         // visitor and check again
         R->clearVisitors();
         Z3CrosscheckVisitor::Z3Result CrosscheckResult;
-        R->addVisitor<Z3CrosscheckVisitor>(CrosscheckResult);
+        R->addVisitor<Z3CrosscheckVisitor>(CrosscheckResult,
+                                           Reporter.getAnalyzerOptions());
 
         // We don't overwrite the notes inserted by other visitors because the
         // refutation manager does not add any new note to the path
         generateVisitorsDiagnostics(R, BugPath->ErrorNode, BRC);
-        switch (Z3CrosscheckOracle::interpretQueryResult(CrosscheckResult)) {
+        switch (Z3Oracle.interpretQueryResult(CrosscheckResult)) {
         case Z3CrosscheckOracle::RejectReport:
           ++NumTimesReportRefuted;
           R->markInvalid("Infeasible constraints", /*Data=*/nullptr);
           continue;
+        case Z3CrosscheckOracle::RejectEQClass:
+          ++NumTimesReportEQClassAborted;
+          return {};
         case Z3CrosscheckOracle::AcceptReport:
           ++NumTimesReportPassesZ3;
           break;
diff --git a/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp b/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp
index a7db44ef8ea30..739db951b3e18 100644
--- a/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp
+++ b/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp
@@ -12,26 +12,37 @@
 //===----------------------------------------------------------------------===//
 
 #include "clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h"
+#include "clang/StaticAnalyzer/Core/AnalyzerOptions.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/SMTConv.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/Support/SMTAPI.h"
+#include "llvm/Support/Timer.h"
 
 #define DEBUG_TYPE "Z3CrosscheckOracle"
 
 STATISTIC(NumZ3QueriesDone, "Number of Z3 queries done");
 STATISTIC(NumTimesZ3TimedOut, "Number of times Z3 query timed out");
+STATISTIC(NumTimesZ3ExhaustedRLimit,
+          "Number of times Z3 query exhausted the rlimit");
+STATISTIC(NumTimesZ3SpendsTooMuchTimeOnASingleEQClass,
+          "Number of times report equivalenece class was cut because it spent "
+          "too much time in Z3");
 
 STATISTIC(NumTimesZ3QueryAcceptsReport,
           "Number of Z3 queries accepting a report");
 STATISTIC(NumTimesZ3QueryRejectReport,
           "Number of Z3 queries rejecting a report");
+STATISTIC(NumTimesZ3QueryRejectEQClass,
+          "Number of times rejecting an report equivalenece class");
 
 using namespace clang;
 using namespace ento;
 
-Z3CrosscheckVisitor::Z3CrosscheckVisitor(Z3CrosscheckVisitor::Z3Result &Result)
-    : Constraints(ConstraintMap::Factory().getEmptyMap()), Result(Result) {}
+Z3CrosscheckVisitor::Z3CrosscheckVisitor(Z3CrosscheckVisitor::Z3Result &Result,
+                                         const AnalyzerOptions &Opts)
+    : Constraints(ConstraintMap::Factory().getEmptyMap()), Result(Result),
+      Opts(Opts) {}
 
 void Z3CrosscheckVisitor::finalizeVisitor(BugReporterContext &BRC,
                                           const ExplodedNode *EndPathNode,
@@ -41,8 +52,12 @@ void Z3CrosscheckVisitor::finalizeVisitor(BugReporterContext &BRC,
 
   // Create a refutation manager
   llvm::SMTSolverRef RefutationSolver = llvm::CreateZ3Solver();
-  RefutationSolver->setBoolParam("model", true);        // Enable model finding
-  RefutationSolver->setUnsignedParam("timeout", 15000); // ms
+  if (Opts.Z3CrosscheckRLimitThreshold)
+    RefutationSolver->setUnsignedParam("rlimit",
+                                       Opts.Z3CrosscheckRLimitThreshold);
+  if (Opts.Z3CrosscheckTimeoutThreshold)
+    RefutationSolver->setUnsignedParam("timeout",
+                                       Opts.Z3CrosscheckTimeoutThreshold); // ms
 
   ASTContext &Ctx = BRC.getASTContext();
 
@@ -63,8 +78,15 @@ void Z3CrosscheckVisitor::finalizeVisitor(BugReporterContext &BRC,
   }
 
   // And check for satisfiability
+  llvm::TimeRecord Start = llvm::TimeRecord::getCurrentTime(/*Start=*/true);
   std::optional<bool> IsSAT = RefutationSolver->check();
-  Result = Z3Result{IsSAT};
+  llvm::TimeRecord Diff = llvm::TimeRecord::getCurrentTime(/*Start=*/false);
+  Diff -= Start;
+  Result = Z3Result{
+      IsSAT,
+      static_cast<unsigned>(Diff.getWallTime() * 1000),
+      RefutationSolver->getStatistics()->getUnsigned("rlimit count"),
+  };
 }
 
 void Z3CrosscheckVisitor::addConstraints(
@@ -101,18 +123,38 @@ void Z3CrosscheckVisitor::Profile(llvm::FoldingSetNodeID &ID) const {
 Z3CrosscheckOracle::Z3Decision Z3CrosscheckOracle::interpretQueryResult(
     const Z3CrosscheckVisitor::Z3Result &Query) {
   ++NumZ3QueriesDone;
+  AccumulatedZ3QueryTimeInEqClass += Query.Z3QueryTimeMilliseconds;
 
-  if (!Query.IsSAT.has_value()) {
-    // For backward compatibility, let's accept the first timeout.
-    ++NumTimesZ3TimedOut;
+  if (Query.IsSAT && Query.IsSAT.value()) {
+    ++NumTimesZ3QueryAcceptsReport;
     return AcceptReport;
   }
 
-  if (Query.IsSAT.value()) {
-    ++NumTimesZ3QueryAcceptsReport;
-    return AcceptReport; // sat
+  // Suggest cutting the EQClass if certain heuristics trigger.
+  if (Opts.Z3CrosscheckTimeoutThreshold &&
+      Query.Z3QueryTimeMilliseconds >= Opts.Z3CrosscheckTimeoutThreshold) {
+    ++NumTimesZ3TimedOut;
+    ++NumTimesZ3QueryRejectEQClass;
+    return RejectEQClass;
+  }
+
+  if (Opts.Z3CrosscheckRLimitThreshold &&
+      Query.UsedRLimit >= Opts.Z3CrosscheckRLimitThreshold) {
+    ++NumTimesZ3ExhaustedRLimit;
+    ++NumTimesZ3QueryRejectEQClass;
+    return RejectEQClass;
+  }
+
+  if (Opts.Z3CrosscheckEQClassTimeoutThreshold &&
+      AccumulatedZ3QueryTimeInEqClass >
+          Opts.Z3CrosscheckEQClassTimeoutThreshold) {
+    ++NumTimesZ3SpendsTooMuchTimeOnASingleEQClass;
+    ++NumTimesZ3QueryRejectEQClass;
+    return RejectEQClass;
   }
 
+  // If no cutoff heuristics trigger, and the report is "unsat" or "undef",
+  // then reject the report.
   ++NumTimesZ3QueryRejectReport;
-  return RejectReport; // unsat
+  return RejectReport;
 }
diff --git a/clang/test/Analysis/analyzer-config.c b/clang/test/Analysis/analyzer-config.c
index fda920fa3951e..2a4c40005a4dc 100644
--- a/clang/test/Analysis/analyzer-config.c
+++ b/clang/test/Analysis/analyzer-config.c
@@ -43,6 +43,9 @@
 // CHECK-NEXT: cplusplus.Move:WarnOn = KnownsAndLocals
 // CHECK-NEXT: cplusplus.SmartPtrModeling:ModelSmartPtrDereference = false
 // CHECK-NEXT: crosscheck-with-z3 = false
+// CHECK-NEXT: crosscheck-with-z3-eqclass-timeout-threshold = 700
+// CHECK-NEXT: crosscheck-with-z3-rlimit-threshold = 400000
+// CHECK-NEXT: crosscheck-with-z3-timeout-threshold = 300
 // CHECK-NEXT: ctu-dir = ""
 // CHECK-NEXT: ctu-import-cpp-threshold = 8
 // CHECK-NEXT: ctu-import-threshold = 24
diff --git a/clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp b/clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp
index efad4dd3f03b9..ef07e47ee911b 100644
--- a/clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp
+++ b/clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp
@@ -6,6 +6,7 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "clang/StaticAnalyzer/Core/AnalyzerOptions.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h"
 #include "gtest/gtest.h"
 
@@ -17,43 +18,126 @@ using Z3Decision = Z3CrosscheckOracle::Z3Decision;
 
 static constexpr Z3Decision AcceptReport = Z3Decision::AcceptReport;
 static constexpr Z3Decision RejectReport = Z3Decision::RejectReport;
+static constexpr Z3Decision RejectEQClass = Z3Decision::RejectEQClass;
 
 static constexpr std::optional<bool> SAT = true;
 static constexpr std::optional<bool> UNSAT = false;
 static constexpr std::optional<bool> UNDEF = std::nullopt;
 
+static unsigned operator""_ms(unsigned long long ms) { return ms; }
+static unsigned operator""_step(unsigned long long rlimit) { return rlimit; }
+
+static const AnalyzerOptions DefaultOpts = [] {
+  AnalyzerOptions Config;
+#define ANALYZER_OPTION_DEPENDS_ON_USER_MODE(TYPE, NAME, CMDFLAG, DESC,        \
+                                             SHALLOW_VAL, DEEP_VAL)            \
+  ANALYZER_OPTION(TYPE, NAME, CMDFLAG, DESC, DEEP_VAL)
+#define ANALYZER_OPTION(TYPE, NAME, CMDFLAG, DESC, DEFAULT_VAL)                \
+  Config.NAME = DEFAULT_VAL;
+#include "clang/StaticAnalyzer/Core/AnalyzerOptions.def"
+
+  // Remember to update the tests in this file when these values change.
+  // Also update the doc comment of `interpretQueryResult`.
+  assert(Config.Z3CrosscheckRLimitThreshold == 400'000);
+  assert(Config.Z3CrosscheckTimeoutThreshold == 300_ms);
+  // Usually, when the timeout/rlimit threshold is reached, Z3 only slightly
+  // overshoots until it realizes that it overshoot and needs to back off.
+  // Consequently, the measured timeout should be fairly close to the threshold.
+  // Same reasoning applies to the rlimit too.
+  return Config;
+}();
+
 namespace {
 
-struct Z3CrosscheckOracleTest : public testing::Test {
-  Z3Decision interpretQueryResult(const Z3Result &Result) const {
-    return Z3CrosscheckOracle::interpretQueryResult(Result);
+class Z3CrosscheckOracleTest : public testing::Test {
+public:
+  Z3Decision interpretQueryResult(const Z3Result &Result) {
+    return Oracle.interpretQueryResult(Result);
   }
+
+private:
+  Z3CrosscheckOracle Oracle = Z3CrosscheckOracle(DefaultOpts);
 };
 
 TEST_F(Z3CrosscheckOracleTest, AcceptsFirstSAT) {
-  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT}));
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 25_ms, 1000_step}));
 }
 
 TEST_F(Z3CrosscheckOracleTest, AcceptsSAT) {
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 25_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, SATWhenItGoesOverTime) {
+  // Even if it times out, if it is SAT, we should accept it.
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 310_ms, 1000_step}));
 }
 
-TEST_F(Z3CrosscheckOracleTest, AcceptsFirstTimeout) {
-  ASSERT_EQ(AcceptReport, interpretQueryResult({UNDEF}));
+TEST_F(Z3CrosscheckOracleTest, UNSATWhenItGoesOverTime) {
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNSAT, 310_ms, 1000_step}));
 }
 
-TEST_F(Z3CrosscheckOracleTest, AcceptsTimeout) {
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(AcceptReport, interpretQueryResult({UNDEF}));
+TEST_F(Z3CrosscheckOracleTest, RejectsTimeout) {
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNDEF, 310_ms, 1000_step}));
 }
 
 TEST_F(Z3CrosscheckOracleTest, RejectsUNSATs) {
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+}
+
+// Testing cut heuristics:
+// =======================
+
+TEST_F(Z3CrosscheckOracleTest, RejectEQClassIfSpendsTooMuchTotalTime) {
+  // Simulate long queries, that barely doesn't trigger the timeout.
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, SATWhenItSpendsTooMuchTotalTime) {
+  // Simulate long queries, that barely doesn't trigger the timeout.
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 290_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, RejectEQClassIfAttemptsManySmallQueries) {
+  // Simulate quick, but many queries: 35 quick UNSAT queries.
+  // 35*20ms = 700ms, which is equal to the 700ms threshold.
+  for (int i = 0; i < 35; ++i) {
+    ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 20_ms, 1000_step}));
+  }
+  // Do one more to trigger the heuristic.
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNSAT, 1_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, SATWhenIfAttemptsManySmallQueries) {
+  // Simulate quick, but many queries: 35 quick UNSAT queries.
+  // 35*20ms = 700ms, which is equal to the 700ms threshold.
+  for (int i = 0; i < 35; ++i) {
+    ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 20_ms, 1000_step}));
+  }
+  // Do one more to trigger the heuristic, but given this was SAT, we still
+  // accept the query.
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 200_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, RejectEQClassIfExhaustsRLimit) {
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNDEF, 25_ms, 405'000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, SATWhenItExhaustsRLimit) {
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 25_ms, 405'000_step}));
 }
 
 } // namespace

@steakhal
Copy link
Contributor Author

steakhal commented Jul 1, 2024

This is the continuation patch of #97265.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be re-landed as well; when it was merged earlier the only issue was the incompatibility with old Z3 and that was addressed since then (by bumping the version requirement).

@steakhal steakhal merged commit ae570d8 into llvm:main Jul 1, 2024
8 of 9 checks passed
@steakhal steakhal deleted the bb/reland-z3-safeguards branch July 1, 2024 15:22
lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
This is exactly as originally landed in llvm#95129,
but now the minimal Z3 version was increased to meet this change in llvm#96682.

https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4

---

This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be
bound by 300ms. Every report equivalence class will be processed in at
most 1 second.

The heuristic should have only really marginal observable impact -
except for the cases when we had big report eqclasses with long-running
(15s) Z3 queries, where previously CSA effectively halted. After this
patch, CSA will tackle such extreme cases as well.

(cherry picked from commit eacc3b3)
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
This is exactly as originally landed in llvm#95129,
but now the minimal Z3 version was increased to meet this change in llvm#96682.

https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4

---

This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be
bound by 300ms. Every report equivalence class will be processed in at
most 1 second.

The heuristic should have only really marginal observable impact -
except for the cases when we had big report eqclasses with long-running
(15s) Z3 queries, where previously CSA effectively halted. After this
patch, CSA will tackle such extreme cases as well.

(cherry picked from commit eacc3b3)
@NagyDonat
Copy link
Contributor

@steakhal As I was browsing the list of analyzer options, I was surprised to see that the numerical unsigned options added by this commit are placed in the "Boolean analyzer options" section of AnalyzerOptions.def.

I understand that it's natural to group them together with the parent option "should-crosscheck-with-z3", but I still think that it would be good to either move them to the "Unsigned analyzer options" section or perhaps reorganize the whole file to follow a more natural scheme, e.g. semantic relatedness between options or perhaps a "dumb, but obvious" alphabetical ordering.

@steakhal
Copy link
Contributor Author

@steakhal As I was browsing the list of analyzer options, I was surprised to see that the numerical unsigned options added by this commit are placed in the "Boolean analyzer options" section of AnalyzerOptions.def.
I understand that it's natural to group them together with the parent option "should-crosscheck-with-z3"

Intentionally done so, like you inferred.

[...], but I still think that it would be good to either move them to the "Unsigned analyzer options" section or perhaps reorganize the whole file to follow a more natural scheme, e.g. semantic relatedness between options or perhaps a "dumb, but obvious" alphabetical ordering.

I see little value of moving these around. I think this grouping by kind was arbitrary, and a bad choice of the past.
Alphabetical ordering usually works out because it's a fairly accepted design of having common options sharing a common prefix - which fits well with alphabetical ordering.
However, it only works if the options were named like that. And I'm pretty sure that's not the case we have here.
Renaming would be a really intrusive change, so I'd advise not pursuing. On the technical side, I frequently reach out to this file if I want to learn more about something, look up the relevant option, and hop to the commit where the feature was introduced.
Even though this workflow applies to all other files, I feel I'm doing this much more often here than with any other files; so this argument wouldn't hold for those, but may be something to consider in this specific case.

In any case, I'm weakly against. But I could be convinced otherwise, assuming that there is a "better" ordering (maybe alphabetical), or grouping for these options and we also have a way of enforcing it in CI.

@NagyDonat
Copy link
Contributor

Thanks for the explanation, I agree with your POV and I don't think that we need a drastic change like renaming or reordering everything.

However, in this case I think it would be good to remove the type based section headers, which are no longer accurate. When someone adds a new option, they can still "follow the visible trend" based on its type, but perhaps we'll se more options that are placed close to other related options (with a different type).

@steakhal
Copy link
Contributor Author

Thanks for the explanation, I agree with your POV and I don't think that we need a drastic change like renaming or reordering everything.

However, in this case I think it would be good to remove the type based section headers, which are no longer accurate. When someone adds a new option, they can still "follow the visible trend" based on its type, but perhaps we'll se more options that are placed close to other related options (with a different type).

+1, Let's drop those comments.

Szelethus added a commit to Szelethus/llvm-project that referenced this pull request Dec 2, 2024
Discussion here:
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520/15?u=szelethus

The original patch, llvm#97298 introduced new timeouts backed by thorough
testing and measurements to keep the running time of Z3 within
reasonable limits. The measurements also showed that only certain
reports and certain TUs were responsible for the poor performance of Z3
refutation.

Unfortunately, it seems like that on machines with different
characteristics (slower machines) the current timeouts don't just axe
0.01% of reports, but many more as well. Considering that timeouts are
inherently nondeterministic as a cutoff point, this lead reports sets
being vastly different on the same projects with the same configuration.
The discussion link shows that all configurations introduced in the
patch with their default values lead to severa nondeterminism of the
analyzer. As we, and others use the analyzer as a gating tool for PRs,
we should revert to the original defaults.

We should respect that
* There are still parts of the analyzer that are either proven or
  suspected to contain nondeterministic code (like pointer sets),
* A 15s timeout is more likely to hit the same reports every time on a
  wider range of machines, but is still inherently nondeterministic, but
  an infinite timeout leads to the tool hanging,
* If you measure the performance of the analyzer on your machines, you
  can and should achieve some speedup with little or no observable
  nondeterminism.
Szelethus added a commit that referenced this pull request Dec 13, 2024
Discussion here:

https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520/15?u=szelethus

The original patch, #97298 introduced new timeouts backed by thorough
testing and measurements to keep the running time of Z3 within
reasonable limits. The measurements also showed that only certain
reports and certain TUs were responsible for the poor performance of Z3
refutation.

Unfortunately, it seems like that on machines with different
characteristics (slower machines) the current timeouts don't just axe
0.01% of reports, but many more as well. Considering that timeouts are
inherently nondeterministic as a cutoff point, this lead reports sets
being vastly different on the same projects with the same configuration.
The discussion link shows that all configurations introduced in the
patch with their default values lead to severa nondeterminism of the
analyzer. As we, and others use the analyzer as a gating tool for PRs,
we should revert to the original defaults.

We should respect that
* There are still parts of the analyzer that are either proven or
suspected to contain nondeterministic code (like pointer sets),
* A 15s timeout is more likely to hit the same reports every time on a
wider range of machines, but is still inherently nondeterministic, but
an infinite timeout leads to the tool hanging,
* If you measure the performance of the analyzer on your machines, you
can and should achieve some speedup with little or no observable
nondeterminism.

---------

Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants