Skip to content

[analyzer] Detect leaks of stack addresses via output params, indirect globals 3/3 #105648

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 28, 2024

Conversation

necto
Copy link
Contributor

@necto necto commented Aug 22, 2024

Fix some false negatives of StackAddrEscapeChecker:

  • Output parameters
    void top(int **out) {
      int local = 42;
      *out = &local; // Noncompliant
    }
    
  • Indirect global pointers
    int **global;
    
    void top() {
      int local = 42;
      *global = &local; // Noncompliant
    }
    

Note that now StackAddrEscapeChecker produces a diagnostic if a function
with an output parameter is analyzed as top-level or as a callee. I took
special care to make sure the reports point to the same primary location
and, in many cases, feature the same primary message. That is the
motivation to modify Core/BugReporter.cpp and Core/ExplodedGraph.cpp

To avoid false positive reports when a global indirect pointer is
assigned a local address, invalidated, and then reset, I rely on the
fact that the invalidation symbol will be a DerivedSymbol of a
ConjuredSymbol that refers to the same memory region.

The checker still has a false negative for non-trivial escaping via a
returned value. It requires a more sophisticated traversal akin to
scanReachableSymbols, which out of the scope of this change.

CPP-4734


This is the last of the 3 stacked PRs, it must not be merged before #105652 and #105653

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Aug 22, 2024
@llvmbot
Copy link
Member

llvmbot commented Aug 22, 2024

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Arseniy Zaostrovnykh (necto)

Changes

Note, I prepared this PR to be rebased and merged with three commits that are self-sufficient and build on each other.

Fix some false negatives of StackAddrEscapeChecker:

  • Output parameters
    void top(int **out) {
      int local = 42;
      *out = &local; // Noncompliant
    }
    
  • Indirect global pointers
    int **global;
    
    void top() {
      int local = 42;
      *global = &local; // Noncompliant
    }
    

Note that now StackAddrEscapeChecker produces a diagnostic if a function with an output parameter is analyzed as top-level or as a callee. I took special care to make sure the reports point to the same primary location and, in many cases, feature the same primary message. That is the motivation to modify Core/BugReporter.cpp and Core/ExplodedGraph.cpp

To avoid false positive reports when a global indirect pointer is assigned a local address, invalidated, and then reset, I rely on the fact that the invalidation symbol will be a DerivedSymbol of a ConjuredSymbol that refers to the same memory region.

The checker still has a false negative for non-trivial escaping via a returned value. It requires an active

https://sonarsource.atlassian.net/browse/CPP-4734


Patch is 40.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/105648.diff

10 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp (+128-38)
  • (modified) clang/lib/StaticAnalyzer/Core/BugReporter.cpp (+12-1)
  • (modified) clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp (+1-1)
  • (modified) clang/test/Analysis/copy-elision.cpp (+10-10)
  • (modified) clang/test/Analysis/incorrect-checker-names.cpp (+2-2)
  • (modified) clang/test/Analysis/loop-block-counts.c (+1-1)
  • (modified) clang/test/Analysis/stack-addr-ps.c (+31)
  • (modified) clang/test/Analysis/stack-addr-ps.cpp (+608-1)
  • (modified) clang/test/Analysis/stack-capture-leak-no-arc.mm (+2-2)
  • (modified) clang/test/Analysis/stackaddrleak.c (+4-4)
diff --git a/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
index ea09c43cc5ce90..ef477a808fac17 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
@@ -288,12 +288,89 @@ void StackAddrEscapeChecker::checkPreStmt(const ReturnStmt *RS,
   EmitStackError(C, R, RetE);
 }
 
+static const MemSpaceRegion *getStackOrGlobalSpaceRegion(const MemRegion *R) {
+  assert(R);
+  if (const auto *MemSpace = R->getMemorySpace()) {
+    if (const auto *SSR = MemSpace->getAs<StackSpaceRegion>())
+      return SSR;
+    if (const auto *GSR = MemSpace->getAs<GlobalsSpaceRegion>())
+      return GSR;
+  }
+  // If R describes a lambda capture, it will be a symbolic region
+  // referring to a field region of another symbolic region.
+  if (const auto *SymReg = R->getBaseRegion()->getAs<SymbolicRegion>()) {
+    if (const auto *OriginReg = SymReg->getSymbol()->getOriginRegion())
+      return getStackOrGlobalSpaceRegion(OriginReg);
+  }
+  return nullptr;
+}
+
+const MemRegion *getOriginBaseRegion(const MemRegion *Referrer) {
+  Referrer = Referrer->getBaseRegion();
+  while (const auto *SymReg = dyn_cast<SymbolicRegion>(Referrer)) {
+    Referrer = SymReg->getSymbol()->getOriginRegion()->getBaseRegion();
+  }
+  return Referrer;
+}
+
+std::optional<std::string> printReferrer(const MemRegion *Referrer) {
+  assert(Referrer);
+  const StringRef ReferrerMemorySpace = [](const MemSpaceRegion *Space) {
+    if (isa<StaticGlobalSpaceRegion>(Space))
+      return "static";
+    if (isa<GlobalsSpaceRegion>(Space))
+      return "global";
+    // UnknownSpaceRegion usually refers to pointers-to-pointers,
+    // might be in top frame or not with pointers-to-pointers-to-pointers
+    assert((isa<UnknownSpaceRegion, StackSpaceRegion>(Space)));
+    // These two cases are deliberately combined to keep the message identical
+    // between the top-level and inlined analysis of the same function
+    return "caller";
+  }(getStackOrGlobalSpaceRegion(Referrer));
+
+  while (!Referrer->canPrintPretty()) {
+    if (const auto *SymReg = dyn_cast<SymbolicRegion>(Referrer)) {
+      Referrer = SymReg->getSymbol()->getOriginRegion()->getBaseRegion();
+    } else if (isa<CXXThisRegion>(Referrer)) {
+      // Skip members of a class, it is handled by
+      // warn_bind_ref_member_to_parameter_addr
+      return std::nullopt;
+    } else {
+      Referrer->dump();
+      assert(false && "Unexpected referrer region type.");
+      return std::nullopt;
+    }
+  }
+  assert(Referrer);
+  assert(Referrer->canPrintPretty());
+
+  std::string buf;
+  llvm::raw_string_ostream os(buf);
+  os << ReferrerMemorySpace << " variable ";
+  Referrer->printPretty(os);
+  return buf;
+}
+
 void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
                                               CheckerContext &Ctx) const {
   if (!ChecksEnabled[CK_StackAddrEscapeChecker])
     return;
 
-  ProgramStateRef State = Ctx.getState();
+  ExplodedNode *Node = Ctx.getPredecessor();
+
+  bool ExitingTopFrame =
+      Ctx.getPredecessor()->getLocationContext()->inTopFrame();
+
+  if (ExitingTopFrame && Node->getLocation().getTag() &&
+      Node->getLocation().getTag()->getTagDescription() ==
+          "ExprEngine : Clean Node" &&
+      Node->getFirstPred()) {
+    // When finishing analysis of a top-level function, engine proactively
+    // removes dead symbols thus preventing this checker from looking through
+    // the output parameters. Take 1 step back, to the node where these symbols
+    // and their bindings are still present
+    Node = Node->getFirstPred();
+  }
 
   // Iterate over all bindings to global variables and see if it contains
   // a memory region in the stack space.
@@ -301,40 +378,64 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
   private:
     CheckerContext &Ctx;
     const StackFrameContext *PoppedFrame;
+    const bool TopFrame;
 
     /// Look for stack variables referring to popped stack variables.
     /// Returns true only if it found some dangling stack variables
     /// referred by an other stack variable from different stack frame.
     bool checkForDanglingStackVariable(const MemRegion *Referrer,
                                        const MemRegion *Referred) {
-      const auto *ReferrerMemSpace =
-          Referrer->getMemorySpace()->getAs<StackSpaceRegion>();
+      const auto *ReferrerMemSpace = getStackOrGlobalSpaceRegion(Referrer);
       const auto *ReferredMemSpace =
           Referred->getMemorySpace()->getAs<StackSpaceRegion>();
 
       if (!ReferrerMemSpace || !ReferredMemSpace)
         return false;
 
-      const auto *ReferrerFrame = ReferrerMemSpace->getStackFrame();
-      const auto *ReferredFrame = ReferredMemSpace->getStackFrame();
+      const auto *ReferrerStackSpace =
+          ReferrerMemSpace->getAs<StackSpaceRegion>();
 
-      if (ReferrerMemSpace && ReferredMemSpace) {
-        if (ReferredFrame == PoppedFrame &&
-            ReferrerFrame->isParentOf(PoppedFrame)) {
-          V.emplace_back(Referrer, Referred);
-          return true;
-        }
+      if (!ReferrerStackSpace)
+        return false;
+
+      if (const auto *ReferredFrame = ReferredMemSpace->getStackFrame();
+          ReferredFrame != PoppedFrame) {
+        return false;
+      }
+
+      if (ReferrerStackSpace->getStackFrame()->isParentOf(PoppedFrame)) {
+        V.emplace_back(Referrer, Referred);
+        return true;
+      }
+      if (isa<StackArgumentsSpaceRegion>(ReferrerMemSpace) &&
+          ReferrerStackSpace->getStackFrame() == PoppedFrame && TopFrame) {
+        // Output parameter of a top-level function
+        V.emplace_back(Referrer, Referred);
+        return true;
       }
       return false;
     }
 
+    void updateInvalidatedRegions(const MemRegion *Region) {
+      if (const auto *SymReg = Region->getAs<SymbolicRegion>()) {
+        SymbolRef Symbol = SymReg->getSymbol();
+        if (const auto *DerS = dyn_cast<SymbolDerived>(Symbol);
+            DerS && isa_and_nonnull<SymbolConjured>(DerS->getParentSymbol())) {
+          InvalidatedRegions.insert(Symbol->getOriginRegion()->getBaseRegion());
+        }
+      }
+    }
+
   public:
     SmallVector<std::pair<const MemRegion *, const MemRegion *>, 10> V;
+    llvm::SmallPtrSet<const MemRegion *, 4> InvalidatedRegions;
 
-    CallBack(CheckerContext &CC) : Ctx(CC), PoppedFrame(CC.getStackFrame()) {}
+    CallBack(CheckerContext &CC, bool TopFrame)
+        : Ctx(CC), PoppedFrame(CC.getStackFrame()), TopFrame(TopFrame) {}
 
     bool HandleBinding(StoreManager &SMgr, Store S, const MemRegion *Region,
                        SVal Val) override {
+      updateInvalidatedRegions(Region);
       const MemRegion *VR = Val.getAsRegion();
       if (!VR)
         return true;
@@ -343,7 +444,8 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
         return true;
 
       // Check the globals for the same.
-      if (!isa<GlobalsSpaceRegion>(Region->getMemorySpace()))
+      if (!isa_and_nonnull<GlobalsSpaceRegion>(
+              getStackOrGlobalSpaceRegion(Region)))
         return true;
       if (VR && VR->hasStackStorage() && !isNotInCurrentFrame(VR, Ctx))
         V.emplace_back(Region, VR);
@@ -351,7 +453,8 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
     }
   };
 
-  CallBack Cb(Ctx);
+  CallBack Cb(Ctx, ExitingTopFrame);
+  ProgramStateRef State = Node->getState();
   State->getStateManager().getStoreManager().iterBindings(State->getStore(),
                                                           Cb);
 
@@ -359,28 +462,31 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
     return;
 
   // Generate an error node.
-  ExplodedNode *N = Ctx.generateNonFatalErrorNode(State);
+  ExplodedNode *N = Ctx.generateNonFatalErrorNode(State, Node);
   if (!N)
     return;
 
   if (!BT_stackleak)
     BT_stackleak =
         std::make_unique<BugType>(CheckNames[CK_StackAddrEscapeChecker],
-                                  "Stack address stored into global variable");
+                                  "Stack address leaks outside of stack frame");
 
   for (const auto &P : Cb.V) {
     const MemRegion *Referrer = P.first->getBaseRegion();
     const MemRegion *Referred = P.second;
+    if (Cb.InvalidatedRegions.contains(getOriginBaseRegion(Referrer))) {
+      continue;
+    }
 
     // Generate a report for this bug.
     const StringRef CommonSuffix =
-        "upon returning to the caller.  This will be a dangling reference";
+        " upon returning to the caller.  This will be a dangling reference";
     SmallString<128> Buf;
     llvm::raw_svector_ostream Out(Buf);
     const SourceRange Range = genName(Out, Referred, Ctx.getASTContext());
 
     if (isa<CXXTempObjectRegion, CXXLifetimeExtendedObjectRegion>(Referrer)) {
-      Out << " is still referred to by a temporary object on the stack "
+      Out << " is still referred to by a temporary object on the stack"
           << CommonSuffix;
       auto Report =
           std::make_unique<PathSensitiveBugReport>(*BT_stackleak, Out.str(), N);
@@ -390,28 +496,12 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
       return;
     }
 
-    const StringRef ReferrerMemorySpace = [](const MemSpaceRegion *Space) {
-      if (isa<StaticGlobalSpaceRegion>(Space))
-        return "static";
-      if (isa<GlobalsSpaceRegion>(Space))
-        return "global";
-      assert(isa<StackSpaceRegion>(Space));
-      return "stack";
-    }(Referrer->getMemorySpace());
-
-    // We should really only have VarRegions here.
-    // Anything else is really surprising, and we should get notified if such
-    // ever happens.
-    const auto *ReferrerVar = dyn_cast<VarRegion>(Referrer);
-    if (!ReferrerVar) {
-      assert(false && "We should have a VarRegion here");
-      continue; // Defensively skip this one.
+    auto ReferrerVariable = printReferrer(Referrer);
+    if (!ReferrerVariable) {
+      continue;
     }
-    const std::string ReferrerVarName =
-        ReferrerVar->getDecl()->getDeclName().getAsString();
 
-    Out << " is still referred to by the " << ReferrerMemorySpace
-        << " variable '" << ReferrerVarName << "' " << CommonSuffix;
+    Out << " is still referred to by the " << *ReferrerVariable << CommonSuffix;
     auto Report =
         std::make_unique<PathSensitiveBugReport>(*BT_stackleak, Out.str(), N);
     if (Range.isValid())
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
index d73dc40cf03fbb..ea916c3585cadc 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
@@ -2436,8 +2436,19 @@ PathSensitiveBugReport::getLocation() const {
     if (auto FE = P.getAs<FunctionExitPoint>()) {
       if (const ReturnStmt *RS = FE->getStmt())
         return PathDiagnosticLocation::createBegin(RS, SM, LC);
+
+      // If we are exiting a destructor call, it is more useful to point to the
+      // next stmt which is usually the temporary declaration.
+      // For non-destructor and non-top-level calls, the next stmt will still
+      // refer to the last executed stmt of the body.
+      S = ErrorNode->getNextStmtForDiagnostics();
+      // If next stmt is not found, it is likely the end of a top-level function
+      // analysis. find the last execution statement then.
+      if (!S)
+        S = ErrorNode->getPreviousStmtForDiagnostics();
     }
-    S = ErrorNode->getNextStmtForDiagnostics();
+    if (!S)
+      S = ErrorNode->getNextStmtForDiagnostics();
   }
 
   if (S) {
diff --git a/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp b/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
index f84da769d182f8..11cef00ada330b 100644
--- a/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
@@ -376,7 +376,7 @@ const Stmt *ExplodedNode::getNextStmtForDiagnostics() const {
 
 const Stmt *ExplodedNode::getPreviousStmtForDiagnostics() const {
   for (const ExplodedNode *N = getFirstPred(); N; N = N->getFirstPred())
-    if (const Stmt *S = N->getStmtForDiagnostics())
+    if (const Stmt *S = N->getStmtForDiagnostics(); S && !isa<CompoundStmt>(S))
       return S;
 
   return nullptr;
diff --git a/clang/test/Analysis/copy-elision.cpp b/clang/test/Analysis/copy-elision.cpp
index 991f325c05853d..423c4519f5bfc6 100644
--- a/clang/test/Analysis/copy-elision.cpp
+++ b/clang/test/Analysis/copy-elision.cpp
@@ -158,19 +158,19 @@ ClassWithoutDestructor make1(AddressVector<ClassWithoutDestructor> &v) {
   return ClassWithoutDestructor(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithoutDestructor' is still \
-referred to by the stack variable 'v' upon returning to the caller}}
+referred to by the caller variable 'v' upon returning to the caller}}
 }
 ClassWithoutDestructor make2(AddressVector<ClassWithoutDestructor> &v) {
   return make1(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithoutDestructor' is still \
-referred to by the stack variable 'v' upon returning to the caller}}
+referred to by the caller variable 'v' upon returning to the caller}}
 }
 ClassWithoutDestructor make3(AddressVector<ClassWithoutDestructor> &v) {
   return make2(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithoutDestructor' is still \
-referred to by the stack variable 'v' upon returning to the caller}}
+referred to by the caller variable 'v' upon returning to the caller}}
 }
 
 void testMultipleReturns() {
@@ -193,7 +193,7 @@ void testMultipleReturns() {
 void consume(ClassWithoutDestructor c) {
   c.push();
   // expected-warning@-1 {{Address of stack memory associated with local \
-variable 'c' is still referred to by the stack variable 'v' upon returning \
+variable 'c' is still referred to by the caller variable 'v' upon returning \
 to the caller}}
 }
 
@@ -267,7 +267,7 @@ struct TestCtorInitializer {
     : c(ClassWithDestructor(v)) {}
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithDestructor' is still referred \
-to by the stack variable 'v' upon returning to the caller}}
+to by the caller variable 'v' upon returning to the caller}}
 };
 
 void testCtorInitializer() {
@@ -303,19 +303,19 @@ ClassWithDestructor make1(AddressVector<ClassWithDestructor> &v) {
   return ClassWithDestructor(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithDestructor' is still referred \
-to by the stack variable 'v' upon returning to the caller}}
+to by the caller variable 'v' upon returning to the caller}}
 }
 ClassWithDestructor make2(AddressVector<ClassWithDestructor> &v) {
   return make1(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithDestructor' is still referred \
-to by the stack variable 'v' upon returning to the caller}}
+to by the caller variable 'v' upon returning to the caller}}
 }
 ClassWithDestructor make3(AddressVector<ClassWithDestructor> &v) {
   return make2(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithDestructor' is still referred \
-to by the stack variable 'v' upon returning to the caller}}
+to by the caller variable 'v' upon returning to the caller}}
 }
 
 void testMultipleReturnsWithDestructors() {
@@ -360,7 +360,7 @@ void testMultipleReturnsWithDestructors() {
 void consume(ClassWithDestructor c) {
   c.push();
   // expected-warning@-1 {{Address of stack memory associated with local \
-variable 'c' is still referred to by the stack variable 'v' upon returning \
+variable 'c' is still referred to by the caller variable 'v' upon returning \
 to the caller}}
 }
 
@@ -407,7 +407,7 @@ struct Foo {
 Foo make1(Foo **r) {
   return Foo(r);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
-object of type 'Foo' is still referred to by the stack \
+object of type 'Foo' is still referred to by the caller \
 variable 'z' upon returning to the caller}}
 }
 
diff --git a/clang/test/Analysis/incorrect-checker-names.cpp b/clang/test/Analysis/incorrect-checker-names.cpp
index 9854a503fc4f62..8dee0b6f468581 100644
--- a/clang/test/Analysis/incorrect-checker-names.cpp
+++ b/clang/test/Analysis/incorrect-checker-names.cpp
@@ -16,5 +16,5 @@ char const *p;
 void f0() {
   char const str[] = "This will change";
   p = str;
-} // expected-warning{{Address of stack memory associated with local variable 'str' is still referred to by the global variable 'p' upon returning to the caller.  This will be a dangling reference [core.StackAddressEscape]}}
-// expected-note@-1{{Address of stack memory associated with local variable 'str' is still referred to by the global variable 'p' upon returning to the caller.  This will be a dangling reference}}
+} // expected-warning@-1{{Address of stack memory associated with local variable 'str' is still referred to by the global variable 'p' upon returning to the caller.  This will be a dangling reference [core.StackAddressEscape]}}
+// expected-note@-2{{Address of stack memory associated with local variable 'str' is still referred to by the global variable 'p' upon returning to the caller.  This will be a dangling reference}}
diff --git a/clang/test/Analysis/loop-block-counts.c b/clang/test/Analysis/loop-block-counts.c
index 9af67b1b632f51..ae3705f8bd7ccc 100644
--- a/clang/test/Analysis/loop-block-counts.c
+++ b/clang/test/Analysis/loop-block-counts.c
@@ -6,7 +6,7 @@ void callee(void **p) {
   int x;
   *p = &x;
   // expected-warning@-1 {{Address of stack memory associated with local \
-variable 'x' is still referred to by the stack variable 'arr' upon \
+variable 'x' is still referred to by the caller variable 'arr' upon \
 returning to the caller}}
 }
 
diff --git a/clang/test/Analysis/stack-addr-ps.c b/clang/test/Analysis/stack-addr-ps.c
index e69ab4189b524f..138b8c16b02bde 100644
--- a/clang/test/Analysis/stack-addr-ps.c
+++ b/clang/test/Analysis/stack-addr-ps.c
@@ -95,3 +95,34 @@ void callTestRegister(void) {
     char buf[20];
     testRegister(buf); // no-warning
 }
+
+void top_level_leaking(int **out) {
+  int local = 42;
+  *out = &local; // expected-warning{{Address of stack memory associated with local variable 'local' is still referred to by the caller variable 'out'}}
+}
+
+void callee_leaking_via_param(int **out) {
+  int local = 1;
+  *out = &local;
+  // expected-warning@-1{{Address of stack memory associated with local variable 'local' is still referred to by the caller variable 'ptr'}}
+}
+
+void caller_for_leaking_callee() {
+  int *ptr = 0;
+  callee_leaking_via_param(&ptr);
+}
+
+void callee_nested_leaking(int **out) {
+  int local = 1;
+  *out = &local;
+  // expected-warning@-1{{Address of stack memory associated with local variable 'local' is still referred to by the caller variable 'ptr'}}
+}
+
+void caller_mid_for_nested_leaking(int **mid) {
+  callee_nested_leaking(mid);
+}
+
+void caller_for_nested_leaking() {
+  int *ptr = 0;
+  caller_mid_for_nested_leaking(&ptr);
+}
diff --git a/clang/test/Analysis/stack-addr-ps.cpp b/clang/test/Analysis/stack-addr-ps.cpp
index bd856be2b8d690..b98fc3beade761 100644
--- a/clang/test/Analysis/stack-addr-ps.cpp
+++ b/clang/test/Analysis/stack-addr-ps.cpp
@@ -140,7 +140,7 @@ void write_stack_address_to(char **q) {
   char local;
   *q = &local;
   // expected-warning@-1 {{Address of stack memory associated with local \
-variable 'local' is still referred to by the stack variable 'p' upon \
+variable 'local' is still referred to by the caller variable 'p' upon \
 returning to the caller}}
 }
 
@@ -161,3 +161,610 @@ C make1() {
 void test_copy_elision() {
   C c1 = make1();
 }
+
+namespace leaking_via_direct_pointer {
+void* returned_direct_pointer_top() {
+  int local = 42;
+  int* p = &local;
+  return p; // expected-warning{{associated with local variable 'local' returned}}
+}
+
+int* returned_direct_pointer_callee() {
+  int local = 42;
+  int* p = &local;
+  return p; // expected-warning{{associated with local variable 'local' returned}}
+}
+...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Aug 22, 2024

@llvm/pr-subscribers-clang

Author: Arseniy Zaostrovnykh (necto)

Changes

Note, I prepared this PR to be rebased and merged with three commits that are self-sufficient and build on each other.

Fix some false negatives of StackAddrEscapeChecker:

  • Output parameters
    void top(int **out) {
      int local = 42;
      *out = &amp;local; // Noncompliant
    }
    
  • Indirect global pointers
    int **global;
    
    void top() {
      int local = 42;
      *global = &amp;local; // Noncompliant
    }
    

Note that now StackAddrEscapeChecker produces a diagnostic if a function with an output parameter is analyzed as top-level or as a callee. I took special care to make sure the reports point to the same primary location and, in many cases, feature the same primary message. That is the motivation to modify Core/BugReporter.cpp and Core/ExplodedGraph.cpp

To avoid false positive reports when a global indirect pointer is assigned a local address, invalidated, and then reset, I rely on the fact that the invalidation symbol will be a DerivedSymbol of a ConjuredSymbol that refers to the same memory region.

The checker still has a false negative for non-trivial escaping via a returned value. It requires an active

https://sonarsource.atlassian.net/browse/CPP-4734


Patch is 40.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/105648.diff

10 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp (+128-38)
  • (modified) clang/lib/StaticAnalyzer/Core/BugReporter.cpp (+12-1)
  • (modified) clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp (+1-1)
  • (modified) clang/test/Analysis/copy-elision.cpp (+10-10)
  • (modified) clang/test/Analysis/incorrect-checker-names.cpp (+2-2)
  • (modified) clang/test/Analysis/loop-block-counts.c (+1-1)
  • (modified) clang/test/Analysis/stack-addr-ps.c (+31)
  • (modified) clang/test/Analysis/stack-addr-ps.cpp (+608-1)
  • (modified) clang/test/Analysis/stack-capture-leak-no-arc.mm (+2-2)
  • (modified) clang/test/Analysis/stackaddrleak.c (+4-4)
diff --git a/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
index ea09c43cc5ce90..ef477a808fac17 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StackAddrEscapeChecker.cpp
@@ -288,12 +288,89 @@ void StackAddrEscapeChecker::checkPreStmt(const ReturnStmt *RS,
   EmitStackError(C, R, RetE);
 }
 
+static const MemSpaceRegion *getStackOrGlobalSpaceRegion(const MemRegion *R) {
+  assert(R);
+  if (const auto *MemSpace = R->getMemorySpace()) {
+    if (const auto *SSR = MemSpace->getAs<StackSpaceRegion>())
+      return SSR;
+    if (const auto *GSR = MemSpace->getAs<GlobalsSpaceRegion>())
+      return GSR;
+  }
+  // If R describes a lambda capture, it will be a symbolic region
+  // referring to a field region of another symbolic region.
+  if (const auto *SymReg = R->getBaseRegion()->getAs<SymbolicRegion>()) {
+    if (const auto *OriginReg = SymReg->getSymbol()->getOriginRegion())
+      return getStackOrGlobalSpaceRegion(OriginReg);
+  }
+  return nullptr;
+}
+
+const MemRegion *getOriginBaseRegion(const MemRegion *Referrer) {
+  Referrer = Referrer->getBaseRegion();
+  while (const auto *SymReg = dyn_cast<SymbolicRegion>(Referrer)) {
+    Referrer = SymReg->getSymbol()->getOriginRegion()->getBaseRegion();
+  }
+  return Referrer;
+}
+
+std::optional<std::string> printReferrer(const MemRegion *Referrer) {
+  assert(Referrer);
+  const StringRef ReferrerMemorySpace = [](const MemSpaceRegion *Space) {
+    if (isa<StaticGlobalSpaceRegion>(Space))
+      return "static";
+    if (isa<GlobalsSpaceRegion>(Space))
+      return "global";
+    // UnknownSpaceRegion usually refers to pointers-to-pointers,
+    // might be in top frame or not with pointers-to-pointers-to-pointers
+    assert((isa<UnknownSpaceRegion, StackSpaceRegion>(Space)));
+    // These two cases are deliberately combined to keep the message identical
+    // between the top-level and inlined analysis of the same function
+    return "caller";
+  }(getStackOrGlobalSpaceRegion(Referrer));
+
+  while (!Referrer->canPrintPretty()) {
+    if (const auto *SymReg = dyn_cast<SymbolicRegion>(Referrer)) {
+      Referrer = SymReg->getSymbol()->getOriginRegion()->getBaseRegion();
+    } else if (isa<CXXThisRegion>(Referrer)) {
+      // Skip members of a class, it is handled by
+      // warn_bind_ref_member_to_parameter_addr
+      return std::nullopt;
+    } else {
+      Referrer->dump();
+      assert(false && "Unexpected referrer region type.");
+      return std::nullopt;
+    }
+  }
+  assert(Referrer);
+  assert(Referrer->canPrintPretty());
+
+  std::string buf;
+  llvm::raw_string_ostream os(buf);
+  os << ReferrerMemorySpace << " variable ";
+  Referrer->printPretty(os);
+  return buf;
+}
+
 void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
                                               CheckerContext &Ctx) const {
   if (!ChecksEnabled[CK_StackAddrEscapeChecker])
     return;
 
-  ProgramStateRef State = Ctx.getState();
+  ExplodedNode *Node = Ctx.getPredecessor();
+
+  bool ExitingTopFrame =
+      Ctx.getPredecessor()->getLocationContext()->inTopFrame();
+
+  if (ExitingTopFrame && Node->getLocation().getTag() &&
+      Node->getLocation().getTag()->getTagDescription() ==
+          "ExprEngine : Clean Node" &&
+      Node->getFirstPred()) {
+    // When finishing analysis of a top-level function, engine proactively
+    // removes dead symbols thus preventing this checker from looking through
+    // the output parameters. Take 1 step back, to the node where these symbols
+    // and their bindings are still present
+    Node = Node->getFirstPred();
+  }
 
   // Iterate over all bindings to global variables and see if it contains
   // a memory region in the stack space.
@@ -301,40 +378,64 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
   private:
     CheckerContext &Ctx;
     const StackFrameContext *PoppedFrame;
+    const bool TopFrame;
 
     /// Look for stack variables referring to popped stack variables.
     /// Returns true only if it found some dangling stack variables
     /// referred by an other stack variable from different stack frame.
     bool checkForDanglingStackVariable(const MemRegion *Referrer,
                                        const MemRegion *Referred) {
-      const auto *ReferrerMemSpace =
-          Referrer->getMemorySpace()->getAs<StackSpaceRegion>();
+      const auto *ReferrerMemSpace = getStackOrGlobalSpaceRegion(Referrer);
       const auto *ReferredMemSpace =
           Referred->getMemorySpace()->getAs<StackSpaceRegion>();
 
       if (!ReferrerMemSpace || !ReferredMemSpace)
         return false;
 
-      const auto *ReferrerFrame = ReferrerMemSpace->getStackFrame();
-      const auto *ReferredFrame = ReferredMemSpace->getStackFrame();
+      const auto *ReferrerStackSpace =
+          ReferrerMemSpace->getAs<StackSpaceRegion>();
 
-      if (ReferrerMemSpace && ReferredMemSpace) {
-        if (ReferredFrame == PoppedFrame &&
-            ReferrerFrame->isParentOf(PoppedFrame)) {
-          V.emplace_back(Referrer, Referred);
-          return true;
-        }
+      if (!ReferrerStackSpace)
+        return false;
+
+      if (const auto *ReferredFrame = ReferredMemSpace->getStackFrame();
+          ReferredFrame != PoppedFrame) {
+        return false;
+      }
+
+      if (ReferrerStackSpace->getStackFrame()->isParentOf(PoppedFrame)) {
+        V.emplace_back(Referrer, Referred);
+        return true;
+      }
+      if (isa<StackArgumentsSpaceRegion>(ReferrerMemSpace) &&
+          ReferrerStackSpace->getStackFrame() == PoppedFrame && TopFrame) {
+        // Output parameter of a top-level function
+        V.emplace_back(Referrer, Referred);
+        return true;
       }
       return false;
     }
 
+    void updateInvalidatedRegions(const MemRegion *Region) {
+      if (const auto *SymReg = Region->getAs<SymbolicRegion>()) {
+        SymbolRef Symbol = SymReg->getSymbol();
+        if (const auto *DerS = dyn_cast<SymbolDerived>(Symbol);
+            DerS && isa_and_nonnull<SymbolConjured>(DerS->getParentSymbol())) {
+          InvalidatedRegions.insert(Symbol->getOriginRegion()->getBaseRegion());
+        }
+      }
+    }
+
   public:
     SmallVector<std::pair<const MemRegion *, const MemRegion *>, 10> V;
+    llvm::SmallPtrSet<const MemRegion *, 4> InvalidatedRegions;
 
-    CallBack(CheckerContext &CC) : Ctx(CC), PoppedFrame(CC.getStackFrame()) {}
+    CallBack(CheckerContext &CC, bool TopFrame)
+        : Ctx(CC), PoppedFrame(CC.getStackFrame()), TopFrame(TopFrame) {}
 
     bool HandleBinding(StoreManager &SMgr, Store S, const MemRegion *Region,
                        SVal Val) override {
+      updateInvalidatedRegions(Region);
       const MemRegion *VR = Val.getAsRegion();
       if (!VR)
         return true;
@@ -343,7 +444,8 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
         return true;
 
       // Check the globals for the same.
-      if (!isa<GlobalsSpaceRegion>(Region->getMemorySpace()))
+      if (!isa_and_nonnull<GlobalsSpaceRegion>(
+              getStackOrGlobalSpaceRegion(Region)))
         return true;
       if (VR && VR->hasStackStorage() && !isNotInCurrentFrame(VR, Ctx))
         V.emplace_back(Region, VR);
@@ -351,7 +453,8 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
     }
   };
 
-  CallBack Cb(Ctx);
+  CallBack Cb(Ctx, ExitingTopFrame);
+  ProgramStateRef State = Node->getState();
   State->getStateManager().getStoreManager().iterBindings(State->getStore(),
                                                           Cb);
 
@@ -359,28 +462,31 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
     return;
 
   // Generate an error node.
-  ExplodedNode *N = Ctx.generateNonFatalErrorNode(State);
+  ExplodedNode *N = Ctx.generateNonFatalErrorNode(State, Node);
   if (!N)
     return;
 
   if (!BT_stackleak)
     BT_stackleak =
         std::make_unique<BugType>(CheckNames[CK_StackAddrEscapeChecker],
-                                  "Stack address stored into global variable");
+                                  "Stack address leaks outside of stack frame");
 
   for (const auto &P : Cb.V) {
     const MemRegion *Referrer = P.first->getBaseRegion();
     const MemRegion *Referred = P.second;
+    if (Cb.InvalidatedRegions.contains(getOriginBaseRegion(Referrer))) {
+      continue;
+    }
 
     // Generate a report for this bug.
     const StringRef CommonSuffix =
-        "upon returning to the caller.  This will be a dangling reference";
+        " upon returning to the caller.  This will be a dangling reference";
     SmallString<128> Buf;
     llvm::raw_svector_ostream Out(Buf);
     const SourceRange Range = genName(Out, Referred, Ctx.getASTContext());
 
     if (isa<CXXTempObjectRegion, CXXLifetimeExtendedObjectRegion>(Referrer)) {
-      Out << " is still referred to by a temporary object on the stack "
+      Out << " is still referred to by a temporary object on the stack"
           << CommonSuffix;
       auto Report =
           std::make_unique<PathSensitiveBugReport>(*BT_stackleak, Out.str(), N);
@@ -390,28 +496,12 @@ void StackAddrEscapeChecker::checkEndFunction(const ReturnStmt *RS,
       return;
     }
 
-    const StringRef ReferrerMemorySpace = [](const MemSpaceRegion *Space) {
-      if (isa<StaticGlobalSpaceRegion>(Space))
-        return "static";
-      if (isa<GlobalsSpaceRegion>(Space))
-        return "global";
-      assert(isa<StackSpaceRegion>(Space));
-      return "stack";
-    }(Referrer->getMemorySpace());
-
-    // We should really only have VarRegions here.
-    // Anything else is really surprising, and we should get notified if such
-    // ever happens.
-    const auto *ReferrerVar = dyn_cast<VarRegion>(Referrer);
-    if (!ReferrerVar) {
-      assert(false && "We should have a VarRegion here");
-      continue; // Defensively skip this one.
+    auto ReferrerVariable = printReferrer(Referrer);
+    if (!ReferrerVariable) {
+      continue;
     }
-    const std::string ReferrerVarName =
-        ReferrerVar->getDecl()->getDeclName().getAsString();
 
-    Out << " is still referred to by the " << ReferrerMemorySpace
-        << " variable '" << ReferrerVarName << "' " << CommonSuffix;
+    Out << " is still referred to by the " << *ReferrerVariable << CommonSuffix;
     auto Report =
         std::make_unique<PathSensitiveBugReport>(*BT_stackleak, Out.str(), N);
     if (Range.isValid())
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
index d73dc40cf03fbb..ea916c3585cadc 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
@@ -2436,8 +2436,19 @@ PathSensitiveBugReport::getLocation() const {
     if (auto FE = P.getAs<FunctionExitPoint>()) {
       if (const ReturnStmt *RS = FE->getStmt())
         return PathDiagnosticLocation::createBegin(RS, SM, LC);
+
+      // If we are exiting a destructor call, it is more useful to point to the
+      // next stmt which is usually the temporary declaration.
+      // For non-destructor and non-top-level calls, the next stmt will still
+      // refer to the last executed stmt of the body.
+      S = ErrorNode->getNextStmtForDiagnostics();
+      // If next stmt is not found, it is likely the end of a top-level function
+      // analysis. find the last execution statement then.
+      if (!S)
+        S = ErrorNode->getPreviousStmtForDiagnostics();
     }
-    S = ErrorNode->getNextStmtForDiagnostics();
+    if (!S)
+      S = ErrorNode->getNextStmtForDiagnostics();
   }
 
   if (S) {
diff --git a/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp b/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
index f84da769d182f8..11cef00ada330b 100644
--- a/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExplodedGraph.cpp
@@ -376,7 +376,7 @@ const Stmt *ExplodedNode::getNextStmtForDiagnostics() const {
 
 const Stmt *ExplodedNode::getPreviousStmtForDiagnostics() const {
   for (const ExplodedNode *N = getFirstPred(); N; N = N->getFirstPred())
-    if (const Stmt *S = N->getStmtForDiagnostics())
+    if (const Stmt *S = N->getStmtForDiagnostics(); S && !isa<CompoundStmt>(S))
       return S;
 
   return nullptr;
diff --git a/clang/test/Analysis/copy-elision.cpp b/clang/test/Analysis/copy-elision.cpp
index 991f325c05853d..423c4519f5bfc6 100644
--- a/clang/test/Analysis/copy-elision.cpp
+++ b/clang/test/Analysis/copy-elision.cpp
@@ -158,19 +158,19 @@ ClassWithoutDestructor make1(AddressVector<ClassWithoutDestructor> &v) {
   return ClassWithoutDestructor(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithoutDestructor' is still \
-referred to by the stack variable 'v' upon returning to the caller}}
+referred to by the caller variable 'v' upon returning to the caller}}
 }
 ClassWithoutDestructor make2(AddressVector<ClassWithoutDestructor> &v) {
   return make1(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithoutDestructor' is still \
-referred to by the stack variable 'v' upon returning to the caller}}
+referred to by the caller variable 'v' upon returning to the caller}}
 }
 ClassWithoutDestructor make3(AddressVector<ClassWithoutDestructor> &v) {
   return make2(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithoutDestructor' is still \
-referred to by the stack variable 'v' upon returning to the caller}}
+referred to by the caller variable 'v' upon returning to the caller}}
 }
 
 void testMultipleReturns() {
@@ -193,7 +193,7 @@ void testMultipleReturns() {
 void consume(ClassWithoutDestructor c) {
   c.push();
   // expected-warning@-1 {{Address of stack memory associated with local \
-variable 'c' is still referred to by the stack variable 'v' upon returning \
+variable 'c' is still referred to by the caller variable 'v' upon returning \
 to the caller}}
 }
 
@@ -267,7 +267,7 @@ struct TestCtorInitializer {
     : c(ClassWithDestructor(v)) {}
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithDestructor' is still referred \
-to by the stack variable 'v' upon returning to the caller}}
+to by the caller variable 'v' upon returning to the caller}}
 };
 
 void testCtorInitializer() {
@@ -303,19 +303,19 @@ ClassWithDestructor make1(AddressVector<ClassWithDestructor> &v) {
   return ClassWithDestructor(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithDestructor' is still referred \
-to by the stack variable 'v' upon returning to the caller}}
+to by the caller variable 'v' upon returning to the caller}}
 }
 ClassWithDestructor make2(AddressVector<ClassWithDestructor> &v) {
   return make1(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithDestructor' is still referred \
-to by the stack variable 'v' upon returning to the caller}}
+to by the caller variable 'v' upon returning to the caller}}
 }
 ClassWithDestructor make3(AddressVector<ClassWithDestructor> &v) {
   return make2(v);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
 object of type 'ClassWithDestructor' is still referred \
-to by the stack variable 'v' upon returning to the caller}}
+to by the caller variable 'v' upon returning to the caller}}
 }
 
 void testMultipleReturnsWithDestructors() {
@@ -360,7 +360,7 @@ void testMultipleReturnsWithDestructors() {
 void consume(ClassWithDestructor c) {
   c.push();
   // expected-warning@-1 {{Address of stack memory associated with local \
-variable 'c' is still referred to by the stack variable 'v' upon returning \
+variable 'c' is still referred to by the caller variable 'v' upon returning \
 to the caller}}
 }
 
@@ -407,7 +407,7 @@ struct Foo {
 Foo make1(Foo **r) {
   return Foo(r);
   // no-elide-warning@-1 {{Address of stack memory associated with temporary \
-object of type 'Foo' is still referred to by the stack \
+object of type 'Foo' is still referred to by the caller \
 variable 'z' upon returning to the caller}}
 }
 
diff --git a/clang/test/Analysis/incorrect-checker-names.cpp b/clang/test/Analysis/incorrect-checker-names.cpp
index 9854a503fc4f62..8dee0b6f468581 100644
--- a/clang/test/Analysis/incorrect-checker-names.cpp
+++ b/clang/test/Analysis/incorrect-checker-names.cpp
@@ -16,5 +16,5 @@ char const *p;
 void f0() {
   char const str[] = "This will change";
   p = str;
-} // expected-warning{{Address of stack memory associated with local variable 'str' is still referred to by the global variable 'p' upon returning to the caller.  This will be a dangling reference [core.StackAddressEscape]}}
-// expected-note@-1{{Address of stack memory associated with local variable 'str' is still referred to by the global variable 'p' upon returning to the caller.  This will be a dangling reference}}
+} // expected-warning@-1{{Address of stack memory associated with local variable 'str' is still referred to by the global variable 'p' upon returning to the caller.  This will be a dangling reference [core.StackAddressEscape]}}
+// expected-note@-2{{Address of stack memory associated with local variable 'str' is still referred to by the global variable 'p' upon returning to the caller.  This will be a dangling reference}}
diff --git a/clang/test/Analysis/loop-block-counts.c b/clang/test/Analysis/loop-block-counts.c
index 9af67b1b632f51..ae3705f8bd7ccc 100644
--- a/clang/test/Analysis/loop-block-counts.c
+++ b/clang/test/Analysis/loop-block-counts.c
@@ -6,7 +6,7 @@ void callee(void **p) {
   int x;
   *p = &x;
   // expected-warning@-1 {{Address of stack memory associated with local \
-variable 'x' is still referred to by the stack variable 'arr' upon \
+variable 'x' is still referred to by the caller variable 'arr' upon \
 returning to the caller}}
 }
 
diff --git a/clang/test/Analysis/stack-addr-ps.c b/clang/test/Analysis/stack-addr-ps.c
index e69ab4189b524f..138b8c16b02bde 100644
--- a/clang/test/Analysis/stack-addr-ps.c
+++ b/clang/test/Analysis/stack-addr-ps.c
@@ -95,3 +95,34 @@ void callTestRegister(void) {
     char buf[20];
     testRegister(buf); // no-warning
 }
+
+void top_level_leaking(int **out) {
+  int local = 42;
+  *out = &local; // expected-warning{{Address of stack memory associated with local variable 'local' is still referred to by the caller variable 'out'}}
+}
+
+void callee_leaking_via_param(int **out) {
+  int local = 1;
+  *out = &local;
+  // expected-warning@-1{{Address of stack memory associated with local variable 'local' is still referred to by the caller variable 'ptr'}}
+}
+
+void caller_for_leaking_callee() {
+  int *ptr = 0;
+  callee_leaking_via_param(&ptr);
+}
+
+void callee_nested_leaking(int **out) {
+  int local = 1;
+  *out = &local;
+  // expected-warning@-1{{Address of stack memory associated with local variable 'local' is still referred to by the caller variable 'ptr'}}
+}
+
+void caller_mid_for_nested_leaking(int **mid) {
+  callee_nested_leaking(mid);
+}
+
+void caller_for_nested_leaking() {
+  int *ptr = 0;
+  caller_mid_for_nested_leaking(&ptr);
+}
diff --git a/clang/test/Analysis/stack-addr-ps.cpp b/clang/test/Analysis/stack-addr-ps.cpp
index bd856be2b8d690..b98fc3beade761 100644
--- a/clang/test/Analysis/stack-addr-ps.cpp
+++ b/clang/test/Analysis/stack-addr-ps.cpp
@@ -140,7 +140,7 @@ void write_stack_address_to(char **q) {
   char local;
   *q = &local;
   // expected-warning@-1 {{Address of stack memory associated with local \
-variable 'local' is still referred to by the stack variable 'p' upon \
+variable 'local' is still referred to by the caller variable 'p' upon \
 returning to the caller}}
 }
 
@@ -161,3 +161,610 @@ C make1() {
 void test_copy_elision() {
   C c1 = make1();
 }
+
+namespace leaking_via_direct_pointer {
+void* returned_direct_pointer_top() {
+  int local = 42;
+  int* p = &local;
+  return p; // expected-warning{{associated with local variable 'local' returned}}
+}
+
+int* returned_direct_pointer_callee() {
+  int local = 42;
+  int* p = &local;
+  return p; // expected-warning{{associated with local variable 'local' returned}}
+}
+...
[truncated]

@steakhal
Copy link
Contributor

Review commit by commit.

@necto necto marked this pull request as draft August 22, 2024 12:43
@necto
Copy link
Contributor Author

necto commented Aug 22, 2024

It turns out that you cannot rebase&merge in llvm-project repo, so I'll create two more PRs stacked PRs - one per commit

@necto necto force-pushed the az/CPP-4734-stack-leak-out-param branch from 6635877 to daec8e8 Compare August 22, 2024 12:46
@steakhal steakhal changed the title [analyzer] Detect leaks of stack addresses via output params, indirect globals [analyzer] Detect leaks of stack addresses via output params, indirect globals 3/3 Aug 22, 2024
@steakhal
Copy link
Contributor

I already reviewed these three. LGTM

@necto
Copy link
Contributor Author

necto commented Aug 22, 2024

It turns out that you cannot rebase&merge in llvm-project repo, so I'll create two more PRs stacked PRs - one per commit

Here are the two PRs that promote the first two commits of this branch: #105652 and #105653

@NagyDonat
Copy link
Contributor

Please ping me when this commit is in a clean state that can be reviewed (e.g. updates on earlier commits are incorporated). Thanks!

steakhal pushed a commit that referenced this pull request Aug 26, 2024
…#105652)

These tests and refactoring are preparatory for the upcoming changes:
detection of the indirect leak via global variables and output
parameters.

CPP-4734

-------

This is the first of three commits constituting
#105648
@necto necto force-pushed the az/CPP-4734-stack-leak-out-param branch from 4081a03 to 153ad98 Compare August 26, 2024 12:19
steakhal pushed a commit that referenced this pull request Aug 26, 2024
#105653)

At this point, only functions called from other functions (i.e., not
top-level) are covered. Top-level functions have a different exit
sequence and will be handled by a subsequent change.

CPP-4734

-------

This is the second of three commits constituting
#105648
it must not be merged before
#105652
Fix some false negatives of StackAddrEscapeChecker:
- Output parameters
  ```
  void top(int **out) {
    int local = 42;
    *out = &local; // Noncompliant
  }
  ```
- Indirect global pointers
  ```
  int **global;

  void top() {
    int local = 42;
    *global = &local; // Noncompliant
  }
  ```

Note that now StackAddrEscapeChecker produces a diagnostic if a function
with an output parameter is analyzed as top-level or as a callee. I took
special care to make sure the reports point to the same primary location
and, in many cases, feature the same primary message. That is the
motivation to modify Core/BugReporter.cpp and Core/ExplodedGraph.cpp

To avoid false positive reports when a global indirect pointer is
assigned a local address, invalidated, and then reset, I rely on the
fact that the invalidation symbol will be a DerivedSymbol of a
ConjuredSymbol that refers to the same memory region.

The checker still has a false negative for non-trivial escaping via a
returned value. It requires a more sophisticated traversal akin to
scanReachableSymbols, which out of the scope of this change.

CPP-4734
@necto necto force-pushed the az/CPP-4734-stack-leak-out-param branch from 153ad98 to 991f176 Compare August 26, 2024 12:41
@necto necto marked this pull request as ready for review August 26, 2024 12:42
@necto
Copy link
Contributor Author

necto commented Aug 26, 2024

Please ping me when this commit is in a clean state that can be reviewed (e.g. updates on earlier commits are incorporated). Thanks!

@NagyDonat , the earlier commits are now merged and I rebased this PR. Feel free to have a look

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the commit once and I have a serious question about using getOriginRegion in the invalidation logic (see inline comment for details). Apart from that, the code LGTM (I might return later with a few additional comments from a more through review, but I don't except anything important).

@necto necto requested a review from NagyDonat August 27, 2024 07:15
Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, the change LGTM now.

Copy link

github-actions bot commented Aug 27, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates!

@steakhal steakhal merged commit 190449a into llvm:main Aug 28, 2024
8 checks passed
@mikaelholmen
Copy link
Collaborator

Hello,

The following starts crashing with this patch:

clang -cc1 -analyze -analyzer-checker=core bbi-98571.c

Result:

bbi-98571.c:2:14: warning: expected ';' at end of declaration list
    2 |   struct a *b
      |              ^
      |              ;
bbi-98571.c:5:8: warning: expected ';' at end of declaration list
    5 |   int d
      |        ^
      |        ;
bbi-98571.c:11:4: warning: passing arguments to 'f' without a prototype is deprecated in all versions of C and is not supported in C23 [-Wdeprecated-non-prototype]
   11 |   f(h);
      |    ^
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: build-all/bin/clang -cc1 -analyze -analyzer-checker=core bbi-98571.c
1.	<eof> parser at end of file
2.	While analyzing stack: 
	#0 Calling g
 #0 0x0000555c2d234d97 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build-all/bin/clang+0x7f7cd97)
 #1 0x0000555c2d2328fe llvm::sys::RunSignalHandlers() (build-all/bin/clang+0x7f7a8fe)
 #2 0x0000555c2d23545f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f49e8845cf0 __restore_rt (/lib64/libpthread.so.0+0x12cf0)
 #4 0x0000555c2f3124a8 clang::ento::MemRegion::getBaseRegion() const (build-all/bin/clang+0xa05a4a8)
 #5 0x0000555c2f0f5d82 (anonymous namespace)::StackAddrEscapeChecker::checkEndFunction(clang::ReturnStmt const*, clang::ento::CheckerContext&) const::CallBack::HandleBinding(clang::ento::StoreManager&, void const*, clang::ento::MemRegion const*, clang::ento::SVal) StackAddrEscapeChecker.cpp:0:0
 #6 0x0000555c2f351937 (anonymous namespace)::RegionStoreManager::iterBindings(void const*, clang::ento::StoreManager::BindingsHandler&) RegionStore.cpp:0:0
 #7 0x0000555c2f0f51c5 void clang::ento::check::EndFunction::_checkEndFunction<(anonymous namespace)::StackAddrEscapeChecker>(void*, clang::ReturnStmt const*, clang::ento::CheckerContext&) StackAddrEscapeChecker.cpp:0:0
 #8 0x0000555c2f29ee6a clang::ento::CheckerManager::runCheckersForEndFunction(clang::ento::NodeBuilderContext&, clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNode*, clang::ento::ExprEngine&, clang::ReturnStmt const*) (build-all/bin/clang+0x9fe6e6a)
 #9 0x0000555c2f2d3c96 clang::ento::ExprEngine::processEndOfFunction(clang::ento::NodeBuilderContext&, clang::ento::ExplodedNode*, clang::ReturnStmt const*) (build-all/bin/clang+0xa01bc96)
#10 0x0000555c2f2a8e63 clang::ento::CoreEngine::HandleBlockEdge(clang::BlockEdge const&, clang::ento::ExplodedNode*) (build-all/bin/clang+0x9ff0e63)
#11 0x0000555c2f2a870a clang::ento::CoreEngine::dispatchWorkItem(clang::ento::ExplodedNode*, clang::ProgramPoint, clang::ento::WorkListUnit const&) (build-all/bin/clang+0x9ff070a)
#12 0x0000555c2f2a801a clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) (build-all/bin/clang+0x9ff001a)
#13 0x0000555c2ee23955 (anonymous namespace)::AnalysisConsumer::HandleCode(clang::Decl*, unsigned int, clang::ento::ExprEngine::InliningModes, llvm::DenseSet<clang::Decl const*, llvm::DenseMapInfo<clang::Decl const*, void>>*) AnalysisConsumer.cpp:0:0
#14 0x0000555c2edfc78b (anonymous namespace)::AnalysisConsumer::HandleTranslationUnit(clang::ASTContext&) AnalysisConsumer.cpp:0:0
#15 0x0000555c2f3a7ea7 clang::ParseAST(clang::Sema&, bool, bool) (build-all/bin/clang+0xa0efea7)
#16 0x0000555c2def1e80 clang::FrontendAction::Execute() (build-all/bin/clang+0x8c39e80)
#17 0x0000555c2de5e5bf clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (build-all/bin/clang+0x8ba65bf)
#18 0x0000555c2dfdfa8e clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (build-all/bin/clang+0x8d27a8e)
#19 0x0000555c2aaa0576 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (build-all/bin/clang+0x57e8576)
#20 0x0000555c2aa9cd1d ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#21 0x0000555c2aa9ba64 clang_main(int, char**, llvm::ToolContext const&) (build-all/bin/clang+0x57e3a64)
#22 0x0000555c2aaad347 main (build-all/bin/clang+0x57f5347)
#23 0x00007f49e63ead85 __libc_start_main (/lib64/libc.so.6+0x3ad85)
#24 0x0000555c2aa9a62e _start (build-all/bin/clang+0x57e262e)
Segmentation fault (core dumped)

bbi-98571.c.gz

@necto necto deleted the az/CPP-4734-stack-leak-out-param branch August 29, 2024 14:23
necto added a commit to necto/llvm-project that referenced this pull request Aug 29, 2024
As reported in llvm#105648 (comment)
commit 08ad8dc7154bf3ab79f750e6d5fb7df597c7601a
introduced a nullptr dereference in the case when store contains a binding
to a symbol that has no origin region associated with it,
such as the symbol generated when a pointer is passed to an opaque function.
@necto
Copy link
Contributor Author

necto commented Aug 29, 2024

Hello,

The following starts crashing with this patch:

clang -cc1 -analyze -analyzer-checker=core bbi-98571.c

Result:

(...)

Thank you for the report!
Here is the fix: #106568

steakhal pushed a commit that referenced this pull request Aug 29, 2024
…ion (#106568)

As reported in
#105648 (comment)
commit 08ad8dc7154bf3ab79f750e6d5fb7df597c7601a
introduced a nullptr dereference in the case when store contains a
binding to a symbol that has no origin region associated with it, such
as the symbol generated when a pointer is passed to an opaque function.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants