Skip to content

[clang] Fix the post-filtering heuristic for GSLPointer. #114044

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Dec 12, 2024

Conversation

hokein
Copy link
Collaborator

@hokein hokein commented Oct 29, 2024

The lifetime analyzer processes GSL pointers:

  • when encountering a constructor for a gsl::pointer, the analyzer continues traversing the constructor argument, regardless of whether the parameter has a lifetimebound annotation. This aims to catch cases where a GSL pointer is constructed from a GSL owner, either directly (e.g., FooPointer(FooOwner)) or through a chain of GSL pointers (e.g., FooPointer(FooPointer(FooOwner)));
  • When a temporary object is reported in the callback, the analyzer has heuristics to exclude non-owner types, aiming to avoid false positives (like FooPointer(FooPointer())).

In the problematic case (discovered in #112751 (comment)) of return foo.get();:

  • When the analyzer reports the local object foo, the Path is [GslPointerInit, Lifetimebound].
  • The Path goes through pathOnlyHandlesGslPointer and isn’t filtered out by the [[heuristics]](because foo is an owner type), the analyzer treats it as the FooPointer(FooOwner()) scenario, thus triggering a diagnostic.

Filtering out base on the object 'foo' is wrong, because the GSLPointer is constructed from the return result of the foo.get(). The patch fixes this by teaching the heuristic to use the return result (only const GSLOwner& is considered) of the lifetimebound annotated function.

@Xazax-hun
Copy link
Collaborator

Let's copy the relevant part from the other PR into the description of this one to make it more self-contained.

@hokein hokein force-pushed the no-lb-interleave-in-gsl branch from 32d808a to fcd9636 Compare November 4, 2024 08:17
@hokein hokein changed the title [clang] Don't consider the lifetimeboundCall when analyzing the gsl pointer construction. [clang] Fix the post-filtering heuristic for GSLPointer. Nov 4, 2024
Copy link

github-actions bot commented Nov 4, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@hokein
Copy link
Collaborator Author

hokein commented Nov 4, 2024

Figured out a way to fix the false positives while not introducing many false negatives. I think it is ready for review, and please take a look on the new version.

@hokein hokein marked this pull request as ready for review November 4, 2024 08:20
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Nov 4, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 4, 2024

@llvm/pr-subscribers-clang

Author: Haojian Wu (hokein)

Changes

The lifetime analyzer processes GSL pointers:

  • when encountering a constructor for a gsl::pointer, the analyzer continues traversing the constructor argument, regardless of whether the parameter has a lifetimebound annotation. This aims to catch cases where a GSL pointer is constructed from a GSL owner, either directly (e.g., FooPointer(FooOwner)) or through a chain of GSL pointers (e.g., FooPointer(FooPointer(FooOwner)));
  • When a temporary object is reported in the callback, the analyzer has heuristics to exclude non-owner types, aiming to avoid false positives (like FooPointer(FooPointer())).

In the problematic case of return foo.get();:

  • When the analyzer reports the local object foo, the Path is [GslPointerInit, Lifetimebound].
  • The Path goes through pathOnlyHandlesGslPointer and isn’t filtered out by the [[heuristics]](because foo is an owner type), the analyzer treats it as the FooPointer(FooOwner()) scenario, thus triggering a diagnostic.

Filtering out base on the object 'foo' is wrong, because the GSLPointer is constructed from the return result of the foo.get(). The patch fixes this by teaching the heuristic to use the return result (only const GSLOwner& is considered) of the lifetimebound annotated function.


Full diff: https://github.com/llvm/llvm-project/pull/114044.diff

3 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/lib/Sema/CheckExprLifetime.cpp (+92-21)
  • (modified) clang/test/Sema/warn-lifetime-analysis-nocfg.cpp (+45-3)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 1372e49dfac03c..d3f346eb71e951 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -464,6 +464,8 @@ Improvements to Clang's diagnostics
 
 - Clang now diagnoses ``[[deprecated]]`` attribute usage on local variables (#GH90073).
 
+- Fix false positives when `[[gsl::Owner/Pointer]]` and `[[clang::lifetimebound]]` are used together.
+
 Improvements to Clang's time-trace
 ----------------------------------
 
diff --git a/clang/lib/Sema/CheckExprLifetime.cpp b/clang/lib/Sema/CheckExprLifetime.cpp
index a1a402b4a2b530..d1e8cc9f9b075c 100644
--- a/clang/lib/Sema/CheckExprLifetime.cpp
+++ b/clang/lib/Sema/CheckExprLifetime.cpp
@@ -1093,6 +1093,87 @@ static bool pathOnlyHandlesGslPointer(const IndirectLocalPath &Path) {
   }
   return false;
 }
+// Result of analyzing the Path for GSLPointer.
+enum AnalysisResult {
+  // Path does not correspond to a GSLPointer.
+  NotGSLPointer,
+
+  // A relevant case was identified.
+  Report,
+  // Stop the entire traversal.
+  Abandon,
+  // Skip this step and continue traversing inner AST nodes.
+  Skip,
+};
+// Analyze cases where a GSLPointer is initialized or assigned from a
+// temporary owner object.
+static AnalysisResult analyzePathForGSLPointer(const IndirectLocalPath &Path,
+                                               Local L) {
+  if (!pathOnlyHandlesGslPointer(Path))
+    return NotGSLPointer;
+
+  // At this point, Path represents a series of operations involving a
+  // GSLPointer, either in the process of initialization or assignment.
+
+  // Note: A LifetimeBoundCall can appear interleaved in this sequence.
+  // For example:
+  //    const std::string& Ref(const std::string& a [[clang::lifetimebound]]);
+  //    string_view abc = Ref(std::string());
+  // The "Path" is [GSLPointerInit, LifetimeboundCall], where "L" is the
+  // temporary "std::string()" object. We need to check if the function with the
+  // lifetimebound attribute returns a "owner" type.
+  if (Path.back().Kind == IndirectLocalPathEntry::LifetimeBoundCall) {
+    // The lifetimebound applies to the implicit object parameter of a method.
+    if (const auto *Method = llvm::dyn_cast<CXXMethodDecl>(Path.back().D)) {
+      if (Method->getReturnType()->isReferenceType() &&
+          isRecordWithAttr<OwnerAttr>(
+              Method->getReturnType()->getPointeeType()))
+        return Report;
+      return Abandon;
+    }
+    // The lifetimebound applies to a function parameter.
+    const auto *PD = llvm::dyn_cast<ParmVarDecl>(Path.back().D);
+    if (const auto *FD = llvm::dyn_cast<FunctionDecl>(PD->getDeclContext())) {
+      if (isa<CXXConstructorDecl>(FD)) {
+        // Constructor case: the parameter is annotated with lifetimebound
+        //   e.g., GSLPointer(const S& s [[clang::lifetimebound]])
+        // We still respect this case even the type S is not an owner.
+        return Report;
+      }
+      // For regular functions, check if the return type has an Owner attribute.
+      //   e.g., const GSLOwner& func(const Foo& foo [[clang::lifetimebound]])
+      if (FD->getReturnType()->isReferenceType() &&
+          isRecordWithAttr<OwnerAttr>(FD->getReturnType()->getPointeeType()))
+        return Report;
+    }
+    return Abandon;
+  }
+
+  if (isa<DeclRefExpr>(L)) {
+    // We do not want to follow the references when returning a pointer
+    // originating from a local owner to avoid the following false positive:
+    //   int &p = *localUniquePtr;
+    //   someContainer.add(std::move(localUniquePtr));
+    //   return p;
+    if (!pathContainsInit(Path) && isRecordWithAttr<OwnerAttr>(L->getType()))
+      return Report;
+    return Abandon;
+  }
+
+  // The GSLPointer is from a temporary object.
+  auto *MTE = dyn_cast<MaterializeTemporaryExpr>(L);
+
+  bool IsGslPtrValueFromGslTempOwner =
+      MTE && !MTE->getExtendingDecl() &&
+      isRecordWithAttr<OwnerAttr>(MTE->getType());
+  // Skipping a chain of initializing gsl::Pointer annotated objects.
+  // We are looking only for the final source to find out if it was
+  // a local or temporary owner or the address of a local
+  // variable/param.
+  if (!IsGslPtrValueFromGslTempOwner)
+    return Skip;
+  return Report;
+}
 
 static bool isAssignmentOperatorLifetimeBound(CXXMethodDecl *CMD) {
   if (!CMD)
@@ -1131,27 +1212,17 @@ static void checkExprLifetimeImpl(Sema &SemaRef,
 
     auto *MTE = dyn_cast<MaterializeTemporaryExpr>(L);
 
-    bool IsGslPtrValueFromGslTempOwner = false;
-    if (pathOnlyHandlesGslPointer(Path)) {
-      if (isa<DeclRefExpr>(L)) {
-        // We do not want to follow the references when returning a pointer
-        // originating from a local owner to avoid the following false positive:
-        //   int &p = *localUniquePtr;
-        //   someContainer.add(std::move(localUniquePtr));
-        //   return p;
-        if (pathContainsInit(Path) ||
-            !isRecordWithAttr<OwnerAttr>(L->getType()))
-          return false;
-      } else {
-        IsGslPtrValueFromGslTempOwner =
-            MTE && !MTE->getExtendingDecl() &&
-            isRecordWithAttr<OwnerAttr>(MTE->getType());
-        // Skipping a chain of initializing gsl::Pointer annotated objects.
-        // We are looking only for the final source to find out if it was
-        // a local or temporary owner or the address of a local variable/param.
-        if (!IsGslPtrValueFromGslTempOwner)
-          return true;
-      }
+    bool IsGslPtrValueFromGslTempOwner = true;
+    switch (analyzePathForGSLPointer(Path, L)) {
+    case Abandon:
+       return false;
+    case Skip:
+       return true;
+    case NotGSLPointer:
+      IsGslPtrValueFromGslTempOwner = false;
+      LLVM_FALLTHROUGH;
+    case Report:
+      break;
     }
 
     switch (LK) {
diff --git a/clang/test/Sema/warn-lifetime-analysis-nocfg.cpp b/clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
index 6a2af01ea5116c..3b237e99dd3b33 100644
--- a/clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+++ b/clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
@@ -727,8 +727,9 @@ struct [[gsl::Pointer]] Span {
 
 // Pointer from Owner<Pointer>
 std::string_view test5() {
-  std::string_view a = StatusOr<std::string_view>().valueLB(); // expected-warning {{object backing the pointer will be dest}}
-  return StatusOr<std::string_view>().valueLB(); // expected-warning {{returning address of local temporary}}
+  // The Owner<Pointer> doesn't own the object which its inner pointer points to.
+  std::string_view a = StatusOr<std::string_view>().valueLB(); // OK
+  return StatusOr<std::string_view>().valueLB(); // OK
 
   // No dangling diagnostics on non-lifetimebound methods.
   std::string_view b = StatusOr<std::string_view>().valueNoLB();
@@ -775,7 +776,7 @@ Span<std::string> test10(StatusOr<std::vector<std::string>> aa) {
 
 // Pointer<Owner>> from Owner<Pointer<Owner>>
 Span<std::string> test11(StatusOr<Span<std::string>> aa) {
-  return aa.valueLB(); // expected-warning {{address of stack memory}}
+  return aa.valueLB(); // OK
   return aa.valueNoLB(); // OK.
 }
 
@@ -793,3 +794,44 @@ void test13() {
 }
 
 } // namespace GH100526
+
+namespace LifetimeboundInterleave {
+
+const std::string& Ref(const std::string& abc [[clang::lifetimebound]]);
+std::string_view test1() {
+  std::string_view t1 = Ref(std::string()); // expected-warning {{object backing}}
+  t1 = Ref(std::string()); // expected-warning {{object backing}}
+  return Ref(std::string()); // expected-warning {{returning address}}
+}
+
+template <typename T>
+struct Foo {
+  const T& get() const [[clang::lifetimebound]];
+  const T& getNoLB() const;
+};
+std::string_view test2(Foo<std::string> r1, Foo<std::string_view> r2) {
+  std::string_view t1 = Foo<std::string>().get(); // expected-warning {{object backing}}
+  t1 = Foo<std::string>().get(); // expected-warning {{object backing}}
+  return r1.get(); // expected-warning {{address of stack}}
+  
+  std::string_view t2 = Foo<std::string_view>().get();
+  t2 = Foo<std::string_view>().get();
+  return r2.get();
+
+  // no warning on no-LB-annotated method.
+  std::string_view t3 = Foo<std::string>().getNoLB(); 
+  t3 = Foo<std::string>().getNoLB(); 
+  return r1.getNoLB(); 
+}
+
+struct Bar {};
+struct [[gsl::Pointer]] Pointer {
+  Pointer(const Bar & bar [[clang::lifetimebound]]);
+};
+Pointer test3(Bar bar) {
+  Pointer p = Pointer(Bar()); // expected-warning {{temporary}}
+  p = Pointer(Bar()); // expected-warning {{object backing}}
+  return bar; // expected-warning {{address of stack}}
+}
+
+} // namespace LifetimeboundInterleave

Copy link
Collaborator

@Xazax-hun Xazax-hun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this approach looks promising to me. Some nits inline.

@hokein hokein force-pushed the no-lb-interleave-in-gsl branch from fcd9636 to 9461b3d Compare November 4, 2024 15:56
};
Pointer test3(Bar bar) {
Pointer p = Pointer(Bar()); // expected-warning {{temporary}}
p = Pointer(Bar()); // expected-warning {{object backing}}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a false negative before, this patch also fixes it.

@hokein hokein force-pushed the no-lb-interleave-in-gsl branch from 9461b3d to fc8024d Compare November 5, 2024 12:11
Copy link
Collaborator

@Xazax-hun Xazax-hun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copy link
Contributor

@usx95 usx95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

t3 = TakeStrRef(std::string()); // expected-warning {{object backing}}
return TakeStrRef(std::string()); // expected-warning {{returning address}}


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: delete extra new line

@hokein
Copy link
Collaborator Author

hokein commented Nov 18, 2024

Just discover a new false positive:

namespace std {
template <typename T>
class [[gsl::Pointer]] Iterator2 {
 public:
  using reference = T&;
  Iterator2() {}
  reference operator*() const;
};
}

template <typename T>
class AnySpan {
 public:
  AnySpan() {}
  std::Iterator2<T> begin() const [[clang::lifetimebound]];
};

AnySpan<int> MakeAnySpan();
void s() {
  const int& t2 = *AnySpan<int>().begin();  // false positive, warning.
}

@hokein hokein force-pushed the no-lb-interleave-in-gsl branch from 00c0186 to 85e0870 Compare November 19, 2024 10:38
@hokein
Copy link
Collaborator Author

hokein commented Nov 19, 2024

Just discover a new false positive:

namespace std {
template <typename T>
class [[gsl::Pointer]] Iterator2 {
 public:
  using reference = T&;
  Iterator2() {}
  reference operator*() const;
};
}

template <typename T>
class AnySpan {
 public:
  AnySpan() {}
  std::Iterator2<T> begin() const [[clang::lifetimebound]];
};

AnySpan<int> MakeAnySpan();
void s() {
  const int& t2 = *AnySpan<int>().begin();  // false positive, warning.
}

Fix it in the latest commit 85e0870, it would be nice to take another look at the patch, @Xazax-hun, @usx95.

Copy link
Collaborator

@Xazax-hun Xazax-hun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update, the latest version LGTM.

Copy link
Contributor

@usx95 usx95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments about the new code

Comment on lines +861 to +769
template<typename T>
struct MySpan {
MySpan(const std::vector<T>& v);
using iterator = std::iterator<T>;
iterator begin() const [[clang::lifetimebound]];
};
template <typename T>
typename MySpan<T>::iterator ReturnFirstIt(const MySpan<T>& v [[clang::lifetimebound]]);

void test4() {
std::vector<int> v{1};
// MySpan<T> doesn't own any underlying T objects, the pointee object of
// the MySpan iterator is still alive when the whole span is destroyed, thus
// no diagnostic.
const int& t1 = *MySpan<int>(v).begin();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about this.
I feel MySpan should be a gsl::Pointer in this case and we should deal view types similarly.
Doing this only for owner types and that too only implicit this doesn't make sense to me.

For example, this silences the warning on:
const int& t1 = *MySpan<int>(v).begin();
But what about non member functions ? :

iterator getBeginOff(const MySpan<int>& ms [[clang::lifetimebound]]);
const int& t1 = *getBeginOf(MySpan<int>(v));

(Can you also add this test case?)

Copy link
Collaborator Author

@hokein hokein Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this test case already, see the t2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

// constraints, we do not.
const int& t4 = *MySpan<int>(std::vector<int>{}).begin();

auto it1 = MySpan<int>(v).begin(); // expected-warning {{temporary whose address is use}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this is correct behaviour ?
Specially what about when MySpan is also a pointer ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this behavior is expected per the lifetimebound annotation.

Specifically, what about when MySpan is also a pointer?

In this case, whether MySpan is a pointer or not doesn’t really change the situation. The begin() method is annotated with lifetimebound, which conceptually means that the returned iterator is tied to the lifetime of the MySpan object (e.g. the iterator has an internal pointer that refers to this).
It expresses that the iterator becomes invalid once the MySpan object is destroyed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Well if both iterator and MySpan are pointers then we could make it not warn. But I agree that this is expanding the scope of lifetimebound definition to places where it is not possible. Lifetimes of internal subobjects probably has a better solution with rust-like lifetimes/type system.
Thanks.

Copy link
Contributor

@usx95 usx95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

// constraints, we do not.
const int& t4 = *MySpan<int>(std::vector<int>{}).begin();

auto it1 = MySpan<int>(v).begin(); // expected-warning {{temporary whose address is use}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Well if both iterator and MySpan are pointers then we could make it not warn. But I agree that this is expanding the scope of lifetimebound definition to places where it is not possible. Lifetimes of internal subobjects probably has a better solution with rust-like lifetimes/type system.
Thanks.

Comment on lines +861 to +769
template<typename T>
struct MySpan {
MySpan(const std::vector<T>& v);
using iterator = std::iterator<T>;
iterator begin() const [[clang::lifetimebound]];
};
template <typename T>
typename MySpan<T>::iterator ReturnFirstIt(const MySpan<T>& v [[clang::lifetimebound]]);

void test4() {
std::vector<int> v{1};
// MySpan<T> doesn't own any underlying T objects, the pointee object of
// the MySpan iterator is still alive when the whole span is destroyed, thus
// no diagnostic.
const int& t1 = *MySpan<int>(v).begin();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@hokein hokein force-pushed the no-lb-interleave-in-gsl branch from 85e0870 to 38a0f8d Compare December 11, 2024 13:25
@hokein hokein force-pushed the no-lb-interleave-in-gsl branch from 38a0f8d to 80a50d1 Compare December 12, 2024 19:07
@hokein hokein merged commit 33b910c into llvm:main Dec 12, 2024
7 of 8 checks passed
@hokein hokein deleted the no-lb-interleave-in-gsl branch December 12, 2024 19:57
hokein added a commit that referenced this pull request Feb 17, 2025
…127460)

This fixes a false positive caused by #114044.

For `GSLPointer*` types, it's less clear whether the lifetime issue is
about the GSLPointer object itself or the owner it points to. To avoid
false positives, we take a conservative approach in our heuristic.

Fixes #127195

(This will be backported to release 20).
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Feb 18, 2025
…lvm#127460)

This fixes a false positive caused by llvm#114044.

For `GSLPointer*` types, it's less clear whether the lifetime issue is
about the GSLPointer object itself or the owner it points to. To avoid
false positives, we take a conservative approach in our heuristic.

Fixes llvm#127195

(This will be backported to release 20).

(cherry picked from commit 9c49b18)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants