Skip to content

[Clang] Fix potential null pointer dereferences in Sema::AddInitializerToDecl #94368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

smanna12
Copy link
Contributor

@smanna12 smanna12 commented Jun 4, 2024

This patch

  • Eliminates unnecessary null check for initializer before WebAssembly table type check.
  • Adds assert to confirm non-null initializer after assignment.

…erToDecl

This patch adds null check for 'Init' before dereferencing it to prevent
potential null pointer dereferences reported by static Analyzer tool in
the function.
@smanna12 smanna12 requested a review from tahonermann June 4, 2024 15:37
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jun 4, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 4, 2024

@llvm/pr-subscribers-clang

Author: None (smanna12)

Changes

This patch adds null check for 'Init' before dereferencing it to prevent potential null pointer dereferences reported by static Analyzer tool in the function.


Full diff: https://github.com/llvm/llvm-project/pull/94368.diff

1 Files Affected:

  • (modified) clang/lib/Sema/SemaDecl.cpp (+1-1)
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 34e46e12859bb..cd50df646b8b2 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -13728,7 +13728,7 @@ void Sema::AddInitializerToDecl(Decl *RealDecl, Expr *Init, bool DirectInit) {
     // paths through the function. This should be revisited if
     // -Wrepeated-use-of-weak is made flow-sensitive.
     if (FunctionScopeInfo *FSI = getCurFunction())
-      if ((VDecl->getType().getObjCLifetime() == Qualifiers::OCL_Strong ||
+      if (Init && (VDecl->getType().getObjCLifetime() == Qualifiers::OCL_Strong ||
            VDecl->getType().isNonWeakInMRRWithObjCWeak(Context)) &&
           !Diags.isIgnored(diag::warn_arc_repeated_use_of_weak,
                            Init->getBeginLoc()))

@smanna12 smanna12 changed the title [Clang] Fix potential null pointer dereferences in Sema::AddInitializ… [Clang] Fix potential null pointer dereferences in Sema::AddInitializerToDecl Jun 4, 2024
Copy link

github-actions bot commented Jun 4, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 2bc098b8aba089fe8328b3b8a8b6b6816cd5a908 ea34dff36b21d44411aa735aff577c696260c9d6 --extensions cpp -- clang/lib/Sema/SemaDecl.cpp
View the diff from clang-format here.
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 3073e754908..d277457327f 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -13682,7 +13682,8 @@ void Sema::AddInitializerToDecl(Decl *RealDecl, Expr *Init, bool DirectInit) {
     Init = Result.getAs<Expr>();
 
     assert((Init || InitSeq.steps().empty()) &&
-           "Should have a valid initializer or no initialization steps at this point");
+           "Should have a valid initializer or no initialization steps at this "
+           "point");
 
     IsParenListInit = !InitSeq.steps().empty() &&
                       InitSeq.step_begin()->Kind ==

Copy link
Contributor

@feg208 feg208 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add the check here rather than add an assertion at the point of assignation (i.e.

Init = Result.getAs<Expr>();

Since Init is passed into CheckSelfReference on line 13712 where it's dereferenced the same issue binds there as well

@smanna12
Copy link
Contributor Author

smanna12 commented Jun 4, 2024

Why add the check here rather than add an assertion at the point of assignation (i.e.

Init = Result.getAs<Expr>();

Since Init is passed into CheckSelfReference on line 13712 where it's dereferenced the same issue binds there as well

Thanks @feg208 for reviews! I agree with you. I missed the case you mentioned and fixed the patch.

@smanna12 smanna12 requested a review from feg208 June 4, 2024 16:46
Copy link
Contributor

@feg208 feg208 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ltgm

Copy link
Contributor

@tahonermann tahonermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smanna12. I'm uncertain about this change; see my additional comment.

Comment on lines 13684 to 13690
assert(Init && "Init must not be null");

IsParenListInit = !InitSeq.steps().empty() &&
InitSeq.step_begin()->Kind ==
InitializationSequence::SK_ParenthesizedListInit;
QualType VDeclType = VDecl->getType();
if (Init && !Init->getType().isNull() &&
!Init->getType()->isDependentType() && !VDeclType->isDependentType() &&
if (!Init->getType()->isDependentType() && !VDeclType->isDependentType() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm uncertain about this change. On the one hand, the assignment to Init looks to me like it must produce a non-null result due to the prior check to Result.isInvalid(). However, the following uses of Init were already guarded by a check for a non-null value, so the static analysis tool should not have complained about those.

Was the static analysis tool perhaps complaining about later uses of Init? Note that the assignment at line 13683 above is conditional (on !VDecl->isInvalidDecl()) and therefore might not suffice to ensure a definite non-null value. I haven't checked exhaustively if that is the case though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the static analysis tool perhaps complaining about later uses of Init?

Yes. FWIW Here is the logic:

if (!VDecl->isInvalidDecl()) {

  ExprResult Result = InitSeq.Perform(*this, Entity, Kind, Args, &DclT);
  if (Result.isInvalid()) {
    return 
  }
  Init = Result.getAs<Expr>();

  if (Init && !Init->getType().isNull() && // verifier expects Init can be null.
}

…

if (!VDecl->isInvalidDecl()) {
  
  .. Init->getBeginLoc()))  // Deref of Init without check
}

@smanna12 smanna12 requested a review from tahonermann August 5, 2024 03:19
Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any test that crashes without this change?
If there are cases where init could in fact be null, and that causes actual issue, we should add a test for it.

An alternative change would be to check for Init there but then remove subsequent, redundant checks of Init being set (again, assuming it does not regress recovery and diagnostics).

In general we should be careful not to add redundant checks just because a static analyzer gets confused, especially if there is the opportunity to do a more thorough cleanup.

Thanks!

@tahonermann
Copy link
Contributor

I don't think the currently proposed change is right for a few reasons.

The proposed change silently marks the declaration as invalid and performs an early return. I don't think this should be a silent behavior. In most, if not all, of the existing early return cases, a diagnostic is issued (perhaps via another called function) and/or a recovery expression is created and used as the initializer.

I think this prior statement of mine is incorrect:

On the one hand, the assignment to Init looks to me like it must produce a non-null result due to the prior check to Result.isInvalid().

It is possible for an ExprResult value to be valid and still hold a null pointer value; InitializationSequence::Perform() explicitly returns a (valid) null pointer in at least one case:

clang/lib/Sema/SemaInit.cpp:
 8612 ExprResult InitializationSequence::Perform(Sema &S,
 8613                                            const InitializedEntity &Entity,
 8614                                            const InitializationKind &Kind,
 8615                                            MultiExprArg Args,
 8616                                            QualType *ResultType) {
 ....
 8699   // No steps means no initialization.
 8700   if (Steps.empty())
 8701     return ExprResult((Expr *)nullptr);
 ....
 9539 }

This indicates that, absent a more complicated analysis that proves that Sema::AddInitializerToDecl() will never call InitializationSequence::Perform() in a context that will lead to a null result, the static analysis tool is right to report a possible null dereference.

I suggest adding an assert here:

clang/lib/Sema/SemaDecl.cpp:
13660   if (!VDecl->isInvalidDecl()) {
.....
13690     ExprResult Result = InitSeq.Perform(*this, Entity, Kind, Args, &DclT);
13691     if (Result.isInvalid()) {
.....
13707       return;
13708     }
13709
13710     Init = Result.getAs<Expr>();
.....
13726   }
    + 
    +   assert(Init && "Should have a valid initializer at this point);
    + 
13728   // Check for self-references within variable initializers.
13729   // Variables declared within a function/method body (except for references)
13730   // are handled by a dataflow analysis.
13731   // This is undefined behavior in C++, but valid in C.
13732   if (getLangOpts().CPlusPlus)
13733     if (!VDecl->hasLocalStorage() || VDecl->getType()->isRecordType() ||
13734         VDecl->getType()->isReferenceType())
13735       CheckSelfReference(*this, RealDecl, Init, DirectInit);

Relatedly, I think the two checks for Init being non-null are probably wrong in these cases:

13506   // WebAssembly tables can't be used to initialise a variable.
13507   if (Init && !Init->getType().isNull() &&
13508       Init->getType()->isWebAssemblyTableType()) {
13509     Diag(Init->getExprLoc(), diag::err_wasm_table_art) << 0;
13510     VDecl->setInvalidDecl();
13511     return;
13512   }
.....
13715     if (Init && !Init->getType().isNull() &&
13716         !Init->getType()->isDependentType() && !VDeclType->isDependentType() &&
13717         Context.getAsIncompleteArrayType(VDeclType) &&
13718         Context.getAsIncompleteArrayType(Init->getType())) {
13719       // Bail out if it is not possible to deduce array size from the
13720       // initializer.
13721       Diag(VDecl->getLocation(), diag::err_typecheck_decl_incomplete_type)
13722           << VDeclType;
13723       VDecl->setInvalidDecl();
13724       return;
13725     }

I think the null check at line 13507 can just be removed; previous uses of Init appear to expect a non-null value. This looks like a case of a reflexive defensive check (added in https://reviews.llvm.org/D139010).

The null check at line 13715 also looks like it was added as a reflexive defensive check (added in https://reviews.llvm.org/D158615 for #37257)

Adding the assert and removing the defensive null checks would eliminate the evidence the static analysis tool is using to justify its report of a potential null dereference.

If Clang tests pass with these changes (using an asserts enabled build of course!), then I think we would be ok to proceed. If we then see failed assertions or null pointer dereferences at some point, we'll be able to identify test cases to add.

@smanna12
Copy link
Contributor Author

smanna12 commented Aug 19, 2024

if (Init && !Init->getType().isNull() &&
13716         !Init->getType()->isDependentType() && !VDeclType->isDependentType() &&
13717         Context.getAsIncompleteArrayType(VDeclType) &&
13718         Context.getAsIncompleteArrayType(Init->getType())) {
13719       // Bail out if it is not possible to deduce array size from the
13720       // initializer.
13721       Diag(VDecl->getLocation(), diag::err_typecheck_decl_incomplete_type)
13722           << VDeclType;
13723       VDecl->setInvalidDecl();
13724       return;
13725     }

The removal of the null check on Init in the code snippet leads to a segmentation fault when Init is null because it attempts to access a member function on a null pointer. The test case test/SemaCXX/paren-list-agg-init.cpp below expects compiler diagnostics for improper array initialization, not a crash. Therefore, the null check is necessary to prevent dereferencing a null pointer and to ensure the code handles cases where the initializer is absent.

int arr6[n](1, 2, 3); // expected-warning {{variable length arrays in C++ are a Clang extension}} \
                           expected-note {{function parameter 'n' with unknown value cannot be used in a constant expression}} \
                           expected-error {{variable-sized object may not be initialized}}

@smanna12 smanna12 marked this pull request as draft August 20, 2024 01:33
@tahonermann
Copy link
Contributor

The removal of the null check on Init in the code snippet leads to a segmentation fault when Init is null because it attempts to access a member function on a null pointer. The test case test/SemaCXX/paren-list-agg-init.cpp below expects compiler diagnostics for improper array initialization, not a crash. Therefore, the null check is necessary to prevent dereferencing a null pointer and to ensure the code handles cases where the initializer is absent.

int arr6[n](1, 2, 3); // expected-warning {{variable length arrays in C++ are a Clang extension}} \
                           expected-note {{function parameter 'n' with unknown value cannot be used in a constant expression}} \
                           expected-error {{variable-sized object may not be initialized}}

Thanks, @smanna12. That is interesting as it implies that the call to InitializationSequence::Perform() did indeed return a result that was valid but null. Would you be able to debug to find out where that null result is coming from? Perhaps here:

clang/lib/Sema/SemaInit.cpp:
 7486 ExprResult InitializationSequence::Perform(Sema &S,
 7487                                            const InitializedEntity &Entity,
 7488                                            const InitializationKind &Kind,
 7489                                            MultiExprArg Args,
 7490                                            QualType *ResultType) {
 ....
 7573   // No steps means no initialization.
 7574   if (Steps.empty())
 7575     return ExprResult((Expr *)nullptr);
 ....
 8474 }

@smanna12
Copy link
Contributor Author

smanna12 commented Aug 20, 2024

That is interesting as it implies that the call to InitializationSequence::Perform() did indeed return a result that was valid but null. Would you be able to debug to find out where that null result is coming from? Perhaps here: clang/lib/Sema/SemaInit.cpp:

Thanks @tahonermann for reviews!. Yes, it is coming from InitializationSequence::Perform()
// No steps means no initialization. if (Steps.empty()) return ExprResult((Expr *)nullptr);

@tahonermann
Copy link
Contributor

A new PR has been submitted to address this static analysis concern as @smanna12 is unavailable for the next week and I can't push to her fork. She coordinated with me to do this hand off. I'm now closing this issue on her behalf.

#106235

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants