-
Notifications
You must be signed in to change notification settings - Fork 13.3k
[clang] Warn on mismatched RequiresCapability attributes #67520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-clang-analysis @llvm/pr-subscribers-clang ChangesThere's a problem with this attribute where the declaration in a header file doesn't have the attribute, but the definition in the source file has. As a result, the attribute doesn't take effect when just the header file is included. This is a WIP patch to address this problem and emit a warning for it. I'm not sure if this is the best place to do it. As discussed with @AaronBallman on Discord, it can't be done in Full diff: https://github.com/llvm/llvm-project/pull/67520.diff 1 Files Affected:
diff --git a/clang/lib/Parse/Parser.cpp b/clang/lib/Parse/Parser.cpp
index dda1dbaa0c21aa9..befdb43ca4c00ee 100644
--- a/clang/lib/Parse/Parser.cpp
+++ b/clang/lib/Parse/Parser.cpp
@@ -1484,6 +1484,18 @@ Decl *Parser::ParseFunctionDefinition(ParsingDeclarator &D,
if (LateParsedAttrs)
ParseLexedAttributeList(*LateParsedAttrs, Res, false, true);
+ if (Res) {
+ for (const auto *FD : Res->redecls()) {
+ if (FD == Res)
+ continue;
+ if (Res->hasAttr<RequiresCapabilityAttr>() &&
+ !FD->hasAttr<RequiresCapabilityAttr>()) {
+ // Definition has attribute, but the declaration doesn't.
+ llvm::errs() << "AHA!\n";
+ }
+ }
+ }
+
return ParseFunctionStatementBody(Res, BodyScope);
}
|
Ping |
Errr, that's the behavior of basically all inheritable attributes -- the design idea there is that the attribute is additive information that's only visible from the declaration with the attribute onward. So I think some users will appreciate this warning because they might not have been aware that the attribute was not really doing much for them, but I also think some users will be frustrated because they've done things like:
I could imagine this being useful for @aaronpuchert -- do you have thoughts here? |
Yeah, it's a tricky question. On one hand there have been previous issues like #42194 (which this would basically address), and it would definitely improve consistency and prevent beginner's mistakes. On the other hand it seems useful to allow adding attributes e.g. to third-party libraries where you don't control the headers. Perhaps we should introduce such checks, but put them under a subflag of |
@AaronBallman Is the prototype of the implementation here at least sound? Or should this go somewhere completely different? Maybe in one of the TSA source files? |
This is a semantic concern rather than a syntactic one, so I would expect this to happen in Sema rather than in Parser. My intuition is that this would be most natural in CC @erichkeane in case he has opinions. |
I've resorted to do this in From a diagnostic POV, this seems quite complicated since we have N declarations and need to report differences in M attributes (and each one of them can have >=1 parameter...). For the POC implementation here, I've left if with just reporting a mismatch in attributes. For example with error: 'expected-warning' diagnostics seen but not expected:
File /home/tbaeder/code/llvm-project/clang/test/SemaCXX/warn-thread-safety-analysis.cpp Line 2171: attribute mismatch between function definition and declaration of 'foo2'
File /home/tbaeder/code/llvm-project/clang/test/SemaCXX/warn-thread-safety-analysis.cpp Line 2172: attribute mismatch between function definition and declaration of 'foo3'
File /home/tbaeder/code/llvm-project/clang/test/SemaCXX/warn-thread-safety-analysis.cpp Line 2222: attribute mismatch between function definition and declaration of 'fooF2'
File /home/tbaeder/code/llvm-project/clang/test/SemaCXX/warn-thread-safety-analysis.cpp Line 3182: attribute mismatch between function definition and declaration of 'foo1'
File /home/tbaeder/code/llvm-project/clang/test/SemaCXX/warn-thread-safety-analysis.cpp Line 3183: attribute mismatch between function definition and declaration of 'foo2'
File /home/tbaeder/code/llvm-project/clang/test/SemaCXX/warn-thread-safety-analysis.cpp Line 5220: attribute mismatch between function definition and declaration of 'test'
File /home/tbaeder/code/llvm-project/clang/test/SemaCXX/warn-thread-safety-analysis.cpp Line 5221: attribute mismatch between function definition and declaration of 'test2'
|
I tried out this WIP on our codebase after we ran into a problem where an annotation was added to a definition rather than a declaration, leading to a (silently) missed capability check. This addition seems to work fine, but it triggers warnings when annotations are added to both the definition and declaration. This can be seen here: bitcoin/bitcoin#30316 I went ahead and PR'd the removals because they're straightforward enough, but it seems (to me) that those these are all false positives. If the same annotation is listed in the declaration and definition, should it really warn? |
No I don't think it should, at least that was not the intention of this patch. As it is right now, the code should warn for attributes listed in the definition, but now in the declaration(s). |
…ion definitions 5729dbb refactor: remove extraneous lock annotations from function definitions (Cory Fields) Pull request description: These annotations belong in the declarations rather than the definitions. While harmless now, future versions of clang may warn about these. Discovered these using the upstream WIP: llvm/llvm-project#67520 ACKs for top commit: instagibbs: ACK 5729dbb maflcko: ACK 5729dbb 🦋 Tree-SHA512: c82c6b269dd353b140cbb36b5519ab2637e54034f159d6ad3eb78c6f4019aa053a5973c626395f0ed3366b9f4117ecc4fe7926b83e9714b1e21c97d5e4bed8d7
Doesn't this fit better in The analysis in |
I'm the person who asked @tbaederr last year if this could be added for the benefit of QEMU, and as I'm looking at it again now, I thought maybe I should leave a comment here. In QEMU, we currently handle the problem with a coding convention (public functions get TSA attributes only in the header) and rely on manual code review that the convention is actually obeyed. Obviously, having something automatable like this handled by the compiler would be much nicer and less error prone, so we would still like to see this implemented eventually in some form.
Not sure what scenario you have in mind there. How is requiring a capability ever something internal? Aren't required capabilities by definition part of the contract of a function? For example, if a function uses RequiresCapability to specify that it must be called with some lock held and you call it without that lock, that is a bug. If the function takes a lock only internally, then it wouldn't specify that in a TSA attribute because these attributes are only about the interface of the function. I would actually compare this to function declarations with conflicting calling conventions. This is always a hard compiler error – calling the function with a different calling convention would be a bug, just like calling a function without the right capabilities. (Though admittedly, differing calling conventions would typically result in more obvious bugs if the compiler didn't catch them...) Am I missing some legitimate scenario in which your code snippet shouldn't cause a warning here because I'm primarily thinking of capabilities as used with locks?
I'm not sure that this mistake is one that only beginners would make. But I agree that adding attributes to functions from third-party libraries is important. I think this is why I first suggested the following in the downstream tracker: "I propose that clang should warn if a function definition has TSA attributes, a previous declaration of the same function exists and the attribute isn't present on at least one of the previous declarations." But come to think of it, I'm not sure any more if this really makes a difference – typically, you don't even have a definition for functions from third-party libraries that could trigger the warning, only declarations. The exception are inline functions in headers, but even then, what you get is a declaration with the attribute and a definition without it, which still wouldn't trigger the warning. So do we actually have a problem here? |
Sure, don't read too much into that statement. But as a rule of thumb, static analysis attributes always belong on the publicly visible declaration. Otherwise, the caller can't see them. This is true not only for thread safety attributes, but also nullability attributes, handle attributes, type state attributes, and so on. I'm not sure how they're handling it. For
Ok, that's an interesting twist. I could live with that, but I'm not sure if we handle any other atttributes this way.
What I was thinking about was warning on any redeclaration that's adding attributes. Sure, if you only do this on definitions, adding attributes to third-party library wouldn't be an issue. |
Yes, the theory is obvious and easy to understand, at least once you think of it. Applying it in practice is where I made the experience that mistakes happen a bit too easily. When you have a mix of static and public functions (so sometimes the annotations are on the definition and sometimes in the header), you're bound to miss a missing "static" sooner or later even if you know the theory. Being careful only gets you so far, so it would be really nice to get that support from the compiler. (And after all, TSA is all about getting support from the compiler instead of trying to be careful yourself.) Another way to deal with the mistakes would be just outright forbidding most TSA attributes on definitions for public functions, though that's even more restrictive ("TSA attributes on the definition have to be empty" vs. "TSA attributes on the definition have to be a subset of the attributes on declarations"). What I think would make me prefer having the annotations in both places is that when editing a source file, it is slightly inconvenient to have to look at the header to know which assumptions it can make, and it's also slightly inconvenient to update a function in both places if you can't just copy the line, but they have to be different. But that's merely a question of convenience, so if forbidding works better for you, I could live with that, too.
Yeah, I'm not sure if there are any that work the exact same way (though I also didn't actively look for one). But as @AaronBallman said above, the attributes are additive, so comparing the definition with the final set of TSA annotations of all declarations combined – which is the same as checking for "at least one" – would seem to make a lot of sense to me. But as I concluded, checking all previous declarations for function definitions (and only for definitions) should actually work fine in practice, too, unless you're doing crazy things like writing your own implementation with TSA annotations while using a third-party header file for it that doesn't have them. In which case you should probably just copy that header file instead, or locally turn off the warning for that function. I also compared TSA annotations to calling conventions. There it's possible to have the attribute only on the first declaration and leave it out on later ones (but not the other way around, and obviously also not specifying two different calling conventions). Modelling it after that should work, too, though I'm not sure if the consistency with another attribute is worth the additional limitation. So there are options and the exact mode is for you guys to decide, I don't mind too much either way.
Ah, I see. Maybe I misunderstood the code of this PR, but I thought it only checks definitions. And that's honestly all I think we need. It will always be possible to fool TSA with declarations that one file sees and another doesn't, but if we can warn in the common case of a function with one declaration in a header and a definition in a specific source file, it would probably catch 99% of cases in practice. |
That's my understanding as well. I was just thinking about the usual requirement of putting the attribute on the first declaration (like the standard mandates it for The idea behind the standard requirement (to my understanding) is that the attribute set doesn't "change" throughout the compilation unit. Of course you have to put any uses after the first declaration, and with that requirement you can analyze uses directly after parsing them without getting a different result from doing it at the end, with all attributes accumulated. That being said, I'm fine with the idea. I just think this should live in Sema instead of Analysis. That's where we're doing semantic checks of the attributes themselves. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably fits better in Sema/SemaDeclAttr.cpp
. Maybe checkLockFunAttrCommon
is a good place. Add a test to clang/test/Sema/attr-capabilities.c{,pp}
.
clang/lib/Analysis/ThreadSafety.cpp
Outdated
|
||
for (const FunctionDecl *D = FD->getPreviousDecl(); D; | ||
D = D->getPreviousDecl()) { | ||
auto DArgs = collectAttrArgs<RequiresCapabilityAttr>(D); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why we should restrict this to requires_capability
. I think pretty much all attributes should be visible to callers except no_thread_safety_analysis
. This is also documented:
Q. Should I put attributes in the header file, or in the .cc/.cpp/.cxx file?
(A) Attributes are part of the formal interface of a function, and should always go in the header, where they are visible to anything that includes the header. Attributes in the .cpp file are not visible outside of the immediate translation unit, which leads to false negatives and false positives.
And later:
Unlike the other attributes,
NO_THREAD_SAFETY_ANALYSIS
is not part of the interface of a function, and should thus be placed on the function definition (in the.cc
or.cpp
file) rather than on the function declaration (in the header).
Rather than this being "not added in the header file", should we just make this one of the attributes that is disallowed after the thing has been 'referenced'? Or is that a dumb suggestion here? |
That might not be enough. A function might not be used (or even referenced) in the TU that defines it, but only in other TUs. But it would certainly catch a number of issues already. |
Right, though catching that ends up being pretty impossible. The most you could do is determine that the declaration in the defining TU doesn't have it, but not that other forward declarations not visible could catch it. Unfortunately, unless an attribute contributes to mangling (which, should we consider that?) there is no real way to catch every case. Perhaps we could make this a non-inheritable attribute as well, which should catch at least the "added in future declarations". I think we in a few places we actually DO disallow adding attributes to not-the-first-declaration (which could be done in 'merge', but even just SemaDeclAttr could figure that out), but its not particularly perfect. So in summary: -We can't catch cross-TU issues here unless we make |
Correct, these are all just heuristics. Functions are not typically "forward-declared", so they often have a canonical public declaration. But it is of course possible to declare functions in multiple header files.
No, I think we don't want mangling. One of the design goals of Thread Safety Analysis was that the attributes should be "erasable", so that they can be dropped for compilers that don't understand them. (See for example https://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/42958.pdf.) |
Updated the branch. This now warns: #define LOCKABLE __attribute__ ((lockable))
#define EXCLUSIVE_LOCKS_REQUIRED(...) __attribute__ ((exclusive_locks_required(__VA_ARGS__)))
struct LOCKABLE Mutex {};
struct Mutex mu1;
struct Mutex mu2;
int Foo_fun1(int i)
// EXCLUSIVE_LOCKS_REQUIRED(mu2)
;
void callsFoo() {
int a = 10;
Foo_fun1(a);
}
int Foo_fun1(int i)
EXCLUSIVE_LOCKS_REQUIRED(mu1)
{
return i;
} ./array.cpp:33:5: warning: attribute mismatch between function declarations of 'Foo_fun1' [-Wthread-safety-attributes]
33 | int Foo_fun1(int i)
| ^
1 warning generated. I started collecting the attribute arguments in a |
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the semantics here are: if there is any requires_capability
attribute on a function, it needs to exactly match the set of requires_capability
attributes on every previous declaration? Or in other words: the attributes are either inherited or need to be exactly repeated? To me that sounds fine, just wanted to clarify it.
It seems that inheritance results in duplicated attributes, but we eliminate those duplicates in ThreadSafetyAnalyzer::getMutexIDs
based on CapabilityExpr::equals
. If we want to allow repetition, we should probably compare in the same way. Which means we probably need to move this back into ThreadSafety.cpp
, even though it's an attribute warning. (I don't think we want to use Thread Safety internals in Sema.)
Yes, I think that makes sense. The attributes should match for all (visible) declarations. If the definiton has an attribute too, it should also match. |
Moved it back into |
This example from the failing test should warn I think: class Foo {
public:
void foo1();
void foo2();
void foo3(Foo *other);
template<class T>
void fooT1(const T& dummy1);
template<class T>
void fooT2(const T& dummy2) EXCLUSIVE_LOCKS_REQUIRED(mu_);
Mutex mu_;
int a GUARDED_BY(mu_);
};
void Foo::foo2() EXCLUSIVE_LOCKS_REQUIRED(mu_) {
a = 2;
} Since the definition of |
Yes, this looks exactly like the thing we want to protect against to me (because chances are that the declaration is actually in a header file and other source files will check their calls only against this declaration, so real bugs can be expected to result from this). |
Hmm, so this is causing problems: namespace DoubleLockBug {
class Foo {
public:
Mutex mu_;
int a GUARDED_BY(mu_);
void foo1() EXCLUSIVE_LOCKS_REQUIRED(mu_);
};
void Foo::foo1() EXCLUSIVE_LOCKS_REQUIRED(mu_) {
a = 0;
}
} The AST is:
So the definition has |
The existing code uses |
ae3f652
to
1b17e07
Compare
Meh, adding the attribute to the definition and declaration of a member function results in duplicated diagnostics: |
We don't deduplicate "requires" attributes but check them directly. The result is not ideal, but this is slightly more efficient, and the duplicate warning is probably not a big issue. Especially since usually the attributes are not repeated. |
c6f184e
to
8fe0b75
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought a bit more about what inheritance means for this and I think it makes our job easier. We can probably collect capabilities on the first declaration and then just check that the non-inherited attributes on subsequent declarations are already contained in that set.
Also, this should be extended to almost all other thread safety attributes. For a start, do one or two so that we can see what it looks like.
@@ -4054,7 +4054,9 @@ def warn_acquired_before : Warning< | |||
def warn_acquired_before_after_cycle : Warning< | |||
"cycle in acquired_before/after dependencies, starting with '%0'">, | |||
InGroup<ThreadSafetyAnalysis>, DefaultIgnore; | |||
|
|||
def warn_attribute_mismatch : Warning< | |||
"attribute mismatch between function declarations of %0">, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since attributes are inherited, the mismatch is inherently one-sided. The declaration that we warn on has some attribute that the previous declaration didn't have. So maybe we can be more specific and say something like "thread safety attribute on function declaration is missing on first declaration".
Also, I think we should group it with the other ThreadSafetyAttributes
warnings.
clang/lib/Analysis/ThreadSafety.cpp
Outdated
for (const auto &[A, B] : zip_longest(NDArgs, DArgs)) { | ||
if (!A || !B || !A->equals(*B)) | ||
Handler.handleAttributeMismatch(ND, cast<NamedDecl>(D)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we inherit the attributes and deduplicate capabilities here via push_back_nodup
, this seems to be equivalent to just comparing sizes. We would only have to go through the vectors if we actually wanted to find the missing attribute.
Equivalently, we could also collect the capabilities from attributes on the first declaration, and then on subsequent declarations check non-inherited attributes against the capability set of the first declaration. That would also make it easy to point out which capability is missing on the first declaration. So we can print a better diagnostic.
This ends up being pretty ugly since we're collecting the arguments right now, but most of the other attributes don't have arguments or have no |
Maybe a single-pass algorithm would be better in the end: we find the first declaration, collect all attributes into a variety of of We already have such a collection in |
There's a problem with this attribute where the declaration in a header file doesn't have the attribute, but the definition in the source file has. As a result, the attribute doesn't take effect when just the header file is included.
This is a WIP patch to address this problem and emit a warning for it. I'm not sure if this is the best place to do it. As discussed with @AaronBallman on Discord, it can't be done in
Sema::MergeFunctionDecl()
, because this is a late parsed attribute and they aren't available yet when we merge the decl.