Skip to content

Commit

Permalink
Reland #90786 ([BoundsSafety] Allow 'counted_by' attribute on pointer…
Browse files Browse the repository at this point in the history
…s in structs in C) (#93121)

[BoundsSafety] Reland #93121 Allow 'counted_by' attribute on pointers in structs in C (#93121)

Fixes #92687.

Previously the attribute was only allowed on flexible array members.
This patch patch changes this to also allow the attribute on pointer
fields in structs and also allows late parsing of the attribute in some
contexts.

For example this previously wasn't allowed:

```
struct BufferTypeDeclAttributePosition {
  size_t count;
  char* buffer __counted_by(count); // Now allowed
}
```

Note the attribute is prevented on pointee types where the size isn't
known at compile time. In particular pointee types that are:

* Incomplete (e.g. `void`) and sizeless types
* Function types (e.g. the pointee of a function pointer)
* Struct types with a flexible array member

This patch also introduces late parsing of the attribute when used in
the declaration attribute position. For example

```
struct BufferTypeDeclAttributePosition {
  char* buffer __counted_by(count); // Now allowed
  size_t count;
}
```

is now allowed but **only** when passing
`-fexperimental-late-parse-attributes`. The motivation for using late
parsing here is to avoid breaking the data layout of structs in existing
code that want to use the `counted_by` attribute. This patch is the
first use of `LateAttrParseExperimentalExt` in `Attr.td` that was
introduced in a previous patch.

Note by allowing the attribute on struct member pointers this now allows
the possiblity of writing the attribute in the type attribute position.
For example:

```
struct BufferTypeAttributePosition {
  size_t count;
  char *__counted_by(count) buffer; // Now allowed
}
```

However, the attribute in this position is still currently parsed
immediately rather than late parsed. So this will not parse currently:

```
struct BufferTypeAttributePosition {
  char *__counted_by(count) buffer; // Fails to parse
  size_t count;
}
```

The intention is to lift this restriction in future patches. It has not
been done in this patch to keep this size of this commit small.

There are also several other follow up changes that will need to be
addressed in future patches:

* Make late parsing working with anonymous structs (see
`on_pointer_anon_buf` in `attr-counted-by-late-parsed-struct-ptrs.c`).
* Allow `counted_by` on more subjects (e.g. parameters, returns types)
when `-fbounds-safety` is enabled.
* Make use of the attribute on pointer types in code gen (e.g. for
`_builtin_dynamic_object_size` and UBSan's array-bounds checks).

This work is heavily based on a patch originally written by Yeoul Na.

** Differences between #93121 and this patch **

* The memory leak that caused #93121 to be reverted (see #92687) should
  now be fixed. See "The Memory Leak".
* The fix to `pragma-attribute-supported-attributes-list.test`
  (originally in cef6387) has been incorporated into this patch.
* A relaxation of counted_by semantics (originally in 112eadd) has been
  incorporated into this patch.
* The assert in `Parser::DistributeCLateParsedAttrs` has been removed
  because that broke downstream code.
* The switch statement in `Parser::ParseLexedCAttribute` has been
  removed in favor of using `Parser::ParseGNUAttributeArgs` which does
  the same thing but is more feature complete.
* The `EnterScope` parameter has been plumbed through
  `Parser::ParseLexedCAttribute` and `Parser::ParseLexedCAttributeList`.
  It currently doesn't do anything but it will be needed in future
  commits.

** The Memory Leak **

The problem was that these lines parsed the attributes but then did nothing to free the memory

```
assert(!getLangOpts().CPlusPlus);
for (auto *LateAttr : LateFieldAttrs)
  ParseLexedCAttribute(*LateAttr);
```

To fix this this a new `Parser::ParseLexedCAttributeList` method has been
added (based on `Parser::ParseLexedAttributeList`) which does the
necessary memory management. The intention is to merge these two
methods together so there is just one implementation in a future patch
(#93263).

A more principled fixed here would be to fix the ownership of the
`LateParsedAttribute` objects. In principle `LateParsedAttrList` should own
its pointers exclusively and be responsible for deallocating them.
Unfortunately this is complicated by `LateParsedAttribute` objects also
being stored in another data structure (`LateParsedDeclarations`) as
can be seen below (`LA` gets stored in two places).

```
      // Handle attributes with arguments that require late parsing.
      LateParsedAttribute *LA =
          new LateParsedAttribute(this, *AttrName, AttrNameLoc);
      LateAttrs->push_back(LA);

      // Attributes in a class are parsed at the end of the class, along
      // with other late-parsed declarations.
      if (!ClassStack.empty() && !LateAttrs->parseSoon())
        getCurrentClass().LateParsedDeclarations.push_back(LA);
```

this means the ownership of LateParsedAttribute objects isn't very
clear.

rdar://125400257
  • Loading branch information
delcypher authored May 24, 2024
1 parent 39d32b2 commit 2918973
Show file tree
Hide file tree
Showing 23 changed files with 1,147 additions and 149 deletions.
21 changes: 20 additions & 1 deletion clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -334,7 +334,8 @@ New Compiler Flags

- ``-fexperimental-late-parse-attributes`` enables an experimental feature to
allow late parsing certain attributes in specific contexts where they would
not normally be late parsed.
not normally be late parsed. Currently this allows late parsing the
`counted_by` attribute in C. See `Attribute Changes in Clang`_.

- ``-fseparate-named-sections`` uses separate unique sections for global
symbols in named special sections (i.e. symbols annotated with
Expand Down Expand Up @@ -423,6 +424,24 @@ Attribute Changes in Clang
- The ``clspv_libclc_builtin`` attribute has been added to allow clspv
(`OpenCL-C to Vulkan SPIR-V compiler <https://github.com/google/clspv>`_) to identify functions coming from libclc
(`OpenCL-C builtin library <https://libclc.llvm.org>`_).
- The ``counted_by`` attribute is now allowed on pointers that are members of a
struct in C.

- The ``counted_by`` attribute can now be late parsed in C when
``-fexperimental-late-parse-attributes`` is passed but only when attribute is
used in the declaration attribute position. This allows using the
attribute on existing code where it previously impossible to do so without
re-ordering struct field declarations would break ABI as shown below.

.. code-block:: c
struct BufferTy {
/* Refering to `count` requires late parsing */
char* buffer __counted_by(count);
/* Swapping `buffer` and `count` to avoid late parsing would break ABI */
size_t count;
};
Improvements to Clang's diagnostics
-----------------------------------
Expand Down
1 change: 1 addition & 0 deletions clang/include/clang/AST/Type.h
Original file line number Diff line number Diff line change
Expand Up @@ -2515,6 +2515,7 @@ class alignas(TypeAlignment) Type : public ExtQualsTypeCommonBase {
bool isRecordType() const;
bool isClassType() const;
bool isStructureType() const;
bool isStructureTypeWithFlexibleArrayMember() const;
bool isObjCBoxableRecordType() const;
bool isInterfaceType() const;
bool isStructureOrClassType() const;
Expand Down
3 changes: 2 additions & 1 deletion clang/include/clang/Basic/Attr.td
Original file line number Diff line number Diff line change
Expand Up @@ -2257,7 +2257,8 @@ def TypeNullUnspecified : TypeAttr {
def CountedBy : DeclOrTypeAttr {
let Spellings = [Clang<"counted_by">];
let Subjects = SubjectList<[Field], ErrorDiag>;
let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel">];
let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel", 1>];
let LateParsed = LateAttrParseExperimentalExt;
let ParseArgumentsAsUnevaluated = 1;
let Documentation = [CountedByDocs];
let LangOpts = [COnly];
Expand Down
4 changes: 4 additions & 0 deletions clang/include/clang/Basic/DiagnosticGroups.td
Original file line number Diff line number Diff line change
Expand Up @@ -1445,6 +1445,10 @@ def FunctionMultiVersioning

def NoDeref : DiagGroup<"noderef">;

// -fbounds-safety and bounds annotation related warnings
def BoundsSafetyCountedByEltTyUnknownSize :
DiagGroup<"bounds-safety-counted-by-elt-type-unknown-size">;

// A group for cross translation unit static analysis related warnings.
def CrossTU : DiagGroup<"ctu">;

Expand Down
23 changes: 21 additions & 2 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -6544,8 +6544,10 @@ def warn_superclass_variable_sized_type_not_at_end : Warning<

def err_flexible_array_count_not_in_same_struct : Error<
"'counted_by' field %0 isn't within the same struct as the flexible array">;
def err_counted_by_attr_not_on_flexible_array_member : Error<
"'counted_by' only applies to C99 flexible array members">;
def err_counted_by_attr_not_on_ptr_or_flexible_array_member : Error<
"'counted_by' only applies to pointers or C99 flexible array members">;
def err_counted_by_attr_on_array_not_flexible_array_member : Error<
"'counted_by' on arrays only applies to C99 flexible array members">;
def err_counted_by_attr_refer_to_itself : Error<
"'counted_by' cannot refer to the flexible array member %0">;
def err_counted_by_must_be_in_structure : Error<
Expand All @@ -6560,6 +6562,23 @@ def err_counted_by_attr_refer_to_union : Error<
"'counted_by' argument cannot refer to a union member">;
def note_flexible_array_counted_by_attr_field : Note<
"field %0 declared here">;
def err_counted_by_attr_pointee_unknown_size : Error<
"'counted_by' %select{cannot|should not}3 be applied to %select{"
"a pointer with pointee|" // pointer
"an array with element}0" // array
" of unknown size because %1 is %select{"
"an incomplete type|" // CountedByInvalidPointeeTypeKind::INCOMPLETE
"a sizeless type|" // CountedByInvalidPointeeTypeKind::SIZELESS
"a function type|" // CountedByInvalidPointeeTypeKind::FUNCTION
// CountedByInvalidPointeeTypeKind::FLEXIBLE_ARRAY_MEMBER
"a struct type with a flexible array member"
"%select{|. This will be an error in a future compiler version}3"
""
"}2">;

def warn_counted_by_attr_elt_type_unknown_size :
Warning<err_counted_by_attr_pointee_unknown_size.Summary>,
InGroup<BoundsSafetyCountedByEltTyUnknownSize>;

let CategoryName = "ARC Semantic Issue" in {

Expand Down
9 changes: 8 additions & 1 deletion clang/include/clang/Parse/Parser.h
Original file line number Diff line number Diff line change
Expand Up @@ -1646,8 +1646,12 @@ class Parser : public CodeCompletionHandler {
void ParseLexedAttributes(ParsingClass &Class);
void ParseLexedAttributeList(LateParsedAttrList &LAs, Decl *D,
bool EnterScope, bool OnDefinition);
void ParseLexedCAttributeList(LateParsedAttrList &LA, bool EnterScope,
ParsedAttributes *OutAttrs = nullptr);
void ParseLexedAttribute(LateParsedAttribute &LA,
bool EnterScope, bool OnDefinition);
void ParseLexedCAttribute(LateParsedAttribute &LA, bool EnterScope,
ParsedAttributes *OutAttrs = nullptr);
void ParseLexedMethodDeclarations(ParsingClass &Class);
void ParseLexedMethodDeclaration(LateParsedMethodDeclaration &LM);
void ParseLexedMethodDefs(ParsingClass &Class);
Expand Down Expand Up @@ -2534,7 +2538,8 @@ class Parser : public CodeCompletionHandler {

void ParseStructDeclaration(
ParsingDeclSpec &DS,
llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback);
llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
LateParsedAttrList *LateFieldAttrs = nullptr);

DeclGroupPtrTy ParseTopLevelStmtDecl();

Expand Down Expand Up @@ -3113,6 +3118,8 @@ class Parser : public CodeCompletionHandler {
SourceLocation ScopeLoc,
ParsedAttr::Form Form);

void DistributeCLateParsedAttrs(Decl *Dcl, LateParsedAttrList *LateAttrs);

void ParseBoundsAttribute(IdentifierInfo &AttrName,
SourceLocation AttrNameLoc, ParsedAttributes &Attrs,
IdentifierInfo *ScopeName, SourceLocation ScopeLoc,
Expand Down
3 changes: 2 additions & 1 deletion clang/include/clang/Sema/Sema.h
Original file line number Diff line number Diff line change
Expand Up @@ -11403,7 +11403,8 @@ class Sema final : public SemaBase {
QualType BuildMatrixType(QualType T, Expr *NumRows, Expr *NumColumns,
SourceLocation AttrLoc);

QualType BuildCountAttributedArrayType(QualType WrappedTy, Expr *CountExpr);
QualType BuildCountAttributedArrayOrPointerType(QualType WrappedTy,
Expr *CountExpr);

QualType BuildAddressSpaceAttr(QualType &T, LangAS ASIdx, Expr *AddrSpace,
SourceLocation AttrLoc);
Expand Down
10 changes: 10 additions & 0 deletions clang/lib/AST/Type.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -632,6 +632,16 @@ bool Type::isStructureType() const {
return false;
}

bool Type::isStructureTypeWithFlexibleArrayMember() const {
const auto *RT = getAs<RecordType>();
if (!RT)
return false;
const auto *Decl = RT->getDecl();
if (!Decl->isStruct())
return false;
return Decl->hasFlexibleArrayMember();
}

bool Type::isObjCBoxableRecordType() const {
if (const auto *RT = getAs<RecordType>())
return RT->getDecl()->hasAttr<ObjCBoxableAttr>();
Expand Down
106 changes: 99 additions & 7 deletions clang/lib/Parse/ParseDecl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3312,6 +3312,19 @@ void Parser::ParseAlignmentSpecifier(ParsedAttributes &Attrs,
}
}

void Parser::DistributeCLateParsedAttrs(Decl *Dcl,
LateParsedAttrList *LateAttrs) {
if (!LateAttrs)
return;

if (Dcl) {
for (auto *LateAttr : *LateAttrs) {
if (LateAttr->Decls.empty())
LateAttr->addDecl(Dcl);
}
}
}

/// Bounds attributes (e.g., counted_by):
/// AttrName '(' expression ')'
void Parser::ParseBoundsAttribute(IdentifierInfo &AttrName,
Expand Down Expand Up @@ -4849,13 +4862,14 @@ static void DiagnoseCountAttributedTypeInUnnamedAnon(ParsingDeclSpec &DS,
///
void Parser::ParseStructDeclaration(
ParsingDeclSpec &DS,
llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback) {
llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
LateParsedAttrList *LateFieldAttrs) {

if (Tok.is(tok::kw___extension__)) {
// __extension__ silences extension warnings in the subexpression.
ExtensionRAIIObject O(Diags); // Use RAII to do this.
ConsumeToken();
return ParseStructDeclaration(DS, FieldsCallback);
return ParseStructDeclaration(DS, FieldsCallback, LateFieldAttrs);
}

// Parse leading attributes.
Expand Down Expand Up @@ -4920,10 +4934,12 @@ void Parser::ParseStructDeclaration(
}

// If attributes exist after the declarator, parse them.
MaybeParseGNUAttributes(DeclaratorInfo.D);
MaybeParseGNUAttributes(DeclaratorInfo.D, LateFieldAttrs);

// We're done with this declarator; invoke the callback.
FieldsCallback(DeclaratorInfo);
Decl *Field = FieldsCallback(DeclaratorInfo);
if (Field)
DistributeCLateParsedAttrs(Field, LateFieldAttrs);

// If we don't have a comma, it is either the end of the list (a ';')
// or an error, bail out.
Expand All @@ -4934,6 +4950,73 @@ void Parser::ParseStructDeclaration(
}
}

// TODO: All callers of this function should be moved to
// `Parser::ParseLexedAttributeList`.
void Parser::ParseLexedCAttributeList(LateParsedAttrList &LAs, bool EnterScope,
ParsedAttributes *OutAttrs) {
assert(LAs.parseSoon() &&
"Attribute list should be marked for immediate parsing.");
for (auto *LA : LAs) {
ParseLexedCAttribute(*LA, EnterScope, OutAttrs);
delete LA;
}
LAs.clear();
}

/// Finish parsing an attribute for which parsing was delayed.
/// This will be called at the end of parsing a class declaration
/// for each LateParsedAttribute. We consume the saved tokens and
/// create an attribute with the arguments filled in. We add this
/// to the Attribute list for the decl.
void Parser::ParseLexedCAttribute(LateParsedAttribute &LA, bool EnterScope,
ParsedAttributes *OutAttrs) {
// Create a fake EOF so that attribute parsing won't go off the end of the
// attribute.
Token AttrEnd;
AttrEnd.startToken();
AttrEnd.setKind(tok::eof);
AttrEnd.setLocation(Tok.getLocation());
AttrEnd.setEofData(LA.Toks.data());
LA.Toks.push_back(AttrEnd);

// Append the current token at the end of the new token stream so that it
// doesn't get lost.
LA.Toks.push_back(Tok);
PP.EnterTokenStream(LA.Toks, /*DisableMacroExpansion=*/true,
/*IsReinject=*/true);
// Drop the current token and bring the first cached one. It's the same token
// as when we entered this function.
ConsumeAnyToken(/*ConsumeCodeCompletionTok=*/true);

// TODO: Use `EnterScope`
(void)EnterScope;

ParsedAttributes Attrs(AttrFactory);

assert(LA.Decls.size() <= 1 &&
"late field attribute expects to have at most one declaration.");

// Dispatch based on the attribute and parse it
ParseGNUAttributeArgs(&LA.AttrName, LA.AttrNameLoc, Attrs, nullptr, nullptr,
SourceLocation(), ParsedAttr::Form::GNU(), nullptr);

for (auto *D : LA.Decls)
Actions.ActOnFinishDelayedAttribute(getCurScope(), D, Attrs);

// Due to a parsing error, we either went over the cached tokens or
// there are still cached tokens left, so we skip the leftover tokens.
while (Tok.isNot(tok::eof))
ConsumeAnyToken();

// Consume the fake EOF token if it's there
if (Tok.is(tok::eof) && Tok.getEofData() == AttrEnd.getEofData())
ConsumeAnyToken();

if (OutAttrs) {
OutAttrs->takeAllFrom(Attrs);
}
}

/// ParseStructUnionBody
/// struct-contents:
/// struct-declaration-list
Expand All @@ -4957,6 +5040,11 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
ParseScope StructScope(this, Scope::ClassScope|Scope::DeclScope);
Actions.ActOnTagStartDefinition(getCurScope(), TagDecl);

// `LateAttrParseExperimentalExtOnly=true` requests that only attributes
// marked with `LateAttrParseExperimentalExt` are late parsed.
LateParsedAttrList LateFieldAttrs(/*PSoon=*/true,
/*LateAttrParseExperimentalExtOnly=*/true);

// While we still have something to read, read the declarations in the struct.
while (!tryParseMisplacedModuleImport() && Tok.isNot(tok::r_brace) &&
Tok.isNot(tok::eof)) {
Expand Down Expand Up @@ -5007,18 +5095,19 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
}

if (!Tok.is(tok::at)) {
auto CFieldCallback = [&](ParsingFieldDeclarator &FD) {
auto CFieldCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
// Install the declarator into the current TagDecl.
Decl *Field =
Actions.ActOnField(getCurScope(), TagDecl,
FD.D.getDeclSpec().getSourceRange().getBegin(),
FD.D, FD.BitfieldSize);
FD.complete(Field);
return Field;
};

// Parse all the comma separated declarators.
ParsingDeclSpec DS(*this);
ParseStructDeclaration(DS, CFieldCallback);
ParseStructDeclaration(DS, CFieldCallback, &LateFieldAttrs);
} else { // Handle @defs
ConsumeToken();
if (!Tok.isObjCAtKeyword(tok::objc_defs)) {
Expand Down Expand Up @@ -5059,7 +5148,10 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,

ParsedAttributes attrs(AttrFactory);
// If attributes exist after struct contents, parse them.
MaybeParseGNUAttributes(attrs);
MaybeParseGNUAttributes(attrs, &LateFieldAttrs);

// Late parse field attributes if necessary.
ParseLexedCAttributeList(LateFieldAttrs, /*EnterScope=*/false);

SmallVector<Decl *, 32> FieldDecls(TagDecl->fields());

Expand Down
10 changes: 6 additions & 4 deletions clang/lib/Parse/ParseObjc.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -780,16 +780,16 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
}

bool addedToDeclSpec = false;
auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) {
auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
if (FD.D.getIdentifier() == nullptr) {
Diag(AtLoc, diag::err_objc_property_requires_field_name)
<< FD.D.getSourceRange();
return;
return nullptr;
}
if (FD.BitfieldSize) {
Diag(AtLoc, diag::err_objc_property_bitfield)
<< FD.D.getSourceRange();
return;
return nullptr;
}

// Map a nullability property attribute to a context-sensitive keyword
Expand Down Expand Up @@ -818,6 +818,7 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
MethodImplKind);

FD.complete(Property);
return Property;
};

// Parse all the comma separated declarators.
Expand Down Expand Up @@ -2013,7 +2014,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
continue;
}

auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) {
auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
assert(getObjCDeclContext() == interfaceDecl &&
"Ivar should have interfaceDecl as its decl context");
// Install the declarator into the interface decl.
Expand All @@ -2024,6 +2025,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
if (Field)
AllIvarDecls.push_back(Field);
FD.complete(Field);
return Field;
};

// Parse all the comma separated declarators.
Expand Down
Loading

0 comments on commit 2918973

Please sign in to comment.