Skip to content

Commit

Permalink
[BoundsSafety] Allow 'counted_by' attribute on pointers in structs in…
Browse files Browse the repository at this point in the history
… C (#90786)

Previously the attribute was only allowed on flexible array members.
This patch patch changes this to also allow the attribute on pointer
fields in structs and also allows late parsing of the attribute in some
contexts.

For example this previously wasn't allowed:

```
struct BufferTypeDeclAttributePosition {
  size_t count;
  char* buffer __counted_by(count); // Now allowed
}
```

Note the attribute is prevented on pointee types where the size isn't
known at compile time. In particular pointee types that are:

* Incomplete (e.g. `void`) and sizeless types
* Function types (e.g. the pointee of a function pointer)
* Struct types with a flexible array member

This patch also introduces late parsing of the attribute when used in
the declaration attribute position. For example

```
struct BufferTypeDeclAttributePosition {
  char* buffer __counted_by(count); // Now allowed
  size_t count;
}
```

is now allowed but **only** when passing
`-fexperimental-late-parse-attributes`. The motivation for using late
parsing here is to avoid breaking the data layout of structs in existing
code that want to use the `counted_by` attribute. This patch is the
first use of `LateAttrParseExperimentalExt` in `Attr.td` that was
introduced in a previous patch.

Note by allowing the attribute on struct member pointers this now allows
the possiblity of writing the attribute in the type attribute position.
For example:

```
struct BufferTypeAttributePosition {
  size_t count;
  char *__counted_by(count) buffer; // Now allowed
}
```

However, the attribute in this position is still currently parsed
immediately rather than late parsed. So this will not parse currently:

```
struct BufferTypeAttributePosition {
  char *__counted_by(count) buffer; // Fails to parse
  size_t count;
}
```

The intention is to lift this restriction in future patches. It has not
been done in this patch to keep this size of this commit small.

There are also several other follow up changes that will need to be
addressed in future patches:

* Make late parsing working with anonymous structs (see
`on_pointer_anon_buf` in `attr-counted-by-late-parsed-struct-ptrs.c`).
* Allow `counted_by` on more subjects (e.g. parameters, returns types)
when `-fbounds-safety` is enabled.
* Make use of the attribute on pointer types in code gen (e.g. for
`_builtin_dynamic_object_size` and UBSan's array-bounds checks).

This work is heavily based on a patch originally written by Yeoul Na.

rdar://125400257

Co-authored-by: Dan Liew <dan@su-root.co.uk>
  • Loading branch information
delcypher and delcypher authored May 17, 2024
1 parent 1e5f29a commit 0ec3b97
Show file tree
Hide file tree
Showing 21 changed files with 1,117 additions and 148 deletions.
21 changes: 20 additions & 1 deletion clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -317,7 +317,8 @@ New Compiler Flags

- ``-fexperimental-late-parse-attributes`` enables an experimental feature to
allow late parsing certain attributes in specific contexts where they would
not normally be late parsed.
not normally be late parsed. Currently this allows late parsing the
`counted_by` attribute in C. See `Attribute Changes in Clang`_.

- ``-fseparate-named-sections`` uses separate unique sections for global
symbols in named special sections (i.e. symbols annotated with
Expand Down Expand Up @@ -406,6 +407,24 @@ Attribute Changes in Clang
- The ``clspv_libclc_builtin`` attribute has been added to allow clspv
(`OpenCL-C to Vulkan SPIR-V compiler <https://github.com/google/clspv>`_) to identify functions coming from libclc
(`OpenCL-C builtin library <https://libclc.llvm.org>`_).
- The ``counted_by`` attribute is now allowed on pointers that are members of a
struct in C.

- The ``counted_by`` attribute can now be late parsed in C when
``-fexperimental-late-parse-attributes`` is passed but only when attribute is
used in the declaration attribute position. This allows using the
attribute on existing code where it previously impossible to do so without
re-ordering struct field declarations would break ABI as shown below.

.. code-block:: c
struct BufferTy {
/* Refering to `count` requires late parsing */
char* buffer __counted_by(count);
/* Swapping `buffer` and `count` to avoid late parsing would break ABI */
size_t count;
};
Improvements to Clang's diagnostics
-----------------------------------
Expand Down
1 change: 1 addition & 0 deletions clang/include/clang/AST/Type.h
Original file line number Diff line number Diff line change
Expand Up @@ -2515,6 +2515,7 @@ class alignas(TypeAlignment) Type : public ExtQualsTypeCommonBase {
bool isRecordType() const;
bool isClassType() const;
bool isStructureType() const;
bool isStructureTypeWithFlexibleArrayMember() const;
bool isObjCBoxableRecordType() const;
bool isInterfaceType() const;
bool isStructureOrClassType() const;
Expand Down
3 changes: 2 additions & 1 deletion clang/include/clang/Basic/Attr.td
Original file line number Diff line number Diff line change
Expand Up @@ -2229,7 +2229,8 @@ def TypeNullUnspecified : TypeAttr {
def CountedBy : DeclOrTypeAttr {
let Spellings = [Clang<"counted_by">];
let Subjects = SubjectList<[Field], ErrorDiag>;
let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel">];
let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel", 1>];
let LateParsed = LateAttrParseExperimentalExt;
let ParseArgumentsAsUnevaluated = 1;
let Documentation = [CountedByDocs];
let LangOpts = [COnly];
Expand Down
17 changes: 15 additions & 2 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -6533,8 +6533,10 @@ def warn_superclass_variable_sized_type_not_at_end : Warning<

def err_flexible_array_count_not_in_same_struct : Error<
"'counted_by' field %0 isn't within the same struct as the flexible array">;
def err_counted_by_attr_not_on_flexible_array_member : Error<
"'counted_by' only applies to C99 flexible array members">;
def err_counted_by_attr_not_on_ptr_or_flexible_array_member : Error<
"'counted_by' only applies to pointers or C99 flexible array members">;
def err_counted_by_attr_on_array_not_flexible_array_member : Error<
"'counted_by' on arrays only applies to C99 flexible array members">;
def err_counted_by_attr_refer_to_itself : Error<
"'counted_by' cannot refer to the flexible array member %0">;
def err_counted_by_must_be_in_structure : Error<
Expand All @@ -6549,6 +6551,17 @@ def err_counted_by_attr_refer_to_union : Error<
"'counted_by' argument cannot refer to a union member">;
def note_flexible_array_counted_by_attr_field : Note<
"field %0 declared here">;
def err_counted_by_attr_pointee_unknown_size : Error<
"'counted_by' cannot be applied to %select{"
"a pointer with pointee|" // pointer
"an array with element}0" // array
" of unknown size because %1 is %select{"
"an incomplete type|" // CountedByInvalidPointeeTypeKind::INCOMPLETE
"a sizeless type|" // CountedByInvalidPointeeTypeKind::SIZELESS
"a function type|" // CountedByInvalidPointeeTypeKind::FUNCTION
// CountedByInvalidPointeeTypeKind::FLEXIBLE_ARRAY_MEMBER
"a struct type with a flexible array member"
"}2">;

let CategoryName = "ARC Semantic Issue" in {

Expand Down
7 changes: 6 additions & 1 deletion clang/include/clang/Parse/Parser.h
Original file line number Diff line number Diff line change
Expand Up @@ -1645,6 +1645,8 @@ class Parser : public CodeCompletionHandler {
bool EnterScope, bool OnDefinition);
void ParseLexedAttribute(LateParsedAttribute &LA,
bool EnterScope, bool OnDefinition);
void ParseLexedCAttribute(LateParsedAttribute &LA,
ParsedAttributes *OutAttrs = nullptr);
void ParseLexedMethodDeclarations(ParsingClass &Class);
void ParseLexedMethodDeclaration(LateParsedMethodDeclaration &LM);
void ParseLexedMethodDefs(ParsingClass &Class);
Expand Down Expand Up @@ -2531,7 +2533,8 @@ class Parser : public CodeCompletionHandler {

void ParseStructDeclaration(
ParsingDeclSpec &DS,
llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback);
llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
LateParsedAttrList *LateFieldAttrs = nullptr);

DeclGroupPtrTy ParseTopLevelStmtDecl();

Expand Down Expand Up @@ -3109,6 +3112,8 @@ class Parser : public CodeCompletionHandler {
SourceLocation ScopeLoc,
ParsedAttr::Form Form);

void DistributeCLateParsedAttrs(Decl *Dcl, LateParsedAttrList *LateAttrs);

void ParseBoundsAttribute(IdentifierInfo &AttrName,
SourceLocation AttrNameLoc, ParsedAttributes &Attrs,
IdentifierInfo *ScopeName, SourceLocation ScopeLoc,
Expand Down
3 changes: 2 additions & 1 deletion clang/include/clang/Sema/Sema.h
Original file line number Diff line number Diff line change
Expand Up @@ -11396,7 +11396,8 @@ class Sema final : public SemaBase {
QualType BuildMatrixType(QualType T, Expr *NumRows, Expr *NumColumns,
SourceLocation AttrLoc);

QualType BuildCountAttributedArrayType(QualType WrappedTy, Expr *CountExpr);
QualType BuildCountAttributedArrayOrPointerType(QualType WrappedTy,
Expr *CountExpr);

QualType BuildAddressSpaceAttr(QualType &T, LangAS ASIdx, Expr *AddrSpace,
SourceLocation AttrLoc);
Expand Down
10 changes: 10 additions & 0 deletions clang/lib/AST/Type.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -632,6 +632,16 @@ bool Type::isStructureType() const {
return false;
}

bool Type::isStructureTypeWithFlexibleArrayMember() const {
const auto *RT = getAs<RecordType>();
if (!RT)
return false;
const auto *Decl = RT->getDecl();
if (!Decl->isStruct())
return false;
return Decl->hasFlexibleArrayMember();
}

bool Type::isObjCBoxableRecordType() const {
if (const auto *RT = getAs<RecordType>())
return RT->getDecl()->hasAttr<ObjCBoxableAttr>();
Expand Down
104 changes: 97 additions & 7 deletions clang/lib/Parse/ParseDecl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3288,6 +3288,19 @@ void Parser::ParseAlignmentSpecifier(ParsedAttributes &Attrs,
}
}

void Parser::DistributeCLateParsedAttrs(Decl *Dcl,
LateParsedAttrList *LateAttrs) {
assert(Dcl && "Dcl cannot be null");

if (!LateAttrs)
return;

for (auto *LateAttr : *LateAttrs) {
if (LateAttr->Decls.empty())
LateAttr->addDecl(Dcl);
}
}

/// Bounds attributes (e.g., counted_by):
/// AttrName '(' expression ')'
void Parser::ParseBoundsAttribute(IdentifierInfo &AttrName,
Expand Down Expand Up @@ -4825,13 +4838,14 @@ static void DiagnoseCountAttributedTypeInUnnamedAnon(ParsingDeclSpec &DS,
///
void Parser::ParseStructDeclaration(
ParsingDeclSpec &DS,
llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback) {
llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
LateParsedAttrList *LateFieldAttrs) {

if (Tok.is(tok::kw___extension__)) {
// __extension__ silences extension warnings in the subexpression.
ExtensionRAIIObject O(Diags); // Use RAII to do this.
ConsumeToken();
return ParseStructDeclaration(DS, FieldsCallback);
return ParseStructDeclaration(DS, FieldsCallback, LateFieldAttrs);
}

// Parse leading attributes.
Expand Down Expand Up @@ -4896,10 +4910,12 @@ void Parser::ParseStructDeclaration(
}

// If attributes exist after the declarator, parse them.
MaybeParseGNUAttributes(DeclaratorInfo.D);
MaybeParseGNUAttributes(DeclaratorInfo.D, LateFieldAttrs);

// We're done with this declarator; invoke the callback.
FieldsCallback(DeclaratorInfo);
Decl *Field = FieldsCallback(DeclaratorInfo);
if (Field)
DistributeCLateParsedAttrs(Field, LateFieldAttrs);

// If we don't have a comma, it is either the end of the list (a ';')
// or an error, bail out.
Expand All @@ -4910,6 +4926,69 @@ void Parser::ParseStructDeclaration(
}
}

/// Finish parsing an attribute for which parsing was delayed.
/// This will be called at the end of parsing a class declaration
/// for each LateParsedAttribute. We consume the saved tokens and
/// create an attribute with the arguments filled in. We add this
/// to the Attribute list for the decl.
void Parser::ParseLexedCAttribute(LateParsedAttribute &LA,
ParsedAttributes *OutAttrs) {
// Create a fake EOF so that attribute parsing won't go off the end of the
// attribute.
Token AttrEnd;
AttrEnd.startToken();
AttrEnd.setKind(tok::eof);
AttrEnd.setLocation(Tok.getLocation());
AttrEnd.setEofData(LA.Toks.data());
LA.Toks.push_back(AttrEnd);

// Append the current token at the end of the new token stream so that it
// doesn't get lost.
LA.Toks.push_back(Tok);
PP.EnterTokenStream(LA.Toks, /*DisableMacroExpansion=*/true,
/*IsReinject=*/true);
// Drop the current token and bring the first cached one. It's the same token
// as when we entered this function.
ConsumeAnyToken(/*ConsumeCodeCompletionTok=*/true);

ParsedAttributes Attrs(AttrFactory);

assert(LA.Decls.size() <= 1 &&
"late field attribute expects to have at most one declaration.");

// Dispatch based on the attribute and parse it
const AttributeCommonInfo::Form ParsedForm = ParsedAttr::Form::GNU();
IdentifierInfo *ScopeName = nullptr;
const ParsedAttr::Kind AttrKind =
ParsedAttr::getParsedKind(&LA.AttrName, /*ScopeName=*/ScopeName,
/*SyntaxUsed=*/ParsedForm.getSyntax());
switch (AttrKind) {
case ParsedAttr::Kind::AT_CountedBy:
ParseBoundsAttribute(LA.AttrName, LA.AttrNameLoc, Attrs,
/*ScopeName=*/ScopeName, SourceLocation(),
/*Form=*/ParsedForm);
break;
default:
llvm_unreachable("Unhandled late parsed attribute");
}

for (auto *D : LA.Decls)
Actions.ActOnFinishDelayedAttribute(getCurScope(), D, Attrs);

// Due to a parsing error, we either went over the cached tokens or
// there are still cached tokens left, so we skip the leftover tokens.
while (Tok.isNot(tok::eof))
ConsumeAnyToken();

// Consume the fake EOF token if it's there
if (Tok.is(tok::eof) && Tok.getEofData() == AttrEnd.getEofData())
ConsumeAnyToken();

if (OutAttrs) {
OutAttrs->takeAllFrom(Attrs);
}
}

/// ParseStructUnionBody
/// struct-contents:
/// struct-declaration-list
Expand All @@ -4933,6 +5012,11 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
ParseScope StructScope(this, Scope::ClassScope|Scope::DeclScope);
Actions.ActOnTagStartDefinition(getCurScope(), TagDecl);

// `LateAttrParseExperimentalExtOnly=true` requests that only attributes
// marked with `LateAttrParseExperimentalExt` are late parsed.
LateParsedAttrList LateFieldAttrs(/*PSoon=*/false,
/*LateAttrParseExperimentalExtOnly=*/true);

// While we still have something to read, read the declarations in the struct.
while (!tryParseMisplacedModuleImport() && Tok.isNot(tok::r_brace) &&
Tok.isNot(tok::eof)) {
Expand Down Expand Up @@ -4983,18 +5067,19 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
}

if (!Tok.is(tok::at)) {
auto CFieldCallback = [&](ParsingFieldDeclarator &FD) {
auto CFieldCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
// Install the declarator into the current TagDecl.
Decl *Field =
Actions.ActOnField(getCurScope(), TagDecl,
FD.D.getDeclSpec().getSourceRange().getBegin(),
FD.D, FD.BitfieldSize);
FD.complete(Field);
return Field;
};

// Parse all the comma separated declarators.
ParsingDeclSpec DS(*this);
ParseStructDeclaration(DS, CFieldCallback);
ParseStructDeclaration(DS, CFieldCallback, &LateFieldAttrs);
} else { // Handle @defs
ConsumeToken();
if (!Tok.isObjCAtKeyword(tok::objc_defs)) {
Expand Down Expand Up @@ -5035,7 +5120,12 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,

ParsedAttributes attrs(AttrFactory);
// If attributes exist after struct contents, parse them.
MaybeParseGNUAttributes(attrs);
MaybeParseGNUAttributes(attrs, &LateFieldAttrs);

// Late parse field attributes if necessary.
assert(!getLangOpts().CPlusPlus);
for (auto *LateAttr : LateFieldAttrs)
ParseLexedCAttribute(*LateAttr);

SmallVector<Decl *, 32> FieldDecls(TagDecl->fields());

Expand Down
10 changes: 6 additions & 4 deletions clang/lib/Parse/ParseObjc.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -780,16 +780,16 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
}

bool addedToDeclSpec = false;
auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) {
auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
if (FD.D.getIdentifier() == nullptr) {
Diag(AtLoc, diag::err_objc_property_requires_field_name)
<< FD.D.getSourceRange();
return;
return nullptr;
}
if (FD.BitfieldSize) {
Diag(AtLoc, diag::err_objc_property_bitfield)
<< FD.D.getSourceRange();
return;
return nullptr;
}

// Map a nullability property attribute to a context-sensitive keyword
Expand Down Expand Up @@ -818,6 +818,7 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
MethodImplKind);

FD.complete(Property);
return Property;
};

// Parse all the comma separated declarators.
Expand Down Expand Up @@ -2013,7 +2014,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
continue;
}

auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) {
auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
assert(getObjCDeclContext() == interfaceDecl &&
"Ivar should have interfaceDecl as its decl context");
// Install the declarator into the interface decl.
Expand All @@ -2024,6 +2025,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
if (Field)
AllIvarDecls.push_back(Field);
FD.complete(Field);
return Field;
};

// Parse all the comma separated declarators.
Expand Down
Loading

0 comments on commit 0ec3b97

Please sign in to comment.