Skip to content

Conversation

@mizvekov
Copy link
Contributor

@mizvekov mizvekov commented Jul 9, 2025

This is a major change on how we represent nested name qualifications in the AST.

  • The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy.
  • The ElaboratedType node is removed, all type nodes in which it could previously apply to can now store the elaborated keyword and name qualifier, tail allocating when present.
  • TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size.

This patch offers a great performance benefit.

It greatly improves compilation time for stdexec. For one datapoint, for test_on2.cpp in that project, which is the slowest compiling test, this patch improves -c compilation time by about 7.2%, with the -fsyntax-only improvement being at ~12%.

This has great results on compile-time-tracker as well:
image

This patch also further enables other optimziations in the future, and will reduce the performance impact of template specialization resugaring when that lands.

It has some other miscelaneous drive-by fixes.

About the review: Yes the patch is huge, sorry about that. Part of the reason is that I started by the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact.

There is also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work.

How to review: The important changes are all in clang/include/clang/AST and clang/lib/AST, with also important changes in clang/lib/Sema/TreeTransform.h.

The rest and bulk of the changes are mostly consequences of the changes in API.

PS: TagType::getDecl is renamed to getOriginalDecl in this patch, just for easier to rebasing. I plan to rename it back after this lands.

Fixes #136624
Fixes #43179
Fixes #68670
Fixes #92757

mizvekov added a commit that referenced this pull request Oct 3, 2025
This fixes a regression introduced in #147835

When parsing a lambda where the call operator has a late parsed attribute,
we would try to build a 'this' type for the lambda, but in a lambda 'this'
never refers to the lambda class itself.

This late parsed attribute can be added implicitly by the -ftrapping-math flag.

This patch patch makes it so CXXThisScopeRAII ignores lambdas.

This became observable in #147835 because that made clang lazily create
tag types, and it removed a workaround for a lambda dependency bug
where any previously created tag type for the lambda is dicarded after
its dependency is recalculated.

But the 'this' scope created above would defeat this laziness and create
the lambda type too soon, before its dependency was updated.

Since this regression was never released, there are no release notes.

Fixes #161657
mizvekov added a commit that referenced this pull request Oct 3, 2025
This fixes a regression introduced in #147835

When parsing a lambda where the call operator has a late parsed
attribute, we would try to build a 'this' type for the lambda, but in a
lambda 'this' never refers to the lambda class itself.

This late parsed attribute can be added implicitly by the
-ftrapping-math flag.

This patch patch makes it so CXXThisScopeRAII ignores lambdas.

This became observable in #147835 because that made clang lazily create
tag types, and it removed a workaround for a lambda dependency bug where
any previously created tag type for the lambda is discarded after its
dependency is recalculated.

But the 'this' scope created above would defeat this laziness and create
the lambda type too soon, before its dependency was updated.

Since this regression was never released, there are no release notes.

Fixes #161657
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
…161765)

This fixes a regression introduced in llvm#147835

When parsing a lambda where the call operator has a late parsed
attribute, we would try to build a 'this' type for the lambda, but in a
lambda 'this' never refers to the lambda class itself.

This late parsed attribute can be added implicitly by the
-ftrapping-math flag.

This patch patch makes it so CXXThisScopeRAII ignores lambdas.

This became observable in llvm#147835 because that made clang lazily create
tag types, and it removed a workaround for a lambda dependency bug where
any previously created tag type for the lambda is discarded after its
dependency is recalculated.

But the 'this' scope created above would defeat this laziness and create
the lambda type too soon, before its dependency was updated.

Since this regression was never released, there are no release notes.

Fixes llvm#161657
MixedMatched pushed a commit to MixedMatched/llvm-project that referenced this pull request Oct 3, 2025
…161765)

This fixes a regression introduced in llvm#147835

When parsing a lambda where the call operator has a late parsed
attribute, we would try to build a 'this' type for the lambda, but in a
lambda 'this' never refers to the lambda class itself.

This late parsed attribute can be added implicitly by the
-ftrapping-math flag.

This patch patch makes it so CXXThisScopeRAII ignores lambdas.

This became observable in llvm#147835 because that made clang lazily create
tag types, and it removed a workaround for a lambda dependency bug where
any previously created tag type for the lambda is discarded after its
dependency is recalculated.

But the 'this' scope created above would defeat this laziness and create
the lambda type too soon, before its dependency was updated.

Since this regression was never released, there are no release notes.

Fixes llvm#161657
@emaxx-google
Copy link
Contributor

Hello, did these changes modify the semantics of NestedNameSpecifierLoc::getLocalSourceRange()? For a code like void n::C::f() {}, previous versions of Clang returned C from getLocalSourceRange(), but now it's returning n::C. The new behavior seems to contradict the documentation of this method.

@mizvekov
Copy link
Contributor Author

mizvekov commented Oct 9, 2025

Hello, did these changes modify the semantics of NestedNameSpecifierLoc::getLocalSourceRange()? For a code like void n::C::f() {}, previous versions of Clang returned C from getLocalSourceRange(), but now it's returning n::C. The new behavior seems to contradict the documentation of this method.

I think the documentation is right and matches what's happening.
It's just that what's considered local has changed in the case of a NestedNameSpecifier which is a Type.
The type was previously split-up, each chunk being its own nested name specifier.
Now a Type is always a NestedNameSpecifier component by itself.

Sounds like you want to get the beginning of the range as something like:

if (auto TL = NNSL->getAsTypeLoc())
  return TL.getNonPrefixBeginLoc();
else
  return NNSL->getLocalBeginLoc();

I didn't add a helper for this because there was just a single user in the llvm codebase using this pattern.
See the clang-tools-extra/clangd/FindTarget.cpp change from this patch.

@hokein
Copy link
Collaborator

hokein commented Oct 9, 2025

I think the documentation is right and matches what's happening.
It's just that what's considered local has changed in the case of a NestedNameSpecifier which is a Type.
The type was previously split-up, each chunk being its own nested name specifier.
Now a Type is always a NestedNameSpecifier component by itself.

Just my two cents.

I think the term "local" is ambiguous, and my mental model has always been that it refers to the last specifier in a qualified name. This seems to have been the behavior, and the public API doc implies it.

This recent change seems to break that model and introduces behavior that feels inconsistent:

  • For namespace specifier ns1::ns2::, getLocalSourceRange returns the range of "ns2::";
  • For type specifier ns1::C1::, getLocalSourceRange returns the range of "ns1::C1::";

This difference is confusing. While I understand the reasoning behind the internal implementation, API users aren't likely to be aware of it, and the "last piece" model seems more intuitive.

If this new behavior is intended, could we update the comments for getLocalSourceRange() to clarify these new rules and provide examples for each case? or even would it make sense to preserve the old behavior for the sake of API consistency?

@mizvekov
Copy link
Contributor Author

Yeah, all except one user of getLocalSourceRange expected the behavior to be what it is now.

I propose adding a note about this in the documentation, add a new helper which does what getLocalSourceRange used to do, and migrate the single existing user to use that.

@emaxx-google
Copy link
Contributor

emaxx-google commented Oct 12, 2025

Hello, did these changes modify the semantics of NestedNameSpecifierLoc::getLocalSourceRange()? For a code like void n::C::f() {}, previous versions of Clang returned C from getLocalSourceRange(), but now it's returning n::C. The new behavior seems to contradict the documentation of this method.

I think the documentation is right and matches what's happening. It's just that what's considered local has changed in the case of a NestedNameSpecifier which is a Type. The type was previously split-up, each chunk being its own nested name specifier. Now a Type is always a NestedNameSpecifier component by itself.

Sounds like you want to get the beginning of the range as something like:

Thanks, this workaround works for me. It's just that such subtle changes of the API (the function interface wasn't changed, its documentation wasn't changed and changelog wasn't clear on this backwards-incompatible change) are very hard to pinpoint&address in downstream projects.

mizvekov added a commit that referenced this pull request Oct 13, 2025
This adds a note to the documentation of getLocalSourceRange,
and cleans up one of the users of this function which was
changed in #147835
mizvekov added a commit that referenced this pull request Oct 13, 2025
This adds a note to the documentation of getLocalSourceRange,
and cleans up one of the users of this function which was
changed in #147835

This also removes `TypeLoc::getNonPrefixBeginLoc`, which was recently
introduced in the above patch, as that became unused.
@mizvekov
Copy link
Contributor Author

After another look at this, we can remove that single potential user and that's a simplification over the status quo.

See #163206

Also removes TypeLoc::getNonPrefixBeginLoc, which becomes unused after this change.

@emaxx-google see if your problem can be solved with a change similar to the above patch, that would be better than the solution I posted earlier.

I added the note on this patch, but also see that the example in that documentation already reflected that change.

mizvekov added a commit that referenced this pull request Oct 13, 2025
This adds a note to the documentation of getLocalSourceRange,
and cleans up one of the users of this function which was
changed in #147835

This also removes `TypeLoc::getNonPrefixBeginLoc`, which was recently
introduced in the above patch, as that became unused.
mizvekov added a commit that referenced this pull request Oct 13, 2025
This adds a note to the documentation of getLocalSourceRange, and cleans
up one of the users of this function which was changed in
#147835

This also removes `TypeLoc::getNonPrefixBeginLoc`, which was recently
introduced in the above patch, as that became unused.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 13, 2025
This adds a note to the documentation of getLocalSourceRange, and cleans
up one of the users of this function which was changed in
llvm/llvm-project#147835

This also removes `TypeLoc::getNonPrefixBeginLoc`, which was recently
introduced in the above patch, as that became unused.
mizvekov added a commit that referenced this pull request Oct 13, 2025
This rename was made as part of #147835
in order to ease rebasing the PR, and give a nice window for other patches to
get rebased as well.

This has been a while already, so lets go ahead and rename it back.
mizvekov added a commit that referenced this pull request Oct 13, 2025
This rename was made as part of #147835
in order to ease rebasing the PR, and give a nice window for other patches to
get rebased as well.

This has been a while already, so lets go ahead and rename it back.
@mizvekov
Copy link
Contributor Author

mizvekov commented Oct 13, 2025

I just posted #163271 which will rename getOriginalDecl back to getDecl, as I think enough time has passed and there has been a big enough window to facilitate any rebases.

Let me know if anyone needs more time.

Ideally we would like to get this in before the release, so that upstream users that need to support multiple clang versions don't need a bunch of extra #if #elses.

philnik777 added a commit to philnik777/llvm-project that referenced this pull request Oct 14, 2025
philnik777 added a commit to philnik777/llvm-project that referenced this pull request Oct 14, 2025
akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025
This adds a note to the documentation of getLocalSourceRange, and cleans
up one of the users of this function which was changed in
llvm#147835

This also removes `TypeLoc::getNonPrefixBeginLoc`, which was recently
introduced in the above patch, as that became unused.
akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025
mizvekov added a commit that referenced this pull request Oct 15, 2025
This rename was made as part of #147835
in order to ease rebasing the PR, and give a nice window for other patches to
get rebased as well.

This has been a while already, so lets go ahead and rename it back.
mizvekov added a commit that referenced this pull request Oct 15, 2025
This rename was made as part of
#147835 in order to ease
rebasing the PR, and give a nice window for other patches to get rebased
as well.

It has been a while already, so lets go ahead and rename it back.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 15, 2025
…cl (#163271)

This rename was made as part of
llvm/llvm-project#147835 in order to ease
rebasing the PR, and give a nice window for other patches to get rebased
as well.

It has been a while already, so lets go ahead and rename it back.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AArch64 backend:AMDGPU backend:ARC backend:ARM backend:CSKY backend:Hexagon backend:Lanai backend:loongarch backend:MIPS backend:PowerPC backend:RISC-V backend:Sparc backend:SystemZ backend:WebAssembly backend:X86 clang:analysis clang:as-a-library libclang and C++ API clang:bytecode Issues for the clang bytecode constexpr interpreter clang:codegen IR generation bugs: mangling, exceptions, etc. clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang:openmp OpenMP related changes to Clang clang:static analyzer clang Clang issues not falling into any other category clang-tidy clang-tools-extra clangd coroutines C++20 coroutines debuginfo HLSL HLSL Language Support libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. lldb

Projects

None yet