-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[WIP][lldb] Use forward decls with redeclared definitions instead of minimal import for records #95100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
You can test this locally with the following command:darker --check --diff -r 6258b5f610d51d37a79456d660b12c2d8e98500b...caacb57a46f34bf663fa5ab2190b361ce29b255b lldb/test/API/lang/cpp/reference-to-outer-type/TestCppReferenceToOuterClass.py View the diff from darker here.--- TestCppReferenceToOuterClass.py 2024-06-24 18:15:54.000000 +0000
+++ TestCppReferenceToOuterClass.py 2024-06-24 18:35:42.173115 +0000
@@ -4,11 +4,13 @@
from lldbsuite.test.lldbtest import *
from lldbsuite.test import lldbutil
class TestCase(TestBase):
- @expectedFailureAll(setting=('plugin.typesystem.clang.experimental-redecl-completion', 'false'))
+ @expectedFailureAll(
+ setting=("plugin.typesystem.clang.experimental-redecl-completion", "false")
+ )
def test(self):
self.build()
self.dbg.CreateTarget(self.getBuildArtifact("a.out"))
test_var = self.expect_expr("test_var", result_type="In")
nested_member = test_var.GetChildMemberWithName("NestedClassMember")
|
Thank you for working on this. I'm very interested in the results of this effort, as it appears I may end up dabbling into these parts of lldb in the near future. For now just a couple of quick questions (with hopefully not too long answers).
|
For example, attaching LLDB to Clang and evaluating a method of the Another source of overhead are cases where we try to find the definition DIE using a commonly used name such as If we restrict LLDB to only complete fields (but complete methods only as needed), we get much closer to the baseline. Though "lazily" completing methods has other consequences (some features in LLDB rely on all methods being available, e.g., auto-complete, etc.). Apart from that, there still seem to be some redundant global lookups for namespaces and functions that add to the overhead. Once we get closer to the baseline performance I can provide some more accurate benchmark results. Also, happy to hear other suggestions/concerns regarding this.
Good catch! I cherry-picked these commits from the
Good question, I've been having trouble getting small reproducers for this. We've had numerous workarounds around the ASTImporter since the original patch was proposed that addressed these types of issues. E.g., https://reviews.llvm.org/D71378 (which isn't quite the "definition but no fields" situation, but wouldn't be surprised if historically this could've happened in some corner cases of the import mechanism). A common issue I've seen in stacktraces in the past are situtations where we started an |
Ack. Thanks for explaining.
I'm not sure by what you meant by "completing methods" in this paragraph. Does that mean "adding the methods (
Might I ask why? Was it too unwieldy? I can certainly see how a change like this would be hard to keep under a flag, but on the other hand, a risky/wide-sweeping change like this is where a flag would be most useful (e.g., it would enable incremental implementation&bugfixes instead of revert-reapply Yoyo-ing)
Oh yes, I remember that one. You don't have to go out of your way to find these, but I would like to hear about them if you run into any. |
Sorry I was a bit too loose in my terminology. You're correct, LLDB currently adds all methods to the containing class, and it's pretty cheap, because we don't fully complete arguments/return types. With this patch it becomes expensive when the ASTImporter gets involved. The core idea here is that we no longer do minimal import for records. So when we get to importing a class, the ASTImporter imports all the methods that LLDB attached, which in turn import all the parameters and return types, fully. That's where type explosion comes in.
True, and AFAICT the ASTImporter has no notion of that atm (I'd need to confirm this again), so I played around with the idea of not adding all methods to the containing class, and instead add it when the Clang parser asks us (which is tricky to do well because we usually don't know that we're parsing a method call). Another idea was to add more control over which types the ASTImporter imports fully, though I never got a great implementation going for that.
It isn't too bad, but having two implementations of this mechanism seemed like a maintenance burden (e.g., runs the risk of not being removed in the long run, need to run the test-suite with and without the setting, etc.). If the community is open to it, I don't mind. Incremental bugfixing/improvements is a big plus. I guess it might be easier to decide on this once we ironed out the remaining TODOs. For reference, here's what it looks like with the setting: swiftlang#8222 (approximately 50 references to |
I see. Yes, that sounds like a pretty big problem.
Unfortunately, I don't know much about this, but I think that something like this will be necessary, as importing ~everything recursively does not seem feasible. A possibly silly question, but my understanding was that this effort will better align us with the interfaces that are used within clang (for compiling modules I think). Is that correct? And if so, does it mean that clang does this explosive import whenever it imports a type from a (precompiled) module?
Yes, that's a risk I am acutely aware of, and I don't really have satisfying answer to that. FWIW, we do have some history of "successfully" removing parts of lldb that were a maintenance burden (from lldb-mi to Go support), so I think we could pull that off if there is someone who is sufficiently motivated (impacted) by this.
Yeah, let's do that. |
2d90c9f
to
adc2c6e
Compare
…ypes in GetCompleteQualType This patch factors out the completion logic for individual clang::Type's into their own helper functions. During the process I cleaned up a few assumptions (e.g., unnecessary if-guards that could be asserts because these conditions are guaranteed by the `clang::Type::TypeClass` switch in `GetCompleteQualType`). This is mainly motivated by the type-completion rework proposed in llvm#95100.
Yeah, putting this behind a very-explicitly-temporary flag seems reasonable to me, FWIW. & yeah, the analogy to modules re: member function expansion seems relevant as far as I understand things too. |
…types out of GetCompleteQualType (#95402) This patch factors out the completion logic for individual clang::Type's into their own helper functions. During the process I cleaned up a few assumptions (e.g., unnecessary if-guards that could be asserts because these conditions are guaranteed by the `clang::Type::TypeClass` switch in `GetCompleteQualType`). This is mainly motivated by the type-completion rework proposed in #95100.
adc2c6e
to
158c608
Compare
Regarding Clang interfaces, one of the main benefits of this patch is that we can simplify LLDB's interaction with the ASTImporter and also align us with the more tested/robust way of completing types in Clang. E.g., calls to llvm-project/clang/lib/Sema/SemaType.cpp Lines 9078 to 9081 in ad702e0
The
Good question, I haven't dug into the modules import mechanism yet, but if this is true, then that could give us an opportunity to introduce a "lazy method loading" mechanism on the ExternalASTSource that would fit both LLDB and modules? Will try to confirm this (unless someone else knows the answer to this already, e.g., @dwblaikie). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very excited to see this happening!
void CompleteRedeclChain(const clang::Decl *D) override { | ||
for (size_t i = 0; i < Sources.size(); ++i) | ||
for (size_t i = 0; i < Sources.size(); ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while we're here: for (auto source : Sources)
?
@@ -228,6 +246,9 @@ class DWARFASTParserClang : public lldb_private::plugin::dwarf::DWARFASTParser { | |||
void LinkDeclToDIE(clang::Decl *decl, | |||
const lldb_private::plugin::dwarf::DWARFDIE &die); | |||
|
|||
void RegisterDIE(lldb_private::plugin::dwarf::DWARFDebugInfoEntry *die, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a doxygen comment for this (or an override keyword)?
Yeah, I don't have any great info here - only the vague assumption/understanding that this path was more modules-aligned, and so perf improvements for lldb could also be perf improvements (& test coverage) for modules... - but I could be wrong on that conception. |
…for LLDB ExternalASTSources that use the ASTImporter This gives all ExternalASTSource classes a common base class that allows other classes (in LLDB's case ClangASTImporter) to invalidate the redeclaration chains in an ASTContext. The implementation just calls the (private) function to bump the generation counter which is how we track whether a redeclaration chain is out of date. This is used in D101950 to let the ClangASTImporter mark redeclaration chains as out-of-date so that they update themselves via the ExternalASTSource. Patch itself is NFC as it just changes the inheritance of a bunch of classes and changes the LLVM RTTI to make sure we can reliably cast to it. Will be merged with its dependent patch that lets the ClangASTImporter handle redeclaration chains.
… terms of CompleteTagDecl
… ASTImporterDelegate::Imported
…finition into source file
This patch is a part of D101950 which makes LLDB use redeclarations in its internal clang ASTs. This patch does the following: Add an API that allows redeclaring a declaration inside a TypeSystemClang. Make the TypeSystemClang APIs work with redeclarations. Mainly making sure we look at redeclarations of DeclContext (see the added calls to getPrimaryContext which for example finds the definition of a class. See also the test for nested types) Removes code that assumes a 1-1 mapping between Type and Decl (which is no longer true for redeclarations). The biggest offender here are the APIs for creating classes/templates/etc. as they return all CompilerType (which doesn't identify a unique Decl, but only a redeclaration chain). Removed code that tries to eagerly create types for declarations. Types should be created via GetTypeForDecl so that existing types from forward declarations are reused. Otherwise we end up with The patch also makes that we store the TemplateParameterInfos for template specializations in the TypeSystemClang. The only reason for this is that Clang is using a different data structure for storing this info internally so we can't just reuse the information from the forward declaration when creating the redeclaration (see CreateRedeclaration). This can be removed once we updated the DWARF parser and the TypeSystemClang APIs to accept the information as we can find it in the Clang AST. This shouldn't affect LLDB's functionality on its own as the DWARF parser currently doesn't create redeclarations. That's why all the tests are unit tests. The added logic for looking for the right redeclaration should work as before in ASTs that only have single-decl redeclaration chains (which is what we currently do).
…e find definition
…minimal import for records This patch is rewriting the way we move Clang types between different ASTS in Clang. The motivation is that the current approach for 'completing types' in LLDB just doesn't work. We have all kinds of situations where we either have Clang crash due to for example having a Record without DefinitionData but with FieldDecls in it. The reason for that is that at the moment we create record types ([CXX]RecordDecls and ObjCInterfaceDecls) and we always pretend that a type potentially has a definition that somehow could be lazily pulled in. However, Clang doesn't have any interface (at least none that is consistently called) that turns a RecordDecl without DefinitionData into a RecordDecl with DefinitionData. The only relevant API in the ExternalASTSource is CompleteType is suffering from the fact that it's essentially not used by generic Clang code. The part of the ExternalASTSource API that is consistently called to pull in a potential definition is CompleteRedeclChain (which is for example automatically called when calling getDefinition on a RecordDecl). The problem with CompleteRedeclChain however is that it's not supposed to add definition data to the Decl its called on, but instead it should provide a redeclaration that is defining the record. That's a very different system than what we currently have and we can't just swap it out under the hood. To make it short: We probably need to rewrite that part of LLDB. So the solution here is essentially rewriting all our completion code to do this: * Instead of creating these weirdly defined records that have fields but maybe not definition and so on, we only create forward decls at the start. * We then bump the GenerationCounter of the ExternalASTSource of the current AST to force rebuilding of the redeclaration chain. * When Clang asks for the definition of the record, it will ask the ExternalASTSource to complete the redeclaration chain. At this point we build the definition and add it as a second decl. The ASTImporter can now also stop using the minimal mode for records as there is no reason anymore. It just imports first the forward declaration and then the definition when needed.
1970769
to
caacb57
Compare
This patch changes the return type of `GetMetadata` from a `ClangASTMetadata*` to a `std::optional<ClangASTMetadata>`. Except for one call-site (`SetDeclIsForcefullyCompleted`), we never actually make use of the mutability of the returned metadata. And we never make use of the pointer-identity. By passing `ClangASTMetadata` by-value (the type is fairly small, size of 2 64-bit pointers) we'll avoid some questions surrounding the lifetimes/ownership/mutability of this metadata. For consistency, we also change the parameter to `SetMetadata` from `ClangASTMetadata&` to `ClangASTMetadata` (which is an NFC since we copy the data anyway). This came up during some changes we plan to make where we [create redeclaration chains for decls in the LLDB AST](#95100). We want to avoid having to dig out the canonical decl of the declaration chain for retrieving/setting the metadata. It should just be copied across all decls in the chain. This is easier to guarantee when everything is done by-value.
…types out of GetCompleteQualType (llvm#95402) This patch factors out the completion logic for individual clang::Type's into their own helper functions. During the process I cleaned up a few assumptions (e.g., unnecessary if-guards that could be asserts because these conditions are guaranteed by the `clang::Type::TypeClass` switch in `GetCompleteQualType`). This is mainly motivated by the type-completion rework proposed in llvm#95100.
This begins our upstreaming efforts for the changes described in following RFC: https://discourse.llvm.org/t/rfc-lldb-more-reliable-completion-of-record-types/77442
This is a combined patch with all the changes, but the actual implementation that's of interest for reviewers is in the last commit 2d90c9f of this patch (the rest is "mostly" NFC). The main caveat currently is that the more aggressive completion of definitions causes the expression evaluator performance to noticeably regress (mainly because we're doing many more DWARF searches than we used to). We're currently in the process of looking for ways to improve this.
Motivation
The current approach for 'completing types' in LLDB is brittle because we're constructing clang Decls and filling out their definitions in ways in which they weren't designed for. This frequently manifests in crashes where Clang expected a
DefinitionData
to exist on a Decl and thus unconditionally dereferences it, but LLDB didn't fill it out (i.e., complete it) in time. The purpose of this patch is to ensure we allocate a definition for each decl that LLDB creates by the time that Clang needs it.Details
At the moment we create record types (
[CXX]RecordDecls
andObjCInterfaceDecls
) and we always pretend that a type potentially has a definition that eventually would be lazily pulled in. However, Clang doesn't have any interface (at least none that is consistently called) that turns aRecordDecl
withoutDefinitionData
into aRecordDecl
withDefinitionData
. The only relevant API in theExternalASTSource
isCompleteType
, which suffers from the fact that it's essentially not used by generic Clang code.The part of the ExternalASTSource API that is consistently called to pull in a potential definition is
CompleteRedeclChain
(which unconditionally gets invoked when callinggetDefinition
on aRecordDecl
).The approach is as follows:
CurrentGeneration
counter of the current ASTsExternalASTSource
to force rebuilding of the redeclaration chain.getDefinition()
), it will ask theExternalASTSource
to complete the redeclaration chain. At this point we build the definition and add it as a second decl.This means the
ASTImporter
can now also stop using theMinimal
mode for records as there is no reason anymore. It just imports first the forward declaration and then the definition when needed.