Skip to content

[clang-doc] add a JSON generator #142483

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

evelez7
Copy link
Member

@evelez7 evelez7 commented Jun 2, 2025

Adds a JSON generator backend to emit mapped information as JSON. This will enable a better testing format for upcoming changes. It can also potentially serve to feed our other backend generators in the future, like Mustache which already serializes information to JSON before emitting as HTML.

This patch contains functionality to emit classes and provides most of the basis of the generator.

Copy link
Member Author

evelez7 commented Jun 2, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@evelez7 evelez7 requested review from petrhosek and ilovepi June 2, 2025 20:51
Copy link
Member Author

evelez7 commented Jun 2, 2025

I think this patch is mostly ready in terms of functionality, but we should decide if the emitted files should follow the HTML or YAML style of creation. Right now, since I just ripped the Mustache generator code, it creates folders for each namespace and emits the nested entities there. This actually represents a problem for template specializations because the code tries to write another JSON object (the specialization) to the same file as the base template, which is invalid.

With YAML, each entity is emitted in a file that uses the entity's USR as a name. All files are emitted to the output directory without any folders. That conveniently solves the above template problem, but also results in ugly file names that are potentially bad for multi-file testing (right now, we use regex to find a USR file name to test YAML).

I'd say the HTML layout is nice, but then we'd need to decide what to call the specializations' files or whether to just include them as an object inside the base template's file.

@evelez7 evelez7 force-pushed the users/evelez7/clang-doc-json-generator branch from 497637b to cf68952 Compare June 3, 2025 04:37
@evelez7 evelez7 force-pushed the users/evelez7/clang-doc-json-generator branch from cf68952 to fa8b80f Compare June 3, 2025 16:43
Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start, but I thin we'll want some more tests, and probably some unit test coverage. Unittesting is especially nice, since I believe this backend doesn't need any of the asset files, right?

As for whether this should follow the pattern of YAML or HTML ... I'm conflicted for the same reasons you are. But maybe there's a better way for us to generate the filename that doesn't need the USR? like could we get the mangled name and use that? I'd hope those would be different enough to not conflict. Alternatively, maybe we should look at a way we can merge all the related docinfo together at an earlier step.

return LocationObj;
SmallString<128> FileURL(*RepositoryUrl);
sys::path::append(FileURL, sys::path::Style::posix, Loc.Filename);
FileURL += "#" + std::to_string(Loc.StartLineNumber);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a mistake in the mustache impl, since I believe we normally allow the user to set the the line numbering schema in the existing other generators.

Comment on lines +47 to +50
assert((Comment.Kind == "BlockCommandComment" ||
Comment.Kind == "FullComment" || Comment.Kind == "ParagraphComment" ||
Comment.Kind == "TextComment") &&
"Unknown Comment type in CommentInfo.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole bit may be nicer after #142273

@evelez7
Copy link
Member Author

evelez7 commented Jun 3, 2025

This is a good start, but I thin we'll want some more tests, and probably some unit test coverage. Unittesting is especially nice, since I believe this backend doesn't need any of the asset files, right?

Yeah, this doesn't need assets, so I'll be able to call things directly for unit tests.

As for whether this should follow the pattern of YAML or HTML ... I'm conflicted for the same reasons you are. But maybe there's a better way for us to generate the filename that doesn't need the USR? like could we get the mangled name and use that? I'd hope those would be different enough to not conflict. Alternatively, maybe we should look at a way we can merge all the related docinfo together at an earlier step.

I'll investigate this. Although right off the bat, the names in the Infos aren't mangled. Name and FullName are the same for both records, which is unfortunate.

@ilovepi
Copy link
Contributor

ilovepi commented Jun 3, 2025

I'll investigate this. Although right off the bat, the names in the Infos aren't mangled. Name and FullName are the same for both records, which is unfortunate.

OK, well, let's give it a try, and if its too hard lets ... IDK go w/ the YAML thing, so its correct? Testing will be kind of hellish so I'm loathe to go that path, but if one is obviously broken, then we should probably avoid it.

@ilovepi
Copy link
Contributor

ilovepi commented Jun 3, 2025

Right now, since I just ripped the Mustache generator code, it creates folders for each namespace and emits the nested entities there. This actually represents a problem for template specializations because the code tries to write another JSON object (the specialization) to the same file as the base template, which is invalid.

After re-reading this, I have a vague memory that this used to happen for everything, and it somehow got fixed for YAML. Probably back when Brett was working on this.

@evelez7
Copy link
Member Author

evelez7 commented Jun 4, 2025

If we're open to adding a flag in the base Info like IsClassSpecialization, then we can probably easily deal with these by trying to reconstruct the specialization's arguments (will be something like "Foo<T, int>.json"). Functions don't produce their own files so function specializations wont be a problem. Then we can keep this layout. @ilovepi

That or we manually change the name when serializing during mapping. I'm not so sure how well merging the infos would be ergonomically. Specializations can have all of their own unique members and methods which need documentation.

I also found a way to get a nice ugly mangled name :)

@ilovepi
Copy link
Contributor

ilovepi commented Jun 4, 2025

If we're open to adding a flag in the base Info like IsClassSpecialization, then we can probably easily deal with these by trying to reconstruct the specialization's arguments (will be something like "Foo<T, int>.json"). Functions don't produce their own files so function specializations wont be a problem. Then we can keep this layout. @ilovepi

That or we manually change the name when serializing during mapping. I'm not so sure how well merging the infos would be ergonomically. Specializations can have all of their own unique members and methods which need documentation.

This sounds promising. I'm fine w/ adding a field to track this. BTW, what does clang do? I'm wondering if we should track more than 1-bit of info here.

I also found a way to get a nice ugly mangled name :)

Nice! IIRC that's what I've seen a lot of document generators use. If its unique enough for ODR, is probably good enough for docs ... right?

@evelez7
Copy link
Member Author

evelez7 commented Jun 5, 2025

This sounds promising. I'm fine w/ adding a field to track this. BTW, what does clang do? I'm wondering if we should track more than 1-bit of info here.

As far as I can tell, inside the AST Clang makes use of the isa<> mechanisms (which we could also leverage) or pointer unions like

llvm::PointerUnion<ClassTemplateDecl *, MemberSpecializationInfo *>
TemplateOrInstantiation;

@ilovepi
Copy link
Contributor

ilovepi commented Jun 5, 2025

This sounds promising. I'm fine w/ adding a field to track this. BTW, what does clang do? I'm wondering if we should track more than 1-bit of info here.

As far as I can tell, inside the AST Clang makes use of the isa<> mechanisms (which we could also leverage) or pointer unions like

llvm::PointerUnion<ClassTemplateDecl *, MemberSpecializationInfo *>
TemplateOrInstantiation;

Lets keep it simple for now. Clang-Doc doesn't have LLVM style RTTI yet, and I'm cautiously optimistic that we can keep it that way. So lets just add a bool or some bitfields or whatever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants