Skip to content

[Clang][Comments] Support for parsing headers in Doxygen \par commands #91100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 20, 2024

Conversation

hdoc
Copy link
Contributor

@hdoc hdoc commented May 5, 2024

Background

Doxygen's \par command (link) has an optional argument, which denotes the header of the paragraph started by a given \par command.

In short, the paragraph command can be used with a heading, or without one. The code block below shows both forms and how the current version of LLVM/Clang parses this code:

$ cat test.cpp
/// \par User defined paragraph:
/// Contents of the paragraph.
///
/// \par
/// New paragraph under the same heading.
///
/// \par
/// A second paragraph.
class A {};

$ clang++ -cc1 -ast-dump -fcolor-diagnostics -std=c++20 test.cpp
`-CXXRecordDecl 0x1530f3a78 <test.cpp:11:1, col:10> col:7 class A definition
  |-FullComment 0x1530fea38 <line:2:4, line:9:23>
  | |-ParagraphComment 0x1530fe7e0 <line:2:4>
  | | `-TextComment 0x1530fe7b8 <col:4> Text=" "
  | |-BlockCommandComment 0x1530fe800 <col:5, line:3:30> Name="par"
  | | `-ParagraphComment 0x1530fe878 <line:2:9, line:3:30>
  | |   |-TextComment 0x1530fe828 <line:2:9, col:32> Text=" User defined paragraph:"
  | |   `-TextComment 0x1530fe848 <line:3:4, col:30> Text=" Contents of the paragraph."
  | |-ParagraphComment 0x1530fe8c0 <line:5:4>
  | | `-TextComment 0x1530fe898 <col:4> Text=" "
  | |-BlockCommandComment 0x1530fe8e0 <col:5, line:6:41> Name="par"
  | | `-ParagraphComment 0x1530fe930 <col:4, col:41>
  | |   `-TextComment 0x1530fe908 <col:4, col:41> Text=" New paragraph under the same heading."
  | |-ParagraphComment 0x1530fe978 <line:8:4>
  | | `-TextComment 0x1530fe950 <col:4> Text=" "
  | `-BlockCommandComment 0x1530fe998 <col:5, line:9:23> Name="par"
  |   `-ParagraphComment 0x1530fe9e8 <col:4, col:23>
  |     `-TextComment 0x1530fe9c0 <col:4, col:23> Text=" A second paragraph."
  `-CXXRecordDecl 0x1530f3bb0 <line:11:1, col:7> col:7 implicit class A

As we can see above, the optional paragraph heading ("User defined paragraph") is not an argument of the \par BlockCommandComment, but instead a child TextComment.

For documentation generators like hdoc, it would be ideal if we could parse Doxygen documentation comments with these semantics in mind. Currently that's not possible.

Change

This change parses \par command according to how Doxygen parses them, making an optional header available as a an argument if it is present. In addition:

  • AST unit tests are defined to test this functionality when an argument is present, isn't present, with additional spacing, etc.
  • TableGen is updated with an IsParCommand to support this functionality
  • lit tests are updated where needed

Copy link

github-actions bot commented May 5, 2024

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels May 5, 2024
@llvmbot
Copy link
Member

llvmbot commented May 5, 2024

@llvm/pr-subscribers-clang

Author: hdoc (hdoc)

Changes

Background

Doxygen's \par command (link) has an optional argument, which denotes the header of the paragraph started by a given \par command.

In short, the paragraph command can be used with a heading, or without one. The code block below shows both forms and how the current version of LLVM/Clang parses this code:

$ cat test.cpp
/// \par User defined paragraph:
/// Contents of the paragraph.
///
/// \par
/// New paragraph under the same heading.
///
/// \par
/// A second paragraph.
class A {};
$ clang++ -cc1 -ast-dump -fcolor-diagnostics -std=c++20 test.cpp `-CXXRecordDecl 0x1530f3a78 &lt;test.cpp:11:1, col:10&gt; col:7 class A definition
|-FullComment 0x1530fea38 &lt;line:2:4, line:9:23&gt; | |-ParagraphComment 0x1530fe7e0 &lt;line:2:4&gt;
| | `-TextComment 0x1530fe7b8 &lt;col:4&gt; Text=" "
| |-BlockCommandComment 0x1530fe800 &lt;col:5, line:3:30&gt; Name="par"
| | `-ParagraphComment 0x1530fe878 &lt;line:2:9, line:3:30&gt;
| | |-TextComment 0x1530fe828 &lt;line:2:9, col:32&gt; Text=" User defined paragraph:"
| | `-TextComment 0x1530fe848 &lt;line:3:4, col:30&gt; Text=" Contents of the paragraph." | |-ParagraphComment 0x1530fe8c0 &lt;line:5:4&gt;
| | `-TextComment 0x1530fe898 &lt;col:4&gt; Text=" "
| |-BlockCommandComment 0x1530fe8e0 &lt;col:5, line:6:41&gt; Name="par"
| | `-ParagraphComment 0x1530fe930 &lt;col:4, col:41&gt;
| | `-TextComment 0x1530fe908 &lt;col:4, col:41&gt; Text=" New paragraph under the same heading." | |-ParagraphComment 0x1530fe978 &lt;line:8:4&gt;
| | `-TextComment 0x1530fe950 &lt;col:4&gt; Text=" "
|
|
|
`-CXXRecordDecl 0x1530f3bb0 &lt;line:11:1, col:7&gt; col:7 implicit class A
`-BlockCommandComment 0x1530fe998 &lt;col:5, line:9:23&gt; Name="par" `-ParagraphComment 0x1530fe9e8 &lt;col:4, col:23&gt;
`-TextComment 0x1530fe9c0 &lt;col:4, col:23&gt; Text=" A second paragraph."

As we can see above, the optional paragraph heading ("User defined paragraph") is not an argument of the \par BlockCommandComment, but instead a child TextComment.

For documentation generators like hdoc, it would be ideal if we could parse Doxygen documentation comments with these semantics in mind. Currently that's not possible.

Change

This change parses \par command according to how Doxygen parses them, making an optional header available as a an argument if it is present. In addition:

  • AST unit tests are defined to test this functionality when an argument is present, isn't present, with additional spacing, etc.
  • TableGen is updated with an IsParCommand to support this functionality
  • lit tests are updated where needed

Full diff: https://github.com/llvm/llvm-project/pull/91100.diff

7 Files Affected:

  • (modified) clang/include/clang/AST/CommentCommandTraits.h (+4)
  • (modified) clang/include/clang/AST/CommentCommands.td (+2-1)
  • (modified) clang/include/clang/AST/CommentParser.h (+3-1)
  • (modified) clang/lib/AST/CommentParser.cpp (+77)
  • (modified) clang/test/Index/comment-misc-tags.m (+3-5)
  • (modified) clang/unittests/AST/CommentParser.cpp (+138-1)
  • (modified) clang/utils/TableGen/ClangCommentCommandInfoEmitter.cpp (+1)
diff --git a/clang/include/clang/AST/CommentCommandTraits.h b/clang/include/clang/AST/CommentCommandTraits.h
index 0c3254d84eb000..78c484fff3aede 100644
--- a/clang/include/clang/AST/CommentCommandTraits.h
+++ b/clang/include/clang/AST/CommentCommandTraits.h
@@ -88,6 +88,10 @@ struct CommandInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned IsHeaderfileCommand : 1;
 
+  /// True if this is a \\par command.
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned IsParCommand : 1;
+
   /// True if we don't want to warn about this command being passed an empty
   /// paragraph.  Meaningful only for block commands.
   LLVM_PREFERRED_TYPE(bool)
diff --git a/clang/include/clang/AST/CommentCommands.td b/clang/include/clang/AST/CommentCommands.td
index e839031752cdd8..5fd687b0d8991f 100644
--- a/clang/include/clang/AST/CommentCommands.td
+++ b/clang/include/clang/AST/CommentCommands.td
@@ -18,6 +18,7 @@ class Command<string name> {
   bit IsThrowsCommand = 0;
   bit IsDeprecatedCommand = 0;
   bit IsHeaderfileCommand = 0;
+  bit IsParCommand = 0;
 
   bit IsEmptyParagraphAllowed = 0;
 
@@ -156,7 +157,7 @@ def Date       : BlockCommand<"date">;
 def Invariant  : BlockCommand<"invariant">;
 def Li         : BlockCommand<"li">;
 def Note       : BlockCommand<"note">;
-def Par        : BlockCommand<"par">;
+def Par        : BlockCommand<"par"> { let IsParCommand = 1; let NumArgs = 1; }
 def Post       : BlockCommand<"post">;
 def Pre        : BlockCommand<"pre">;
 def Remark     : BlockCommand<"remark">;
diff --git a/clang/include/clang/AST/CommentParser.h b/clang/include/clang/AST/CommentParser.h
index e11e818b1af0a1..b5f1c6c19f0ce7 100644
--- a/clang/include/clang/AST/CommentParser.h
+++ b/clang/include/clang/AST/CommentParser.h
@@ -100,6 +100,9 @@ class Parser {
   ArrayRef<Comment::Argument>
   parseCommandArgs(TextTokenRetokenizer &Retokenizer, unsigned NumArgs);
 
+  ArrayRef<Comment::Argument>
+  parseParCommandArgs(TextTokenRetokenizer &Retokenizer, unsigned NumArgs);
+
   BlockCommandComment *parseBlockCommand();
   InlineCommandComment *parseInlineCommand();
 
@@ -118,4 +121,3 @@ class Parser {
 } // end namespace clang
 
 #endif
-
diff --git a/clang/lib/AST/CommentParser.cpp b/clang/lib/AST/CommentParser.cpp
index 8adfd85d0160c3..54760f5ba932eb 100644
--- a/clang/lib/AST/CommentParser.cpp
+++ b/clang/lib/AST/CommentParser.cpp
@@ -149,6 +149,63 @@ class TextTokenRetokenizer {
     addToken();
   }
 
+  /// Check if this line starts with @par or \par
+  bool startsWithParCommand() {
+    unsigned Offset = 1;
+
+    /// Skip all whitespace characters at the beginning.
+    /// This needs to backtrack because Pos has already advanced past the
+    /// actual \par or @par command by the time this function is called.
+    while (isWhitespace(*(Pos.BufferPtr - Offset)))
+      Offset++;
+
+    /// Check if next four characters are \par or @par
+    llvm::StringRef LineStart(Pos.BufferPtr - 5, 4);
+    return LineStart.starts_with("\\par") || LineStart.starts_with("@par");
+  }
+
+  /// Extract a par command argument-header.
+  bool lexParHeading(Token &Tok) {
+    if (isEnd())
+      return false;
+
+    Position SavedPos = Pos;
+
+    consumeWhitespace();
+    SmallString<32> WordText;
+    const char *WordBegin = Pos.BufferPtr;
+    SourceLocation Loc = getSourceLocation();
+
+    if (!startsWithParCommand())
+      return false;
+
+    // Read until the end of this token, which is effectively the end of the
+    // line This gets us the content of the par header, if there is one.
+    while (!isEnd()) {
+      WordText.push_back(peek());
+      if (Pos.BufferPtr + 1 == Pos.BufferEnd) {
+        consumeChar();
+        break;
+      } else {
+        consumeChar();
+      }
+    }
+
+    const unsigned Length = WordText.size();
+    if (Length == 0) {
+      Pos = SavedPos;
+      return false;
+    }
+
+    char *TextPtr = Allocator.Allocate<char>(Length + 1);
+
+    memcpy(TextPtr, WordText.c_str(), Length + 1);
+    StringRef Text = StringRef(TextPtr, Length);
+
+    formTokenWithChars(Tok, Loc, WordBegin, Length, Text);
+    return true;
+  }
+
   /// Extract a word -- sequence of non-whitespace characters.
   bool lexWord(Token &Tok) {
     if (isEnd())
@@ -304,6 +361,23 @@ Parser::parseCommandArgs(TextTokenRetokenizer &Retokenizer, unsigned NumArgs) {
   return llvm::ArrayRef(Args, ParsedArgs);
 }
 
+ArrayRef<Comment::Argument>
+Parser::parseParCommandArgs(TextTokenRetokenizer &Retokenizer,
+                            unsigned NumArgs) {
+  auto *Args = new (Allocator.Allocate<Comment::Argument>(NumArgs))
+      Comment::Argument[NumArgs];
+  unsigned ParsedArgs = 0;
+  Token Arg;
+
+  while (ParsedArgs < NumArgs && Retokenizer.lexParHeading(Arg)) {
+    Args[ParsedArgs] = Comment::Argument{
+        SourceRange(Arg.getLocation(), Arg.getEndLocation()), Arg.getText()};
+    ParsedArgs++;
+  }
+
+  return llvm::ArrayRef(Args, ParsedArgs);
+}
+
 BlockCommandComment *Parser::parseBlockCommand() {
   assert(Tok.is(tok::backslash_command) || Tok.is(tok::at_command));
 
@@ -356,6 +430,9 @@ BlockCommandComment *Parser::parseBlockCommand() {
       parseParamCommandArgs(PC, Retokenizer);
     else if (TPC)
       parseTParamCommandArgs(TPC, Retokenizer);
+    else if (Info->IsParCommand)
+      S.actOnBlockCommandArgs(BC,
+                              parseParCommandArgs(Retokenizer, Info->NumArgs));
     else
       S.actOnBlockCommandArgs(BC, parseCommandArgs(Retokenizer, Info->NumArgs));
 
diff --git a/clang/test/Index/comment-misc-tags.m b/clang/test/Index/comment-misc-tags.m
index 47ee9d9aa392ab..6d018dbfcf193d 100644
--- a/clang/test/Index/comment-misc-tags.m
+++ b/clang/test/Index/comment-misc-tags.m
@@ -91,18 +91,16 @@ @interface IOCommandGate
   
 struct Test {int filler;};
 
-// CHECK:       (CXComment_BlockCommand CommandName=[par]
+// CHECK:       (CXComment_BlockCommand CommandName=[par] Arg[0]=User defined paragraph:
 // CHECK-NEXT:     (CXComment_Paragraph
-// CHECK-NEXT:        (CXComment_Text Text=[ User defined paragraph:] HasTrailingNewline)
 // CHECK-NEXT:        (CXComment_Text Text=[ Contents of the paragraph.])))
 // CHECK:       (CXComment_BlockCommand CommandName=[par]
 // CHECK-NEXT:     (CXComment_Paragraph
-// CHECK-NEXT:        (CXComment_Text Text=[ New paragraph under the same heading.])))
+// CHECK-NEXT:        (CXComment_Text Text=[New paragraph under the same heading.])))
 // CHECK:       (CXComment_BlockCommand CommandName=[note]
 // CHECK-NEXT:     (CXComment_Paragraph
 // CHECK-NEXT:        (CXComment_Text Text=[ This note consists of two paragraphs.] HasTrailingNewline)
 // CHECK-NEXT:        (CXComment_Text Text=[ This is the first paragraph.])))
 // CHECK:       (CXComment_BlockCommand CommandName=[par]
 // CHECK-NEXT:     (CXComment_Paragraph
-// CHECK-NEXT:     (CXComment_Text Text=[ And this is the second paragraph.])))
-
+// CHECK-NEXT:     (CXComment_Text Text=[And this is the second paragraph.])))
diff --git a/clang/unittests/AST/CommentParser.cpp b/clang/unittests/AST/CommentParser.cpp
index c3479672ae2a3c..d9150a848fcaf3 100644
--- a/clang/unittests/AST/CommentParser.cpp
+++ b/clang/unittests/AST/CommentParser.cpp
@@ -1427,8 +1427,145 @@ TEST_F(CommentParserTest, Deprecated) {
   }
 }
 
+TEST_F(CommentParserTest, ParCommandHasArg1) {
+  const char *Sources[] = {
+      "/// @par Paragraph header:",     "/// @par Paragraph header:\n",
+      "/// @par Paragraph header:\r\n", "/// @par Paragraph header:\n\r",
+      "/** @par Paragraph header:*/",
+  };
+
+  for (size_t i = 0, e = std::size(Sources); i != e; i++) {
+    FullComment *FC = parseString(Sources[i]);
+    ASSERT_TRUE(HasChildCount(FC, 2));
+
+    ASSERT_TRUE(HasParagraphCommentAt(FC, 0, " "));
+    {
+      BlockCommandComment *BCC;
+      ParagraphComment *PC;
+      ASSERT_TRUE(HasBlockCommandAt(FC, Traits, 1, BCC, "par", PC));
+      ASSERT_TRUE(HasChildCount(PC, 0));
+      ASSERT_TRUE(BCC->getNumArgs() == 1);
+      ASSERT_TRUE(BCC->getArgText(0) == "Paragraph header:");
+    }
+  }
+}
+
+TEST_F(CommentParserTest, ParCommandHasArg2) {
+  const char *Sources[] = {
+      "/// @par Paragraph header: ",     "/// @par Paragraph header: \n",
+      "/// @par Paragraph header: \r\n", "/// @par Paragraph header: \n\r",
+      "/** @par Paragraph header: */",
+  };
+
+  for (size_t i = 0, e = std::size(Sources); i != e; i++) {
+    FullComment *FC = parseString(Sources[i]);
+    ASSERT_TRUE(HasChildCount(FC, 2));
+
+    ASSERT_TRUE(HasParagraphCommentAt(FC, 0, " "));
+    {
+      BlockCommandComment *BCC;
+      ParagraphComment *PC;
+      ASSERT_TRUE(HasBlockCommandAt(FC, Traits, 1, BCC, "par", PC));
+      ASSERT_TRUE(HasChildCount(PC, 0));
+      ASSERT_TRUE(BCC->getNumArgs() == 1);
+      ASSERT_TRUE(BCC->getArgText(0) == "Paragraph header: ");
+    }
+  }
+}
+
+TEST_F(CommentParserTest, ParCommandHasArg3) {
+  const char *Sources[] = {
+      ("/// @par Paragraph header:\n"
+       "/// Paragraph body"),
+      ("/// @par Paragraph header:\r\n"
+       "/// Paragraph body"),
+      ("/// @par Paragraph header:\n\r"
+       "/// Paragraph body"),
+  };
+
+  for (size_t i = 0, e = std::size(Sources); i != e; i++) {
+    FullComment *FC = parseString(Sources[i]);
+    ASSERT_TRUE(HasChildCount(FC, 2));
+
+    ASSERT_TRUE(HasParagraphCommentAt(FC, 0, " "));
+    {
+      BlockCommandComment *BCC;
+      ParagraphComment *PC;
+      TextComment *TC;
+      ASSERT_TRUE(HasBlockCommandAt(FC, Traits, 1, BCC, "par", PC));
+      ASSERT_TRUE(HasChildCount(PC, 1));
+      ASSERT_TRUE(BCC->getNumArgs() == 1);
+      ASSERT_TRUE(BCC->getArgText(0) == "Paragraph header:");
+      ASSERT_TRUE(GetChildAt(PC, 0, TC));
+      ASSERT_TRUE(TC->getText() == " Paragraph body");
+    }
+  }
+}
+
+TEST_F(CommentParserTest, ParCommandHasArg4) {
+  const char *Sources[] = {
+      ("/// @par Paragraph header:\n"
+       "/// Paragraph body1\n"
+       "/// Paragraph body2"),
+      ("/// @par Paragraph header:\r\n"
+       "/// Paragraph body1\n"
+       "/// Paragraph body2"),
+      ("/// @par Paragraph header:\n\r"
+       "/// Paragraph body1\n"
+       "/// Paragraph body2"),
+  };
+
+  for (size_t i = 0, e = std::size(Sources); i != e; i++) {
+    FullComment *FC = parseString(Sources[i]);
+    ASSERT_TRUE(HasChildCount(FC, 2));
+
+    ASSERT_TRUE(HasParagraphCommentAt(FC, 0, " "));
+    {
+      BlockCommandComment *BCC;
+      ParagraphComment *PC;
+      TextComment *TC;
+      ASSERT_TRUE(HasBlockCommandAt(FC, Traits, 1, BCC, "par", PC));
+      ASSERT_TRUE(HasChildCount(PC, 2));
+      ASSERT_TRUE(BCC->getNumArgs() == 1);
+      ASSERT_TRUE(BCC->getArgText(0) == "Paragraph header:");
+      ASSERT_TRUE(GetChildAt(PC, 0, TC));
+      ASSERT_TRUE(TC->getText() == " Paragraph body1");
+      ASSERT_TRUE(GetChildAt(PC, 1, TC));
+      ASSERT_TRUE(TC->getText() == " Paragraph body2");
+    }
+  }
+}
+
+TEST_F(CommentParserTest, ParCommandHasArg5) {
+  const char *Sources[] = {
+      ("/// @par \n"
+       "/// Paragraphs with no text before newline have no heading"),
+      ("/// @par \r\n"
+       "/// Paragraphs with no text before newline have no heading"),
+      ("/// @par \n\r"
+       "/// Paragraphs with no text before newline have no heading"),
+  };
+
+  for (size_t i = 0, e = std::size(Sources); i != e; i++) {
+    FullComment *FC = parseString(Sources[i]);
+    ASSERT_TRUE(HasChildCount(FC, 2));
+
+    ASSERT_TRUE(HasParagraphCommentAt(FC, 0, " "));
+    {
+      BlockCommandComment *BCC;
+      ParagraphComment *PC;
+      TextComment *TC;
+      ASSERT_TRUE(HasBlockCommandAt(FC, Traits, 1, BCC, "par", PC));
+      ASSERT_TRUE(HasChildCount(PC, 1));
+      ASSERT_TRUE(BCC->getNumArgs() == 0);
+      ASSERT_TRUE(GetChildAt(PC, 0, TC));
+      ASSERT_TRUE(TC->getText() ==
+                  "Paragraphs with no text before newline have no heading");
+    }
+  }
+}
+
 } // unnamed namespace
 
 } // end namespace comments
 } // end namespace clang
-
diff --git a/clang/utils/TableGen/ClangCommentCommandInfoEmitter.cpp b/clang/utils/TableGen/ClangCommentCommandInfoEmitter.cpp
index a113b02e19995d..07b26dc2f6b8be 100644
--- a/clang/utils/TableGen/ClangCommentCommandInfoEmitter.cpp
+++ b/clang/utils/TableGen/ClangCommentCommandInfoEmitter.cpp
@@ -44,6 +44,7 @@ void clang::EmitClangCommentCommandInfo(RecordKeeper &Records,
        << Tag.getValueAsBit("IsThrowsCommand") << ", "
        << Tag.getValueAsBit("IsDeprecatedCommand") << ", "
        << Tag.getValueAsBit("IsHeaderfileCommand") << ", "
+       << Tag.getValueAsBit("IsParCommand") << ", "
        << Tag.getValueAsBit("IsEmptyParagraphAllowed") << ", "
        << Tag.getValueAsBit("IsVerbatimBlockCommand") << ", "
        << Tag.getValueAsBit("IsVerbatimBlockEndCommand") << ", "

@hdoc hdoc force-pushed the dev/par-command-argument-support branch from ff4e6f3 to 2b18453 Compare May 5, 2024 02:51
@cor3ntin cor3ntin requested review from cor3ntin and AaronBallman May 6, 2024 13:41
@hdoc hdoc force-pushed the dev/par-command-argument-support branch 2 times, most recently from 84dd803 to 362080b Compare June 15, 2024 07:34
@hdoc
Copy link
Contributor Author

hdoc commented Jun 15, 2024

Ping. This is rebased on latest main (which integrates the other comment parser changes we've made) and should be ready to go.

Copy link

github-actions bot commented Jun 15, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@hdoc hdoc force-pushed the dev/par-command-argument-support branch from c6d4b64 to ecadf98 Compare June 15, 2024 20:53
Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Please be sure to add a release note to clang/docs/ReleaseNotes.rst so users know about the new functionality.

@hdoc hdoc force-pushed the dev/par-command-argument-support branch from 487cb53 to 7db1807 Compare June 19, 2024 02:38
@hdoc
Copy link
Contributor Author

hdoc commented Jun 19, 2024

@AaronBallman thanks for the reminder :) the release notes blurb has been added

@AaronBallman AaronBallman merged commit af6acd7 into llvm:main Jun 20, 2024
8 checks passed
Copy link

@hdoc Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested
by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as
the builds can include changes from many authors. It is not uncommon for your
change to be included in a build that fails due to someone else's changes, or
infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself.
This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
llvm#91100)

### Background

Doxygen's `\par` command
([link](https://www.doxygen.nl/manual/commands.html#cmdpar)) has an
optional argument, which denotes the header of the paragraph started by
a given `\par` command.

In short, the paragraph command can be used with a heading, or without
one. The code block below shows both forms and how the current version
of LLVM/Clang parses this code:
```
$ cat test.cpp
/// \par User defined paragraph:
/// Contents of the paragraph.
///
/// \par
/// New paragraph under the same heading.
///
/// \par
/// A second paragraph.
class A {};

$ clang++ -cc1 -ast-dump -fcolor-diagnostics -std=c++20 test.cpp
`-CXXRecordDecl 0x1530f3a78 <test.cpp:11:1, col:10> col:7 class A definition
  |-FullComment 0x1530fea38 <line:2:4, line:9:23>
  | |-ParagraphComment 0x1530fe7e0 <line:2:4>
  | | `-TextComment 0x1530fe7b8 <col:4> Text=" "
  | |-BlockCommandComment 0x1530fe800 <col:5, line:3:30> Name="par"
  | | `-ParagraphComment 0x1530fe878 <line:2:9, line:3:30>
  | |   |-TextComment 0x1530fe828 <line:2:9, col:32> Text=" User defined paragraph:"
  | |   `-TextComment 0x1530fe848 <line:3:4, col:30> Text=" Contents of the paragraph."
  | |-ParagraphComment 0x1530fe8c0 <line:5:4>
  | | `-TextComment 0x1530fe898 <col:4> Text=" "
  | |-BlockCommandComment 0x1530fe8e0 <col:5, line:6:41> Name="par"
  | | `-ParagraphComment 0x1530fe930 <col:4, col:41>
  | |   `-TextComment 0x1530fe908 <col:4, col:41> Text=" New paragraph under the same heading."
  | |-ParagraphComment 0x1530fe978 <line:8:4>
  | | `-TextComment 0x1530fe950 <col:4> Text=" "
  | `-BlockCommandComment 0x1530fe998 <col:5, line:9:23> Name="par"
  |   `-ParagraphComment 0x1530fe9e8 <col:4, col:23>
  |     `-TextComment 0x1530fe9c0 <col:4, col:23> Text=" A second paragraph."
  `-CXXRecordDecl 0x1530f3bb0 <line:11:1, col:7> col:7 implicit class A
```

As we can see above, the optional paragraph heading (`"User defined
paragraph"`) is not an argument of the `\par` `BlockCommandComment`, but
instead a child `TextComment`.

For documentation generators like [hdoc](https://hdoc.io/), it would be
ideal if we could parse Doxygen documentation comments with these
semantics in mind. Currently that's not possible.

### Change

This change parses `\par` command according to how Doxygen parses them,
making an optional header available as a an argument if it is present.
In addition:

- AST unit tests are defined to test this functionality when an argument
is present, isn't present, with additional spacing, etc.
- TableGen is updated with an `IsParCommand` to support this
functionality
- `lit` tests are updated where needed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants