Skip to content

clang-format: Add IncludeSortKey option #137840

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

DaanDeMeyer
Copy link
Contributor

@DaanDeMeyer DaanDeMeyer commented Apr 29, 2025

Sorting by stem gives nicer results when various header file names
are substrings of other header file names, for example, a CLI application
with a main header named analyze.h and a analyze-xxx.h header for each
subcommand currently will always put analyze.h last after all the
analyze-xxx.h headers, but putting analyze.h first instead of last is arguable
nicer to read.

TLDR; Instead of

#include "analyze-blame.h"
#include "analyze.h"

You'd get

#include "analyze.h"
#include "analyze-blame.h"

Let's allow sorting by stem instead of full path by introducing a new
option IncludeSortKey with two values "Path" and "Stem".

@llvmbot
Copy link
Member

llvmbot commented Apr 29, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-format

Author: Daan De Meyer (DaanDeMeyer)

Changes

Sorting by stem gives nicer results when various header file names are substrings of other header file names, for example, a CLI application with a main header named analyze.h and a analyze-xxx.h header for each subcommand currently will always put analyze.h last after all the analyze-xxx.h headers, but putting analyze.h first instead of last is arguable nicer to read.

TLDR; Instead of

#include "analyze-blame.h"
#include "analyze.h"

We'll now get

#include "analyze.h"
#include "analyze-blame.h"

Full diff: https://github.com/llvm/llvm-project/pull/137840.diff

1 Files Affected:

  • (modified) clang/lib/Format/Format.cpp (+10-8)
diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 5a1c3f556b331..bc1e681c9be78 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -3219,17 +3219,19 @@ static void sortCppIncludes(const FormatStyle &Style,
 
   if (Style.SortIncludes == FormatStyle::SI_CaseInsensitive) {
     stable_sort(Indices, [&](unsigned LHSI, unsigned RHSI) {
-      const auto LHSFilenameLower = Includes[LHSI].Filename.lower();
-      const auto RHSFilenameLower = Includes[RHSI].Filename.lower();
-      return std::tie(Includes[LHSI].Priority, LHSFilenameLower,
-                      Includes[LHSI].Filename) <
-             std::tie(Includes[RHSI].Priority, RHSFilenameLower,
-                      Includes[RHSI].Filename);
+      const auto LHSStem = llvm::sys::path::stem(Includes[LHSI].Filename);
+      const auto RHSStem = llvm::sys::path::stem(Includes[RHSI].Filename);
+      const auto LHSStemLower = LHSStem.lower();
+      const auto RHSStemLower = RHSStem.lower();
+      return std::tie(Includes[LHSI].Priority, LHSStemLower, LHSStem) <
+             std::tie(Includes[RHSI].Priority, RHSStemLower, RHSStem);
     });
   } else {
     stable_sort(Indices, [&](unsigned LHSI, unsigned RHSI) {
-      return std::tie(Includes[LHSI].Priority, Includes[LHSI].Filename) <
-             std::tie(Includes[RHSI].Priority, Includes[RHSI].Filename);
+      const auto LHSStem = llvm::sys::path::stem(Includes[LHSI].Filename);
+      const auto RHSStem = llvm::sys::path::stem(Includes[RHSI].Filename);
+      return std::tie(Includes[LHSI].Priority, LHSStem) <
+             std::tie(Includes[RHSI].Priority, RHSStem);
     });
   }
 

Copy link
Contributor

@HazardyKnusperkeks HazardyKnusperkeks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You definitely need some tests. And I'm not sure if everyone would like that, of if it should be an option.

@DaanDeMeyer
Copy link
Contributor Author

You definitely need some tests. And I'm not sure if everyone would like that, of if it should be an option.

I was hoping to get some feedback on whether it should be an option or not before adding tests.

DaanDeMeyer added a commit to DaanDeMeyer/systemd that referenced this pull request Apr 29, 2025
This was done by running a locally built clang-format with
llvm/llvm-project#137617 and
llvm/llvm-project#137840 applied on all .c
and .h files.
DaanDeMeyer added a commit to DaanDeMeyer/systemd that referenced this pull request Apr 29, 2025
This was done by running a locally built clang-format with
llvm/llvm-project#137617 and
llvm/llvm-project#137840 applied on all .c
and .h files.
DaanDeMeyer added a commit to DaanDeMeyer/systemd that referenced this pull request Apr 29, 2025
This was done by running a locally built clang-format with
llvm/llvm-project#137617 and
llvm/llvm-project#137840 applied on all .c
and .h files.
DaanDeMeyer added a commit to DaanDeMeyer/systemd that referenced this pull request Apr 29, 2025
This was done by running a locally built clang-format with
llvm/llvm-project#137617 and
llvm/llvm-project#137840 applied on all .c
and .h files.
DaanDeMeyer added a commit to DaanDeMeyer/systemd that referenced this pull request Apr 30, 2025
This was done by running a locally built clang-format with
llvm/llvm-project#137617 and
llvm/llvm-project#137840 applied on all .c
and .h files.
DaanDeMeyer added a commit to DaanDeMeyer/systemd that referenced this pull request Apr 30, 2025
This was done by running a locally built clang-format with
llvm/llvm-project#137617 and
llvm/llvm-project#137840 applied on all .c
and .h files.
Copy link
Contributor

@HazardyKnusperkeks HazardyKnusperkeks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you think should files be handled, that only differ in the caseness of the extension?

@HazardyKnusperkeks
Copy link
Contributor

You definitely need some tests. And I'm not sure if everyone would like that, of if it should be an option.

I was hoping to get some feedback on whether it should be an option or not before adding tests.

I would be in favor.

@owenca
Copy link
Contributor

owenca commented May 1, 2025

IMO, unless it's a bug, we need a new option to ensure it's a non-breaking change.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label May 1, 2025
@DaanDeMeyer DaanDeMeyer changed the title clang-format: Sort includes by stem rather than full filename clang-format: Add IncludeSortKey option May 1, 2025
@DaanDeMeyer
Copy link
Contributor Author

@HazardyKnusperkeks @owenca Updated PR to introduce a new option IncludeSortKey, PTAL!

Copy link

github-actions bot commented May 1, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@DaanDeMeyer
Copy link
Contributor Author

Ping @owenca @HazardyKnusperkeks

@owenca
Copy link
Contributor

owenca commented May 13, 2025

I would expect that file extensions be sorted regardless. For example:

#include "a.h"
#include "a.inc"
#include "a-util.def"
#include "a-util.h"

@DaanDeMeyer
Copy link
Contributor Author

I would expect that file extensions be sorted regardless. For example:

#include "a.h"
#include "a.inc"
#include "a-util.def"
#include "a-util.h"

Addressed

@DaanDeMeyer
Copy link
Contributor Author

@owenca Thanks for the detailed review. Changed the name to IncludeSortIgnoreExtension, let me know if you have a better idea for the name.

owenca added a commit to owenca/llvm-project that referenced this pull request May 19, 2025
This allows adding other suboptions e.g. IgnoreExtension in llvm#137840.
owenca added a commit that referenced this pull request May 19, 2025
This allows adding other suboptions e.g. IgnoreExtension in #137840.
@DaanDeMeyer
Copy link
Contributor Author

@owenca @HazardyKnusperkeks Rebased onto latest main to take advantage of @owenca's rework and addressed comments.

@DaanDeMeyer DaanDeMeyer force-pushed the stem branch 2 times, most recently from 33f84b2 to beab0ec Compare May 19, 2025 11:44
Copy link
Contributor

@mydeveloperday mydeveloperday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have the correct current default

@@ -1647,7 +1647,7 @@ FormatStyle getLLVMStyle(FormatStyle::LanguageKind Language) {
LLVMStyle.SeparateDefinitionBlocks = FormatStyle::SDS_Leave;
LLVMStyle.ShortNamespaceLines = 1;
LLVMStyle.SkipMacroDefinitionBody = false;
LLVMStyle.SortIncludes = {/*Enabled=*/true, /*IgnoreCase=*/false};
LLVMStyle.SortIncludes = {/*Enabled=*/true, /*IgnoreCase=*/false, /*IgnoreExtension=*/false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I understand correctly... true is the current order not false, so the default should be true otherwise you'll cause every user of clang-format to have to add the IgnoreExtension setting if they don't want a change, am I correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mydeveloperday The default is false, by default, the extension is not ignored.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the documentation you show, neither true/false gives you what clang-format gives us today (with just BasedOnStyle: LLVM

which is ...

#include "A.h"
#include "A-util.h"
#include "A.inc"

So lets assume I don't what your feature... how do I keep my current format? what should it be set to true or false?

       true:                      false:
       # include "A.h"             # include "A-util.h"
       # include "A.inc"           # include "A.h"
       # include "A-util.h"        # include "A.inc"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"currently will always put analyze.h last after all the analyze-xxx.h headers, but putting analyze.h first instead of last is arguable nicer to read"

How come for me its doesn't do this as you say...

seems your example might not be the best for the documentation, seems like this is dependent on the basename length being greater than 1 in length. Ok so thats confusing...the documentation should represent what happens.

#include "ab-beta.h"
#include "ab-data.h"
#include "ab.h"
#include "a.h"
#include "a-beta.h"
#include "a-data.h"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mydeveloperday Huh, I don't see how the length of the basename would affect this. I pushed more tests to verify this is the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#include "A.h"
#include "A-util.h"
#include "A.inc"

I don't see how clang-format would ever return this today without disabling include sorting? - sorts before . so A-util.h will always be put before A.h and A.inc today if include sorting is enabled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the file maybe A.cpp?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.. thats it.. it was A.cxx actually..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very confused now what the suggestion is that I should apply here to make everyone happy

@DaanDeMeyer DaanDeMeyer force-pushed the stem branch 2 times, most recently from bb5eb38 to 6a3e7f0 Compare May 19, 2025 20:47
Sorting without taking the file extension into account gives nicer results
when various header file names are substrings of other header file names,
for example, a CLI application with a main header named analyze.h and a
analyze-xxx.h header for each subcommand currently will always put analyze.h
last after all the analyze-xxx.h headers, but putting analyze.h first instead
of last is arguable nicer to read.

TLDR; Instead of

"""
/#include "analyze-blame.h"
/#include "analyze.h"
"""

You'd get

"""
/#include "analyze.h"
/#include "analyze-blame.h"
"""

Let's allow sorting without taking the file extension into account unless two
headers otherwise compare equal by introducing a new boolean option IgnoreExtension
for SortIncludesOptions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category clang-format
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants