Skip to content

Disable -fdollars-in-identifiers by default #135407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

AaronBallman
Copy link
Collaborator

Clang used to enable -fdollars-in-identifiers by default for compatibility with GCC. However, this is no longer a conforming extension after WG21 P2558R2 and WG14 N2701.

So this disables the dialect by default, which is a potentially breaking change for users.

Note: some inline assembly constructs may use dollars in identifiers. We cannot enable the dialect mode automatically based on the user passing -fasm-blocks because that flag is implied by -fms-extensions which is enabled by default on Windows, and thus would mean we'd be enabling a non-conforming language extension by default on that platform.

Users impacted by the change should explicitly add -fdollars-in-identifiers to their build scripts.

Partially addresses #128939

Clang used to enable -fdollars-in-identifiers by default for
compatibility with GCC. However, this is no longer a conforming
extension after WG21 P2558R2 and WG14 N2701.

So this disables the dialect by default, which is a potentially
breaking change for users.

Note: some inline assembly constructs may use dollars in identifiers.
We cannot enable the dialect mode automatically based on the user
passing -fasm-blocks because that flag is implied by -fms-extensions
which is enabled by default on Windows, and thus would mean we'd be
enabling a non-conforming language extension by default on that
platform.

Users impacted by the change should explicitly add
-fdollars-in-identifiers to their build scripts.

Partially addresses llvm#128939
@AaronBallman AaronBallman added clang Clang issues not falling into any other category c23 clang:frontend Language frontend issues, e.g. anything involving "Sema" accepts-invalid rejects-valid c++23 labels Apr 11, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 11, 2025

@llvm/pr-subscribers-clang-format

@llvm/pr-subscribers-clang

Author: Aaron Ballman (AaronBallman)

Changes

Clang used to enable -fdollars-in-identifiers by default for compatibility with GCC. However, this is no longer a conforming extension after WG21 P2558R2 and WG14 N2701.

So this disables the dialect by default, which is a potentially breaking change for users.

Note: some inline assembly constructs may use dollars in identifiers. We cannot enable the dialect mode automatically based on the user passing -fasm-blocks because that flag is implied by -fms-extensions which is enabled by default on Windows, and thus would mean we'd be enabling a non-conforming language extension by default on that platform.

Users impacted by the change should explicitly add -fdollars-in-identifiers to their build scripts.

Partially addresses #128939


Full diff: https://github.com/llvm/llvm-project/pull/135407.diff

10 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+5)
  • (modified) clang/include/clang/Basic/LangOptions.def (+1-1)
  • (modified) clang/include/clang/Driver/Options.td (+1-1)
  • (modified) clang/test/AST/ByteCode/codegen.m (+1-1)
  • (modified) clang/test/C/drs/dr0xx.c (+6-6)
  • (modified) clang/test/CodeGen/ms-inline-asm.c (+1-1)
  • (modified) clang/test/CodeGenObjC/extern-void-class-decl.m (+1-1)
  • (modified) clang/test/Lexer/dollar-idents.c (+1-1)
  • (added) clang/test/Lexer/gh128939.cpp (+17)
  • (modified) clang/test/Preprocessor/c90.c (+1-1)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 9c45965dc4d82..fe51de6fd5b7c 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -48,6 +48,11 @@ C/C++ Language Potentially Breaking Changes
   ensure they are not caught by these optimizations.  It is also possible to use
   ``-fwrapv-pointer`` or   ``-fno-delete-null-pointer-checks`` to make pointer arithmetic
   on null pointers well-defined. (#GH130734, #GH130742, #GH130952)
+- Use of the dollar sign (``$``) in an identifier is no longer a conforming
+  extension in either C or C++, so ``-fdollars-in-identifiers`` is no longer
+  enabled by default. Use of the dollar sign in an identifier will now be
+  diagnosed as an error unless ``-fdollars-in-identifiers`` is explicitly
+  enabled.
 
 C++ Specific Potentially Breaking Changes
 -----------------------------------------
diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def
index 3879cc7942877..f08e179a38067 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -119,7 +119,7 @@ LANGOPT(WChar             , 1, 0, "wchar_t keyword")
 LANGOPT(Char8             , 1, 0, "char8_t keyword")
 LANGOPT(IEEE128           , 1, 0, "__ieee128 keyword")
 LANGOPT(DeclSpecKeyword   , 1, 0, "__declspec keyword")
-BENIGN_LANGOPT(DollarIdents   , 1, 1, "'$' in identifiers")
+BENIGN_LANGOPT(DollarIdents   , 1, 0, "'$' in identifiers")
 BENIGN_LANGOPT(AsmPreprocessor, 1, 0, "preprocessor in asm mode")
 LANGOPT(GNUMode           , 1, 1, "GNU extensions")
 LANGOPT(GNUKeywords       , 1, 1, "GNU keywords")
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index c1020b234b136..38eb332f40d27 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -2088,7 +2088,7 @@ def fno_discard_value_names : Flag<["-"], "fno-discard-value-names">,
   Group<f_clang_Group>, Visibility<[ClangOption, DXCOption]>,
   HelpText<"Do not discard value names in LLVM IR">;
 defm dollars_in_identifiers : BoolFOption<"dollars-in-identifiers",
-  LangOpts<"DollarIdents">, Default<!strconcat("!", asm_preprocessor.KeyPath)>,
+  LangOpts<"DollarIdents">, DefaultFalse,
   PosFlag<SetTrue, [], [ClangOption], "Allow">,
   NegFlag<SetFalse, [], [ClangOption], "Disallow">,
   BothFlags<[], [ClangOption, CC1Option], " '$' in identifiers">>;
diff --git a/clang/test/AST/ByteCode/codegen.m b/clang/test/AST/ByteCode/codegen.m
index 6139596c6337a..a7b3a100165eb 100644
--- a/clang/test/AST/ByteCode/codegen.m
+++ b/clang/test/AST/ByteCode/codegen.m
@@ -3,7 +3,7 @@
 /// See test/CodeGenObjC/constant-strings.m
 /// Test that we let the APValue we create for ObjCStringLiterals point to the right expression.
 
-// RUN: %clang_cc1 -triple x86_64-macho -emit-llvm -o %t %s -fexperimental-new-constant-interpreter
+// RUN: %clang_cc1 -triple x86_64-macho -fdollars-in-identifiers  -emit-llvm -o %t %s -fexperimental-new-constant-interpreter
 // RUN: FileCheck --check-prefix=CHECK-NEXT < %t %s
 
 // Check that we set alignment 1 on the string.
diff --git a/clang/test/C/drs/dr0xx.c b/clang/test/C/drs/dr0xx.c
index 5fe023deaece9..ffcd63b0cc9a7 100644
--- a/clang/test/C/drs/dr0xx.c
+++ b/clang/test/C/drs/dr0xx.c
@@ -1,9 +1,9 @@
-/* RUN: %clang_cc1 -std=c89 -fsyntax-only -verify=expected,c89only -pedantic -Wno-declaration-after-statement -Wno-c11-extensions %s
-   RUN: %clang_cc1 -std=c89 -fsyntax-only -verify=expected,c89only -pedantic -Wno-declaration-after-statement -Wno-c11-extensions -fno-signed-char %s
-   RUN: %clang_cc1 -std=c99 -fsyntax-only -verify=expected,c99untilc2x -pedantic -Wno-c11-extensions %s
-   RUN: %clang_cc1 -std=c11 -fsyntax-only -verify=expected,c99untilc2x -pedantic %s
-   RUN: %clang_cc1 -std=c17 -fsyntax-only -verify=expected,c99untilc2x -pedantic %s
-   RUN: %clang_cc1 -std=c2x -fsyntax-only -verify=expected,c2xandup -pedantic %s
+/* RUN: %clang_cc1 -std=c89 -fdollars-in-identifiers -fsyntax-only -verify=expected,c89only -pedantic -Wno-declaration-after-statement -Wno-c11-extensions %s
+   RUN: %clang_cc1 -std=c89 -fdollars-in-identifiers -fsyntax-only -verify=expected,c89only -pedantic -Wno-declaration-after-statement -Wno-c11-extensions -fno-signed-char %s
+   RUN: %clang_cc1 -std=c99 -fdollars-in-identifiers -fsyntax-only -verify=expected,c99untilc2x -pedantic -Wno-c11-extensions %s
+   RUN: %clang_cc1 -std=c11 -fdollars-in-identifiers -fsyntax-only -verify=expected,c99untilc2x -pedantic %s
+   RUN: %clang_cc1 -std=c17 -fdollars-in-identifiers -fsyntax-only -verify=expected,c99untilc2x -pedantic %s
+   RUN: %clang_cc1 -std=c2x -fdollars-in-identifiers -fsyntax-only -verify=expected,c2xandup -pedantic %s
  */
 
 /* The following are DRs which do not require tests to demonstrate
diff --git a/clang/test/CodeGen/ms-inline-asm.c b/clang/test/CodeGen/ms-inline-asm.c
index c3eef9a23e166..811e3ccd2a89a 100644
--- a/clang/test/CodeGen/ms-inline-asm.c
+++ b/clang/test/CodeGen/ms-inline-asm.c
@@ -1,5 +1,5 @@
 // REQUIRES: x86-registered-target
-// RUN: %clang_cc1 %s -triple i386-apple-darwin10 -fasm-blocks -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s -triple i386-apple-darwin10 -fdollars-in-identifiers -fasm-blocks -emit-llvm -o - | FileCheck %s
 
 void t1(void) {
 // CHECK: @t1
diff --git a/clang/test/CodeGenObjC/extern-void-class-decl.m b/clang/test/CodeGenObjC/extern-void-class-decl.m
index 826622b94c1bb..55b2f9565439a 100644
--- a/clang/test/CodeGenObjC/extern-void-class-decl.m
+++ b/clang/test/CodeGenObjC/extern-void-class-decl.m
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple x86_64-apple-macosx10.14.0 %s -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-apple-macosx10.14.0 -fdollars-in-identifiers %s -emit-llvm -o - | FileCheck %s
 
 extern void OBJC_CLASS_$_f;
 Class c = (Class)&OBJC_CLASS_$_f;
diff --git a/clang/test/Lexer/dollar-idents.c b/clang/test/Lexer/dollar-idents.c
index a1263b4e572cc..6fd32a13bc591 100644
--- a/clang/test/Lexer/dollar-idents.c
+++ b/clang/test/Lexer/dollar-idents.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -dump-tokens %s 2>&1 | FileCheck %s
+// RUN: %clang_cc1 -dump-tokens -fdollars-in-identifiers %s 2>&1 | FileCheck %s
 // RUN: %clang_cc1 -dump-tokens -x assembler-with-cpp %s 2>&1 | FileCheck %s --check-prefix=CHECK-ASM
 // PR3808
 
diff --git a/clang/test/Lexer/gh128939.cpp b/clang/test/Lexer/gh128939.cpp
new file mode 100644
index 0000000000000..8abd8c8c6ec0a
--- /dev/null
+++ b/clang/test/Lexer/gh128939.cpp
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -E -fdollars-in-identifiers %s 2>&1 | FileCheck %s --check-prefix=CHECK-DOLLARS
+// RUN: %clang_cc1 -E %s 2>&1 | FileCheck %s --check-prefix=CHECK-NO-DOLLARS
+// GH128939
+
+#define FOO$ 10
+#define STR(x) #x
+#define STR2(x) STR(x)
+const char *p = STR2(FOO$);
+
+// CHECK-NO-DOLLARS: const char *p = "$ 10$";
+// CHECK-DOLLARS: const char *p = "10";
+
+#define STR3 STR(
+const char *q = STR3$10);
+
+// CHECK-NO-DOLLARS: const char *q = "$10";
+// CHECK-DOLLARS: const char *q = STR3$10);
diff --git a/clang/test/Preprocessor/c90.c b/clang/test/Preprocessor/c90.c
index 3b9105fe6ee91..2d64c4be1234d 100644
--- a/clang/test/Preprocessor/c90.c
+++ b/clang/test/Preprocessor/c90.c
@@ -7,7 +7,7 @@
 #define foo`bar   /* expected-error {{whitespace required after macro name}} */
 #define foo2!bar  /* expected-warning {{whitespace recommended after macro name}} */
 
-#define foo3$bar  /* expected-error {{'$' in identifier}} */
+#define foo3$bar  /* expected-error {{whitespace required after macro name}} */
 
 /* CHECK-NOT: this comment should be missing
  * CHECK: {{^}}// this comment should be present{{$}}

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to do this, I think we need better diagnostics. Just straight disabling this is going to give very confusing diagnostics to anyone actually using dollar-signs in identifiers.

Some ideas:

  • We can give a warning if we see a "$" adjacent to an identifier without any whitespace separating it.
  • Outside the preprocessor, we can parse a "$" adjacent to an identifier as part of the identifier, with some sort of diagnostic, since it's guaranteed to be an error anyway.

Copy link
Collaborator

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@@ -1,5 +1,5 @@
// REQUIRES: x86-registered-target
// RUN: %clang_cc1 %s -triple i386-apple-darwin10 -fasm-blocks -emit-llvm -o - | FileCheck %s
// RUN: %clang_cc1 %s -triple i386-apple-darwin10 -fdollars-in-identifiers -fasm-blocks -emit-llvm -o - | FileCheck %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just musing... if we wanted this to work automatically, we could implement detection of __asm in the preprocessor, and enable dollars-in-identifiers when lexing inside an __asm block. Detection of __asm at that phase is also necessary to properly handle ; comments, which right now we incorrectly discard after preprocessing. (I'm not sure if there are any other lexing differences for MS __asm that we don't properly handle, but there might be.) But it's probably not worth it for a feature that only works (in MSVC at least) when targeting 32-bit x86. :)

Perhaps a better idea would be to enable -fdollars-in-identifiers under -fms-compatibility, given that MSVC still enables support for dollars in identifiers even in its conforming mode. Though perhaps we can leave that option as a fallback for if someone actually complains.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a better idea would be to enable -fdollars-in-identifiers under -fms-compatibility

I thought about this but explicitly decided against it because that means dollars in identifiers is enabled by default for anyone using Clang built by MSVC and targeting Windows. I think we may want to see what folks run into; maybe we want to enable it only for clang-cl and not clang? Maybe we want to be more clever than that? I dunno.

Copy link
Contributor

@R-Goc R-Goc Apr 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we want to enable it only for clang-cl and not clang?

This sounds like an even less expected corner case that would be very annoying.

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THANKS!

@AaronBallman
Copy link
Collaborator Author

If we're going to do this, I think we need better diagnostics. Just straight disabling this is going to give very confusing diagnostics to anyone actually using dollar-signs in identifiers.

Some ideas:

* We can give a warning if we see a "$" adjacent to an identifier without any whitespace separating it.

* Outside the preprocessor, we can parse a "$" adjacent to an identifier as part of the identifier, with some sort of diagnostic, since it's guaranteed to be an error anyway.

My concern here is with regressing performance of the lexer; testing those conditions when lexing any identifier seems like we'd be spending a lot of time trying to catch a very uncommon issue.

I think this is reasonable as follow-up work, but I don't think it should hold up this PR because this is fixing a conformance issue. (FWIW, I think the behavior you get currently is something we could live with even if that follow-up work never happened.)

@AaronBallman
Copy link
Collaborator Author

If we're going to do this, I think we need better diagnostics. Just straight disabling this is going to give very confusing diagnostics to anyone actually using dollar-signs in identifiers.
Some ideas:

* We can give a warning if we see a "$" adjacent to an identifier without any whitespace separating it.

* Outside the preprocessor, we can parse a "$" adjacent to an identifier as part of the identifier, with some sort of diagnostic, since it's guaranteed to be an error anyway.

My concern here is with regressing performance of the lexer; testing those conditions when lexing any identifier seems like we'd be spending a lot of time trying to catch a very uncommon issue.

I think this is reasonable as follow-up work, but I don't think it should hold up this PR because this is fixing a conformance issue. (FWIW, I think the behavior you get currently is something we could live with even if that follow-up work never happened.)

Are you okay with me punting on this work and landing the changes as-os @efriedma-quic ?

@efriedma-quic
Copy link
Collaborator

If you don't want to do anything fancy, can you just add a warning in Lexer::LexTokenInternal if we see a '$' (where we set the token type to tok::unknown)? That should be easy to implement, and have zero impact on lexer performance.

@shafik shafik requested a review from JDevlieghere April 15, 2025 20:35
@@ -7,7 +7,7 @@
#define foo`bar /* expected-error {{whitespace required after macro name}} */
#define foo2!bar /* expected-warning {{whitespace recommended after macro name}} */

#define foo3$bar /* expected-error {{'$' in identifier}} */
#define foo3$bar /* expected-error {{whitespace required after macro name}} */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this diagnostic change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In -pedentic-errors mode before the patch, dollar signs in identifiers are recognized and so we get # define <identifier> and issue the pedantic diagnostic. With the patch, dollar signs in identifiers are not recognized at all and so we get # define foo $ bar which has no whitespace between the macro name and its replacement list.

@AaronBallman
Copy link
Collaborator Author

If you don't want to do anything fancy, can you just add a warning in Lexer::LexTokenInternal if we see a '$' (where we set the token type to tok::unknown)? That should be easy to implement, and have zero impact on lexer performance.

Okay, that's pretty reasonable. Note, we only emit the diagnostic when finding $ at the start of an identifier, not as a continuation character. Otherwise we'd issue two diagnostics for foo$bar, one for finding the continuation character and one for finding the start of the next token. I added test coverage to be sure we caught $a, a$, and a$b correctly.

@R-Goc
Copy link
Contributor

R-Goc commented Apr 16, 2025

It appears that gcc only disallows '$' in identifiers in C++26 mode. Not sure how relevant that is to clang. bugzilla Or am I misreading that thread?

@AaronBallman
Copy link
Collaborator Author

It appears that gcc only disallows '$' in identifiers in C++26 mode. Not sure how relevant that is to clang. bugzilla Or am I misreading that thread?

I get the same reading, but I am not certain we want to follow suit unless there's some significant fallout from this change (which I don't expect). Disabling dollars in identifiers in all language modes is a more consistent behavior for the extension instead of having it depend on the language mode.

@R-Goc
Copy link
Contributor

R-Goc commented Apr 16, 2025

Doesn't the Linux kernel use it for syscall names? Might be wrong on this.

@AaronBallman
Copy link
Collaborator Author

Doesn't the Linux kernel use it for syscall names? Might be wrong on this.

CC @nathanchance @nickdesaulniers for more information, but even if the kernel does, is it a significant imposition to pass -fdollars-in-identifiers to enable the extension explicitly?

@nathanchance
Copy link
Member

CC @nathanchance @nickdesaulniers for more information

This will indeed break building the Linux kernel for x86_64 at least in my brief test, mainly because it uses the preprocessor for assembly files but there are some macros that are included in both C and assembly that trigger it as well. For example, all these warnings will get upgraded to errors with CONFIG_WERROR (on by default with defconfig and allmodconfig):

In file included from arch/x86/kernel/acpi/wakeup_64.S:11:
arch/x86/include/asm/nospec-branch.h:80:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   80 |         movq    $-1, PER_CPU_VAR(__x86_call_depth);
      |                 ^
arch/x86/include/asm/nospec-branch.h:84:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   84 |         bts     $63, %rax;                                      \
      |                 ^
arch/x86/include/asm/nospec-branch.h:88:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   88 |         movb    $0xfc, %al;                                     \
      |                 ^
arch/x86/include/asm/nospec-branch.h:89:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   89 |         shl     $56, %rax;                                      \
      |                 ^
arch/x86/include/asm/nospec-branch.h:94:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   94 |         sarq    $5, PER_CPU_VAR(__x86_call_depth);              \
      |                 ^
arch/x86/include/asm/nospec-branch.h:142:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  142 |         mov     $(nr/2), reg;                           \
      |                 ^
arch/x86/include/asm/nospec-branch.h:146:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  146 |         add     $(BITS_PER_LONG/8) * 2, %_ASM_SP;       \
      |                 ^
arch/x86/include/asm/nospec-branch.h:176:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  176 |         add     $(BITS_PER_LONG/8), %_ASM_SP;           \
      |                 ^
arch/x86/kernel/acpi/wakeup_64.S:22:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   22 |         movq    $0x123456789abcdef0, %rdx
      |                 ^
arch/x86/kernel/acpi/wakeup_64.S:27:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   27 |         movq $0xbad6d61676963, %rcx
      |              ^
arch/x86/kernel/acpi/wakeup_64.S:31:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   31 |         movw    $__KERNEL_DS, %ax
      |                 ^
arch/x86/kernel/acpi/wakeup_64.S:51:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   51 |         subq    $8, %rsp
      |                 ^
arch/x86/kernel/acpi/wakeup_64.S:55:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   55 |         movq    $saved_context, %rax
      |                 ^
arch/x86/kernel/acpi/wakeup_64.S:74:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   74 |         movq    $.Lresume_point, saved_rip(%rip)
      |                 ^
arch/x86/kernel/acpi/wakeup_64.S:82:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   82 |         addq    $8, %rsp
      |                 ^
arch/x86/kernel/acpi/wakeup_64.S:83:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   83 |         movl    $3, %edi
      |                 ^
arch/x86/kernel/acpi/wakeup_64.S:93:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   93 |         movq    $saved_context, %rax
      |                 ^
arch/x86/kernel/acpi/wakeup_64.S:130:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  130 |         addq    $8, %rsp
      |                 ^
18 warnings generated.

In file included from arch/x86/kernel/ftrace_64.S:12:
...
arch/x86/kernel/ftrace_64.S:82:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   82 |         subq $(FRAME_SIZE), %rsp
      |              ^
arch/x86/kernel/ftrace_64.S:90:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   90 |         movq $0, ORIG_RAX(%rsp)
      |              ^
arch/x86/kernel/ftrace_64.S:114:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  114 |         subq $MCOUNT_INSN_SIZE, %rdi
      |              ^
arch/x86/kernel/ftrace_64.S:130:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  130 |         addq $MCOUNT_REG_SIZE-\save, %rsp
      |              ^
arch/x86/kernel/ftrace_64.S:175:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  175 |         movq $0, CS(%rsp)
      |              ^
arch/x86/kernel/ftrace_64.S:229:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  229 |         movq $__KERNEL_DS, %rcx
      |              ^
arch/x86/kernel/ftrace_64.S:231:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  231 |         movq $__KERNEL_CS, %rcx
      |              ^
arch/x86/kernel/ftrace_64.S:307:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  307 |         add $8, %rsp
      |             ^
arch/x86/kernel/ftrace_64.S:358:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  358 |         subq $(FRAME_SIZE), %rsp
      |              ^
arch/x86/kernel/ftrace_64.S:371:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  371 |         addq $(FRAME_SIZE), %rsp
      |              ^
18 warnings generated.

arch/x86/lib/clear_page_64.S:20:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   20 |         movl $4096/8,%ecx
      |              ^
arch/x86/lib/clear_page_64.S:29:9: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   29 |         movl   $4096/64,%ecx
      |                ^
arch/x86/lib/clear_page_64.S:50:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   50 |         movl $4096,%ecx
      |              ^
arch/x86/lib/clear_page_64.S:69:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   69 |         cmpq $64,%rcx
      |              ^
arch/x86/lib/clear_page_64.S:72:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   72 |         cmp $8,%ecx
      |             ^
arch/x86/lib/clear_page_64.S:90:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   90 |         addq $8,%rdi
      |              ^
arch/x86/lib/clear_page_64.S:91:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   91 |         sub $8,%ecx
      |             ^
arch/x86/lib/clear_page_64.S:93:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   93 |         cmp $8,%ecx
      |             ^
arch/x86/lib/clear_page_64.S:107:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  107 |         addq $64,%rdi
      |              ^
arch/x86/lib/clear_page_64.S:108:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  108 |         subq $64,%rcx
      |              ^
arch/x86/lib/clear_page_64.S:109:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  109 |         cmpq $64,%rcx
      |              ^
arch/x86/lib/clear_page_64.S:111:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  111 |         cmpl $8,%ecx
      |              ^
12 warnings generated.

but even if the kernel does, is it a significant imposition to pass -fdollars-in-identifiers to enable the extension explicitly?

I am sure that would not be too much of a problem (as long as I have sufficient justification, it is unfortunate if there is a break from GCC's behavior) but it will take some time to get it merged and backported to all supported releases.

@efriedma-quic
Copy link
Collaborator

We should disable the warning if we're preprocessing asm, which should mostly solve the issue with warnings.

For the Linux kernel, does anything break other than the warnings?

@AaronBallman
Copy link
Collaborator Author

We should disable the warning if we're preprocessing asm, which should mostly solve the issue with warnings.

Effectively checking for LangOpts.DollarIdents || LangOpts.AsmPreprocessor? Or do you think DollarIdents should be true if AsmPreprocessor is true?

@nathanchance
Copy link
Member

We should disable the warning if we're preprocessing asm, which should mostly solve the issue with warnings.

Yes although the warnings from arch/x86/include/asm/nospec-branch.h do show up from .c files so that would only partially fix it.

In file included from drivers/gpio/gpio-brcmstb.c:5:
In file included from include/linux/gpio/driver.h:8:
In file included from include/linux/irqchip/chained_irq.h:10:
In file included from include/linux/irq.h:14:
In file included from include/linux/spinlock.h:59:
In file included from include/linux/irqflags.h:18:
In file included from arch/x86/include/asm/irqflags.h:9:
arch/x86/include/asm/nospec-branch.h:80:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   80 |         movq    $-1, PER_CPU_VAR(__x86_call_depth);
      |                 ^
arch/x86/include/asm/nospec-branch.h:84:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   84 |         bts     $63, %rax;                                      \
      |                 ^
arch/x86/include/asm/nospec-branch.h:88:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   88 |         movb    $0xfc, %al;                                     \
      |                 ^
arch/x86/include/asm/nospec-branch.h:89:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   89 |         shl     $56, %rax;                                      \
      |                 ^
arch/x86/include/asm/nospec-branch.h:94:7: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
   94 |         sarq    $5, PER_CPU_VAR(__x86_call_depth);              \
      |                 ^
arch/x86/include/asm/nospec-branch.h:142:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  142 |         mov     $(nr/2), reg;                           \
      |                 ^
arch/x86/include/asm/nospec-branch.h:146:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  146 |         add     $(BITS_PER_LONG/8) * 2, %_ASM_SP;       \
      |                 ^
arch/x86/include/asm/nospec-branch.h:176:6: warning: '$' in identifier; did you mean to enable '-fdollars-in-identifiers'? [-Wdollar-in-identifier]
  176 |         add     $(BITS_PER_LONG/8), %_ASM_SP;           \
      |                 ^
8 warnings generated.

For the Linux kernel, does anything break other than the warnings?

Does not look like it (aside from one file that does not respect the global -Werror configuration and includes it regardless).

Since I already tested it, I figured I might as well report it anyways. If I add -fdollars-in-identifiers to CLANG_FLAGS, several things break because it looks like the preprocessor no longer properly expands certain macros that were used to define assembly immediate:

arch/x86/platform/pvh/head.S:73:11: error: unexpected token in argument list
 subl $rva(1b), %ebp
          ^
make[6]: *** [scripts/Makefile.build:335: arch/x86/platform/pvh/head.o] Error 1
make[6]: Target 'arch/x86/platform/pvh/' not remade because of errors.
make[5]: *** [scripts/Makefile.build:461: arch/x86/platform/pvh] Error 2

ld.lld: error: undefined symbol: X86_CR0_PE
>>> referenced by wakeup_asm.S:53 (arch/x86/realmode/rm/wakeup_asm.S:53)
>>>               arch/x86/realmode/rm/wakeup_asm.o:(wakeup_start)

ld.lld: error: undefined symbol: WAKEUP_HEADER_SIGNATURE
>>> referenced by wakeup_asm.S:84 (arch/x86/realmode/rm/wakeup_asm.S:84)
>>>               arch/x86/realmode/rm/wakeup_asm.o:(wakeup_start)

ld.lld: error: undefined symbol: REALMODE_END_SIGNATURE
>>> referenced by wakeup_asm.S:89 (arch/x86/realmode/rm/wakeup_asm.S:89)
>>>               arch/x86/realmode/rm/wakeup_asm.o:(wakeup_start)

ld.lld: error: undefined symbol: WAKEUP_BEHAVIOR_RESTORE_MISC_ENABLE
>>> referenced by wakeup_asm.S:98 (arch/x86/realmode/rm/wakeup_asm.S:98)
>>>               arch/x86/realmode/rm/wakeup_asm.o:(wakeup_start)

ld.lld: error: undefined symbol: MSR_IA32_MISC_ENABLE
>>> referenced by wakeup_asm.S:103 (arch/x86/realmode/rm/wakeup_asm.S:103)
>>>               arch/x86/realmode/rm/wakeup_asm.o:(wakeup_start)
>>> referenced by verify_cpu.S:97 (arch/x86/include/../kernel/verify_cpu.S:97)
>>>               arch/x86/realmode/rm/trampoline_64.o:(verify_cpu)

ld.lld: error: undefined symbol: __KERNEL_DS
>>> referenced by trampoline_64.S:89 (arch/x86/realmode/rm/trampoline_64.S:89)
>>>               arch/x86/realmode/rm/trampoline_64.o:(trampoline_start)
>>> referenced by reboot.S:27 (arch/x86/realmode/rm/reboot.S:27)
>>>               arch/x86/realmode/rm/reboot.o:(machine_real_restart_asm)
>>> referenced by trampoline_64.S:199 (arch/x86/realmode/rm/trampoline_64.S:199)
>>>               arch/x86/realmode/rm/trampoline_64.o:(pa_trampoline_compat)
>>> referenced 1 more times

ld.lld: error: undefined symbol: __KERNEL32_CS
>>> referenced by trampoline_64.S:96 (arch/x86/realmode/rm/trampoline_64.S:96)
>>>               arch/x86/realmode/rm/trampoline_64.o:(trampoline_start)
>>> referenced by reboot.S:35 (arch/x86/realmode/rm/reboot.S:35)
>>>               arch/x86/realmode/rm/reboot.o:(machine_real_restart_asm)
>>> referenced by trampoline_64.S:203 (arch/x86/realmode/rm/trampoline_64.S:203)
>>>               arch/x86/realmode/rm/trampoline_64.o:(pa_trampoline_compat)

ld.lld: error: undefined symbol: REQUIRED_MASK0
>>> referenced by verify_cpu.S:106 (arch/x86/include/../kernel/verify_cpu.S:106)
>>>               arch/x86/realmode/rm/trampoline_64.o:(verify_cpu)
>>> referenced by verify_cpu.S:107 (arch/x86/include/../kernel/verify_cpu.S:107)
>>>               arch/x86/realmode/rm/trampoline_64.o:(verify_cpu)

ld.lld: error: undefined symbol: REQUIRED_MASK1
>>> referenced by verify_cpu.S:117 (arch/x86/include/../kernel/verify_cpu.S:117)
>>>               arch/x86/realmode/rm/trampoline_64.o:(verify_cpu)
>>> referenced by verify_cpu.S:118 (arch/x86/include/../kernel/verify_cpu.S:118)
>>>               arch/x86/realmode/rm/trampoline_64.o:(verify_cpu)

ld.lld: error: undefined symbol: SSE_MASK
>>> referenced by verify_cpu.S:124 (arch/x86/include/../kernel/verify_cpu.S:124)
>>>               arch/x86/realmode/rm/trampoline_64.o:(verify_cpu)
>>> referenced by verify_cpu.S:125 (arch/x86/include/../kernel/verify_cpu.S:125)
>>>               arch/x86/realmode/rm/trampoline_64.o:(verify_cpu)

ld.lld: error: undefined symbol: MSR_K7_HWCR
>>> referenced by verify_cpu.S:129 (arch/x86/include/../kernel/verify_cpu.S:129)
>>>               arch/x86/realmode/rm/trampoline_64.o:(verify_cpu)

ld.lld: error: undefined symbol: TH_FLAGS_SME_ACTIVE_BIT
>>> referenced by trampoline_64.S:142 (arch/x86/realmode/rm/trampoline_64.S:142)
>>>               arch/x86/realmode/rm/trampoline_64.o:(startup_32)

ld.lld: error: undefined symbol: MSR_AMD64_SYSCFG
>>> referenced by trampoline_64.S:144 (arch/x86/realmode/rm/trampoline_64.S:144)
>>>               arch/x86/realmode/rm/trampoline_64.o:(startup_32)

ld.lld: error: undefined symbol: MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT
>>> referenced by trampoline_64.S:146 (arch/x86/realmode/rm/trampoline_64.S:146)
>>>               arch/x86/realmode/rm/trampoline_64.o:(startup_32)

ld.lld: error: undefined symbol: MSR_EFER
>>> referenced by reboot.S:40 (arch/x86/realmode/rm/reboot.S:40)
>>>               arch/x86/realmode/rm/reboot.o:(machine_real_restart_asm)
>>> referenced by trampoline_64.S:164 (arch/x86/realmode/rm/trampoline_64.S:164)
>>>               arch/x86/realmode/rm/trampoline_64.o:(startup_32)

ld.lld: error: undefined symbol: CR0_STATE
>>> referenced by trampoline_64.S:181 (arch/x86/realmode/rm/trampoline_64.S:181)
>>>               arch/x86/realmode/rm/trampoline_64.o:(startup_32)

ld.lld: error: undefined symbol: __KERNEL_CS
>>> referenced by trampoline_64.S:190 (arch/x86/realmode/rm/trampoline_64.S:190)
>>>               arch/x86/realmode/rm/trampoline_64.o:(startup_32)
>>> referenced by trampoline_64.S:246 (arch/x86/realmode/rm/trampoline_64.S:246)
>>>               arch/x86/realmode/rm/trampoline_64.o:(trampoline_start64)

ld.lld: error: undefined symbol: X86_CR4_LA57
>>> referenced by trampoline_64.S:228 (arch/x86/realmode/rm/trampoline_64.S:228)
>>>               arch/x86/realmode/rm/trampoline_64.o:(trampoline_start64)
make[7]: *** [arch/x86/realmode/rm/Makefile:49: arch/x86/realmode/rm/realmode.elf] Error 1
make[7]: Target 'arch/x86/realmode/rm/realmode.bin' not remade because of errors.
make[6]: *** [arch/x86/realmode/Makefile:22: arch/x86/realmode/rm/realmode.bin] Error 2
make[6]: Target 'arch/x86/realmode/' not remade because of errors.
make[5]: *** [scripts/Makefile.build:461: arch/x86/realmode] Error 2

/tmp/entry_64_compat-eb247c.s:956:2: error: expected relocatable expression
 testl $X86_EFLAGS_NT|((1) << (18))|((1) << (8)), 144(%rsp)
 ^
make[6]: *** [scripts/Makefile.build:335: arch/x86/entry/entry_64_compat.o] Error 1

ld.lld: error: undefined symbol: EENTER
>>> referenced by arch/x86/entry/vdso/vsgx.o:(__vdso_sgx_enter_enclave)

ld.lld: error: undefined symbol: ERESUME
>>> referenced by arch/x86/entry/vdso/vsgx.o:(__vdso_sgx_enter_enclave)

ld.lld: error: undefined symbol: SGX_ENCLAVE_RUN_RESERVED_START
>>> referenced by arch/x86/entry/vdso/vsgx.o:(__vdso_sgx_enter_enclave)

ld.lld: error: undefined symbol: SGX_ENCLAVE_RUN_RESERVED_END
>>> referenced by arch/x86/entry/vdso/vsgx.o:(__vdso_sgx_enter_enclave)

ld.lld: error: undefined symbol: EEXIT
>>> referenced by arch/x86/entry/vdso/vsgx.o:(__vdso_sgx_enter_enclave)
make[7]: *** [arch/x86/entry/vdso/Makefile:39: arch/x86/entry/vdso/vdso64.so.dbg] Error 1

ld.lld: error: undefined symbol: __NR_sigreturn
>>> referenced by arch/x86/entry/vdso/vdso32/sigreturn.o:(__kernel_sigreturn)

ld.lld: error: undefined symbol: __NR_rt_sigreturn
>>> referenced by arch/x86/entry/vdso/vdso32/sigreturn.o:(__kernel_rt_sigreturn)

drivers/firmware/efi/libstub/x86-mixed.S:162:17: error: unexpected token in argument list
 movl $GDT_ENTRY(((0x0010 | 0x0080 | 0x0001 | 0x0002 | 0x0008) | 0x8000 | 0x2000), 0, 0xfffff) & 0xffffffff, (%edi,%ecx)
                ^
drivers/firmware/efi/libstub/x86-mixed.S:163:17: error: unexpected token in argument list
 movl $GDT_ENTRY(((0x0010 | 0x0080 | 0x0001 | 0x0002 | 0x0008) | 0x8000 | 0x2000), 0, 0xfffff) >> 32, 4(%edi,%ecx)
                ^
make[8]: *** [scripts/Makefile.build:335: drivers/firmware/efi/libstub/x86-mixed.o] Error 1

@jyknight
Copy link
Member

It appears that gcc only disallows '$' in identifiers in C++26 mode. Not sure how relevant that is to clang. bugzilla Or am I misreading that thread?

As far as I can tell, GCC has not disallowed '$' in identifiers at all (both by testing and by inspection of that patch).

I'm not sure it really makes sense to do this change in Clang, either.

Firstly, it seems entirely unclear if the C++ proposal really intended to require this. The proposal explicitly noted that the change would break using a UCN for dollar-sign in an identifier -- which is pretty suggestive that the authors did not expect it to forbid spelling the identifier with the actual dollar-sign character in implementations supporting that extension ("For extensions that allow, for example, $ in identifiers, no one outside of compiler test suites, is likely to use a UCN to spell that.")

Also, I note that C was repaired to re-permit "$" as an implementation-defined identifier character extension, via "It is implementation-defined if a $ (U+0024, DOLLAR SIGN) may be used as a nondigit character." from https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3145.pdf -- that document seems to indicate that they felt the removal of the permission for this implementation-extension was an unintended defect in the original change.

So, I wonder if this may be C++ defect? That C++ should be re-harmonized with C, and permit implementations to allow $ as an identifier character. And, thus, that we should await such a DR fix, and make no change to Clang's behavior.

@efriedma-quic
Copy link
Collaborator

Effectively checking for LangOpts.DollarIdents || LangOpts.AsmPreprocessor? Or do you think DollarIdents should be true if AsmPreprocessor is true?

Neither. DollarIdents is already off by default in AsmPreprocessor mode. We should just preserve the existing behavior, and not warn.


(I'm not I really understand all the standards-committee stuff, but it seems dubious to have the semantics of $ in preprocessor macros implementation-defined.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepts-invalid c23 c++23 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category clang-format rejects-valid
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants