Commit 8aab205

Add clang atomic control options and attribute
Add an option and a statement attribute that control the target-specific metadata emitted on atomicrmw instructions in IR.

The RFC for this attribute and option is https://discourse.llvm.org/t/rfc-add-clang-atomic-control-options-and-pragmas/80641. A pragma was proposed originally; it was later changed to a Clang attribute.

The attribute lets users specify one, two, or all three options and must be applied to a compound statement. It can be nested, with inner attributes overriding the options specified by outer attributes or by the target's defaults. The resulting options determine the target-specific metadata added to atomic instructions in the IR.

In addition to the attribute, three new compiler options are introduced: `-f[no-]atomic-remote-memory`, `-f[no-]atomic-fine-grained-memory`, and `-f[no-]atomic-ignore-denormal-mode`. They let users override the default options through the Clang driver and front end. `-m[no-]unsafe-fp-atomics` is aliased to `-f[no-]atomic-ignore-denormal-mode`.

In terms of implementation, the atomic attribute is represented in the AST by the existing AttributedStmt, with minimal changes to AST and Sema. During code generation in Clang, the CodeGenModule maintains the current atomic options, which are used to emit the relevant metadata for atomic instructions. RAII is used to save and restore the atomic options when entering and exiting nested AttributedStmt nodes.
1 parent 8b1d384 commit 8aab205

29 files changed, +1416 -410 lines changed

clang/docs/LanguageExtensions.rst

+149
@@ -5442,6 +5442,155 @@ third argument, can only occur at file scope.
       a = b[i] * c[i] + e;
   }

+Extensions for controlling atomic code generation
+=================================================
+
+The ``[[clang::atomic]]`` statement attribute enables users to control how
+atomic operations are lowered in LLVM IR by conveying additional metadata to
+the backend. The primary goal is to allow users to specify certain options,
+like whether atomic operations may be performed on specific types of memory or
+whether to ignore denormal mode correctness in floating-point operations,
+without affecting the correctness of code that does not rely on these behaviors.
+
+In LLVM, lowering of atomic operations (e.g., ``atomicrmw``) can differ based
+on the target's capabilities. Some backends support native atomic instructions
+only for certain operation types or alignments, or only in specific memory
+regions. Likewise, floating-point atomic instructions may or may not respect
+IEEE denormal requirements. When the user is unconcerned about denormal-mode
+compliance (for performance reasons) or knows that certain atomic operations
+will not be performed on a particular type of memory, extra hints are needed to
+tell the backend how to proceed.
+
+A classic example is an architecture where floating-point atomic add does not
+fully conform to IEEE denormal-mode handling. If the user does not mind ignoring
+that aspect, they would prefer to still emit a faster hardware atomic instruction,
+rather than a fallback or CAS loop. Conversely, on certain GPUs (e.g., AMDGPU),
+memory accessed via PCIe may only support a subset of atomic operations. To ensure
+correct and efficient lowering, the compiler must know whether the user wants to
+allow atomic operations on that type of memory.
+
+The allowed atomic attribute values are now ``remote_memory``, ``fine_grained_memory``,
+and ``ignore_denormal_mode``, each optionally prefixed with ``no_``. The meanings
+are as follows:
+
+- ``remote_memory`` means atomic operations may be performed on remote memory.
+  Prefixing with ``no_`` (i.e. ``no_remote_memory``) indicates that atomic
+  operations should not be performed on remote memory.
+- ``fine_grained_memory`` means atomic operations may be performed on fine-grained
+  memory. Prefixing with ``no_`` (i.e. ``no_fine_grained_memory``) indicates that
+  atomic operations should not be performed on fine-grained memory.
+- ``ignore_denormal_mode`` means that atomic operations are allowed to ignore
+  correctness for denormal mode in floating-point operations, potentially improving
+  performance on architectures that handle denormals inefficiently. The negated form,
+  if specified as ``no_ignore_denormal_mode``, would enforce strict denormal mode
+  correctness.
+
+Within the same atomic attribute, duplicate and conflict values are accepted and the
+last value of conflicting values wins. Multiple atomic attributes are allowed
+for the same compound statement and the last atomic attribute wins.
+
+Without any atomic metadata, LLVM IR defaults to conservative settings for
+correctness: atomic operations are assumed to use remote memory, fine-grained
+memory, and enforce denormal mode correctness (i.e. the equivalent of
+``remote_memory``, ``fine_grained_memory``, and ``no_ignore_denormal_mode``).
+
+The attribute may be applied only to a compound statement and looks like:
+
+.. code-block:: c++
+
+  [[clang::atomic(remote_memory, fine_grained_memory, ignore_denormal_mode)]]
+  {
+    // Atomic instructions in this block carry extra metadata reflecting
+    // these user-specified options.
+  }
+
+You can provide one or more of these options, each optionally prefixed with
+``no_`` to negate that option. Any unspecified option is inherited from the
+global defaults, which can be set by a compiler flag or the target's built-in defaults.
+
+A new compiler option now globally sets the defaults for these atomic-lowering
+options. The command-line format has changed to:
+
+.. code-block:: console
+
+  $ clang -fatomic-remote-memory -fno-atomic-fine-grained-memory -fatomic-ignore-denormal-mode file.cpp
+
+Each option has a corresponding flag:
+``-fatomic-remote-memory`` / ``-fno-atomic-remote-memory``,
+``-fatomic-fine-grained-memory`` / ``-fno-atomic-fine-grained-memory``,
+and ``-fatomic-ignore-denormal-mode`` / ``-fno-atomic-ignore-denormal-mode``.
+
+Code using the ``[[clang::atomic]]`` attribute can then selectively override
+the command-line defaults on a per-block basis. For instance:
+
+.. code-block:: c++
+
+  // Suppose the global defaults assume:
+  //   remote_memory, fine_grained_memory, and no_ignore_denormal_mode
+  //   (for conservative correctness)
+
+  void example() {
+    // Locally override the settings: disable remote_memory and enable
+    // fine_grained_memory.
+    [[clang::atomic(no_remote_memory, fine_grained_memory)]]
+    {
+      // In this block:
+      //   - Atomic operations are not performed on remote memory.
+      //   - Atomic operations are performed on fine-grained memory.
+      //   - The setting for denormal mode remains as the global default
+      //     (typically no_ignore_denormal_mode, enforcing strict denormal mode correctness).
+      // ...
+    }
+  }
+
+Function bodies are not compound statements, so this will not work:
+
+.. code-block:: c++
+
+  void func() [[clang::atomic(remote_memory)]] {  // Wrong: applies to function type
+  }
+
+Use the attribute on a compound statement within the function:
+
+.. code-block:: c++
+
+  void func() {
+    [[clang::atomic(remote_memory)]]
+    {
+      // Atomic operations in this block carry the specified metadata.
+    }
+  }
+
+The ``[[clang::atomic]]`` attribute affects only the code generation of atomic
+instructions within the annotated compound statement. Clang attaches target-specific
+metadata to those atomic instructions in the emitted LLVM IR to guide backend lowering.
+This metadata is fixed at the Clang code generation phase and is not modified by later
+LLVM passes (such as function inlining).
+
+For example, consider:
+
+.. code-block:: cpp
+
+  inline void func() {
+    [[clang::atomic(remote_memory)]]
+    {
+      // Atomic instructions lowered with metadata.
+    }
+  }
+
+  void foo() {
+    [[clang::atomic(no_remote_memory)]]
+    {
+      func(); // Inlined by LLVM, but the metadata from 'func()' remains unchanged.
+    }
+  }
+
+Although current usage focuses on AMDGPU, the mechanism is general. Other
+backends can ignore or implement their own responses to these flags if desired.
+If a target does not understand or enforce these hints, the IR remains valid,
+and the resulting program is still correct (although potentially less optimized
+for that user's needs).
+
 Specifying an attribute for multiple declarations (#pragma clang attribute)
 ===========================================================================

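The override rules in the documentation hunk above (an option keeps its inherited value unless it is named, and the last of several conflicting values wins) can be sketched as a small standalone model. This is only an illustration of the documented semantics, not Clang's implementation: the `Opts` struct and the `applyAtomicAttr` function are hypothetical names introduced here.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>

// Hypothetical stand-in for the three documented options; not Clang's type.
struct Opts {
  bool remote_memory;
  bool fine_grained_memory;
  bool ignore_denormal_mode;
};

// Apply one attribute's argument list on top of the inherited options.
// Tokens are processed left to right, so a later token overrides an earlier
// conflicting one; options that are never named keep the inherited value.
inline Opts applyAtomicAttr(Opts inherited,
                            std::initializer_list<const char *> tokens) {
  Opts result = inherited;
  for (const char *raw : tokens) {
    std::string tok(raw);
    bool value = true;
    if (tok.compare(0, 3, "no_") == 0) { // a "no_" prefix negates the option
      value = false;
      tok.erase(0, 3);
    }
    if (tok == "remote_memory")
      result.remote_memory = value;
    else if (tok == "fine_grained_memory")
      result.fine_grained_memory = value;
    else if (tok == "ignore_denormal_mode")
      result.ignore_denormal_mode = value;
    // Unknown tokens are skipped in this model; the commit adds an error
    // diagnostic for invalid arguments instead.
  }
  return result;
}
```

With inherited defaults of `{true, true, false}`, applying `{"no_remote_memory", "fine_grained_memory"}` in this model clears only `remote_memory` and leaves the denormal setting at its inherited value, which mirrors the `example()` block in the documentation.
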
clang/docs/ReleaseNotes.rst

+6
@@ -132,6 +132,12 @@ Attribute Changes in Clang
   This forces the global to be considered small or large in regards to the
   x86-64 code model, regardless of the code model specified for the compilation.

+- Introduced a new statement attribute ``[[clang::atomic]]`` that enables
+  fine-grained control over atomic code generation on a per-statement basis.
+  Supported options include ``[no_]remote_memory``,
+  ``[no_]fine_grained_memory``, and ``[no_]ignore_denormal_mode``, particularly
+  relevant for AMDGPU targets, where they map to corresponding IR metadata.
+
 Improvements to Clang's diagnostics
 -----------------------------------

clang/include/clang/Basic/Attr.td

+15
@@ -4991,3 +4991,18 @@ def NoTrivialAutoVarInit: InheritableAttr {
   let Documentation = [NoTrivialAutoVarInitDocs];
   let SimpleHandler = 1;
 }
+
+def Atomic : StmtAttr {
+  let Spellings = [Clang<"atomic">];
+  let Args = [VariadicEnumArgument<"AtomicOptions", "ConsumedOption",
+    /*is_string=*/false,
+    ["remote_memory", "no_remote_memory",
+     "fine_grained_memory", "no_fine_grained_memory",
+     "ignore_denormal_mode", "no_ignore_denormal_mode"],
+    ["remote_memory", "no_remote_memory",
+     "fine_grained_memory", "no_fine_grained_memory",
+     "ignore_denormal_mode", "no_ignore_denormal_mode"]>];
+  let Subjects = SubjectList<[CompoundStmt], ErrorDiag, "compound statements">;
+  let Documentation = [AtomicDocs];
+  let StrictEnumParameters = 1;
+}

clang/include/clang/Basic/AttrDocs.td

+15
@@ -8079,6 +8079,21 @@ for details.
   }];
 }

+def AtomicDocs : Documentation {
+  let Category = DocCatStmt;
+  let Content = [{
+The ``atomic`` attribute can be applied to *compound statements* to override or
+further specify the default atomic code-generation behavior, especially on
+targets such as AMDGPU. You can annotate compound statements with options
+to modify how atomic instructions inside that statement are emitted at the IR
+level.
+
+For details, see the documentation for `@atomic
+<http://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-controlling-atomic-code-generation>`_
+
+  }];
+}
+
 def ClangRandomizeLayoutDocs : Documentation {
   let Category = DocCatDecl;
   let Heading = "randomize_layout, no_randomize_layout";

clang/include/clang/Basic/DiagnosticSemaKinds.td

+2
@@ -3286,6 +3286,8 @@ def err_invalid_branch_protection_spec : Error<
   "invalid or misplaced branch protection specification '%0'">;
 def warn_unsupported_branch_protection_spec : Warning<
   "unsupported branch protection specification '%0'">, InGroup<BranchProtection>;
+def err_attribute_invalid_atomic_argument : Error<
+  "invalid argument '%0' to atomic attribute; valid options are: 'remote_memory', 'fine_grained_memory', 'ignore_denormal_mode' (optionally prefixed with 'no_')">;

 def warn_unsupported_target_attribute
     : Warning<"%select{unsupported|duplicate|unknown}0%select{| CPU|"

clang/include/clang/Basic/Features.def

+2
@@ -313,6 +313,8 @@ EXTENSION(datasizeof, LangOpts.CPlusPlus)

 FEATURE(cxx_abi_relative_vtable, LangOpts.CPlusPlus && LangOpts.RelativeCXXABIVTables)

+FEATURE(clang_atomic_attributes, true)
+
 // CUDA/HIP Features
 FEATURE(cuda_noinline_keyword, LangOpts.CUDA)
 EXTENSION(cuda_implicit_host_device_templates, LangOpts.CUDA && LangOpts.OffloadImplicitHostDeviceTemplates)

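Because the hunk above registers `clang_atomic_attributes` as an unconditional feature, source code can probably guard its use of the attribute with `__has_feature`. The following is a minimal sketch under that assumption; the `ATOMIC_OPTIONS` macro and the `bumpCounter` function are hypothetical names introduced only for illustration. The `__atomic_fetch_add` builtin (available in Clang and GCC) is used so that the atomic instruction is emitted directly inside the annotated block rather than inside a called library function.

```cpp
#include <cassert>

// __has_feature is a Clang extension; fall back to 0 on other compilers.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// Hypothetical helper macro: expands to the attribute only when the compiler
// reports the feature added by this commit, and to nothing otherwise.
#if __has_feature(clang_atomic_attributes)
#define ATOMIC_OPTIONS(...) [[clang::atomic(__VA_ARGS__)]]
#else
#define ATOMIC_OPTIONS(...)
#endif

// Atomically increments *counter and returns the previous value.
inline int bumpCounter(int *counter) {
  int previous = 0;
  ATOMIC_OPTIONS(no_remote_memory, no_fine_grained_memory)
  {
    // The attribute applies to atomic instructions generated in this block.
    previous = __atomic_fetch_add(counter, 1, __ATOMIC_RELAXED);
  }
  return previous;
}
```

On compilers without the feature the block is an ordinary compound statement, so the function behaves the same; only the metadata hint is lost.
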
clang/include/clang/Basic/LangOptions.h

+66
@@ -630,6 +630,12 @@ class LangOptions : public LangOptionsBase {
   // WebAssembly target.
   bool NoWasmOpt = false;

+  /// Atomic code-generation options.
+  /// These flags are set directly from the command-line options.
+  bool AtomicRemoteMemory = false;
+  bool AtomicFineGrainedMemory = false;
+  bool AtomicIgnoreDenormalMode = false;
+
   LangOptions();

   /// Set language defaults for the given input language and
@@ -1107,6 +1113,66 @@ inline void FPOptions::applyChanges(FPOptionsOverride FPO) {
   *this = FPO.applyOverrides(*this);
 }

+// The three atomic code-generation options.
+// The canonical (positive) names are:
+//   "remote_memory", "fine_grained_memory", and "ignore_denormal_mode".
+// In attribute or command-line parsing, a token prefixed with "no_" inverts its
+// value.
+enum class AtomicOptionKind {
+  RemoteMemory,       // true means remote memory is enabled.
+  FineGrainedMemory,  // true means fine-grained memory is enabled.
+  IgnoreDenormalMode, // true means ignore floating-point denormals.
+  LANGOPT_ATOMIC_OPTION_LAST = IgnoreDenormalMode,
+};
+
+struct AtomicOptions {
+  // Bitfields for each option.
+  unsigned remote_memory : 1;
+  unsigned fine_grained_memory : 1;
+  unsigned ignore_denormal_mode : 1;
+
+  AtomicOptions()
+      : remote_memory(0), fine_grained_memory(0), ignore_denormal_mode(0) {}
+
+  AtomicOptions(const LangOptions &LO)
+      : remote_memory(LO.AtomicRemoteMemory),
+        fine_grained_memory(LO.AtomicFineGrainedMemory),
+        ignore_denormal_mode(LO.AtomicIgnoreDenormalMode) {}
+
+  bool getOption(AtomicOptionKind Kind) const {
+    switch (Kind) {
+    case AtomicOptionKind::RemoteMemory:
+      return remote_memory;
+    case AtomicOptionKind::FineGrainedMemory:
+      return fine_grained_memory;
+    case AtomicOptionKind::IgnoreDenormalMode:
+      return ignore_denormal_mode;
+    }
+    llvm_unreachable("Invalid AtomicOptionKind");
+  }
+
+  void setOption(AtomicOptionKind Kind, bool Value) {
+    switch (Kind) {
+    case AtomicOptionKind::RemoteMemory:
+      remote_memory = Value;
+      return;
+    case AtomicOptionKind::FineGrainedMemory:
+      fine_grained_memory = Value;
+      return;
+    case AtomicOptionKind::IgnoreDenormalMode:
+      ignore_denormal_mode = Value;
+      return;
+    }
+    llvm_unreachable("Invalid AtomicOptionKind");
+  }
+
+  LLVM_DUMP_METHOD void dump() const {
+    llvm::errs() << "\n remote_memory: " << remote_memory
+                 << "\n fine_grained_memory: " << fine_grained_memory
+                 << "\n ignore_denormal_mode: " << ignore_denormal_mode << "\n";
+  }
+};
+
 /// Describes the kind of translation unit being processed.
 enum TranslationUnitKind {
   /// The translation unit is a complete translation unit.

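The commit message says that RAII is used to save and restore the current atomic options around nested AttributedStmt nodes, but the code that does so is not part of the hunks shown here. The sketch below is a standalone model of that idea, not the Clang code: `AtomicOptions` is a trimmed copy of the struct above, while `CodeGenState` and `AtomicOptionsScope` are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Trimmed copy of the AtomicOptions struct from the hunk above.
struct AtomicOptions {
  unsigned remote_memory : 1;
  unsigned fine_grained_memory : 1;
  unsigned ignore_denormal_mode : 1;

  AtomicOptions()
      : remote_memory(0), fine_grained_memory(0), ignore_denormal_mode(0) {}
};

// Hypothetical holder of the "current" options. According to the commit
// message, CodeGenModule plays this role in Clang.
struct CodeGenState {
  AtomicOptions Current;
};

// Hypothetical RAII guard: installs new options on construction and restores
// the previously active options on destruction, so nested scopes unwind in
// the reverse order they were entered.
class AtomicOptionsScope {
  CodeGenState &State;
  AtomicOptions Saved;

public:
  AtomicOptionsScope(CodeGenState &S, AtomicOptions New)
      : State(S), Saved(S.Current) {
    State.Current = New;
  }
  ~AtomicOptionsScope() { State.Current = Saved; }

  AtomicOptionsScope(const AtomicOptionsScope &) = delete;
  AtomicOptionsScope &operator=(const AtomicOptionsScope &) = delete;
};
```

Tying restoration to the destructor means the outer options come back on every exit path from the inner scope, which is the usual reason to prefer a guard object over paired save and restore calls.
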
clang/include/clang/Basic/TargetInfo.h

+6 -4
@@ -301,6 +301,9 @@ class TargetInfo : public TransferrableTargetInfo,
   // in function attributes in IR.
   llvm::StringSet<> ReadOnlyFeatures;

+  // Default atomic options
+  AtomicOptions AtomicOpts;
+
 public:
   /// Construct a target for the given options.
   ///
@@ -1060,10 +1063,6 @@ class TargetInfo : public TransferrableTargetInfo,
   /// available on this target.
   bool hasRISCVVTypes() const { return HasRISCVVTypes; }

-  /// Returns whether or not the AMDGPU unsafe floating point atomics are
-  /// allowed.
-  bool allowAMDGPUUnsafeFPAtomics() const { return AllowAMDGPUUnsafeFPAtomics; }
-
   /// For ARM targets returns a mask defining which coprocessors are configured
   /// as Custom Datapath.
   uint32_t getARMCDECoprocMask() const { return ARMCDECoprocMask; }
@@ -1699,6 +1698,9 @@ class TargetInfo : public TransferrableTargetInfo,
     return CC_C;
   }

+  /// Get the default atomic options.
+  AtomicOptions getAtomicOpts() const { return AtomicOpts; }
+
   enum CallingConvCheckResult {
     CCCR_OK,
     CCCR_Warning,

clang/include/clang/Basic/TargetOptions.h

-3
@@ -75,9 +75,6 @@ class TargetOptions {
   /// address space.
   bool NVPTXUseShortPointers = false;

-  /// \brief If enabled, allow AMDGPU unsafe floating point atomics.
-  bool AllowAMDGPUUnsafeFPAtomics = false;
-
   /// \brief Code object version for AMDGPU.
   llvm::CodeObjectVersionKind CodeObjectVersion =
       llvm::CodeObjectVersionKind::COV_None;

clang/include/clang/Driver/Options.td

+22 -8
@@ -2278,6 +2278,24 @@ def fsymbol_partition_EQ : Joined<["-"], "fsymbol-partition=">, Group<f_Group>,
   Visibility<[ClangOption, CC1Option]>,
   MarshallingInfoString<CodeGenOpts<"SymbolPartition">>;

+defm atomic_remote_memory : BoolFOption<"atomic-remote-memory",
+  LangOpts<"AtomicRemoteMemory">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption, CC1Option], "May have">,
+  NegFlag<SetFalse, [], [ClangOption], "Assume no">,
+  BothFlags<[], [ClangOption], " atomic operations on remote memory">>;
+
+defm atomic_fine_grained_memory : BoolFOption<"atomic-fine-grained-memory",
+  LangOpts<"AtomicFineGrainedMemory">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption, CC1Option], "May have">,
+  NegFlag<SetFalse, [], [ClangOption], "Assume no">,
+  BothFlags<[], [ClangOption], " atomic operations on fine-grained memory">>;
+
+defm atomic_ignore_denormal_mode : BoolFOption<"atomic-ignore-denormal-mode",
+  LangOpts<"AtomicIgnoreDenormalMode">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption, CC1Option], "Allow">,
+  NegFlag<SetFalse, [], [ClangOption], "Disallow">,
+  BothFlags<[], [ClangOption], " atomic operations to ignore denormal mode">>;
+
 defm memory_profile : OptInCC1FFlag<"memory-profile", "Enable", "Disable", " heap memory profiling">;
 def fmemory_profile_EQ : Joined<["-"], "fmemory-profile=">,
     Group<f_Group>, Visibility<[ClangOption, CC1Option]>,
@@ -5154,14 +5172,10 @@ defm amdgpu_precise_memory_op
     : SimpleMFlag<"amdgpu-precise-memory-op", "Enable", "Disable",
                   " precise memory mode (AMDGPU only)">;

-defm unsafe_fp_atomics : BoolMOption<"unsafe-fp-atomics",
-  TargetOpts<"AllowAMDGPUUnsafeFPAtomics">, DefaultFalse,
-  PosFlag<SetTrue, [], [ClangOption, CC1Option],
-          "Enable generation of unsafe floating point "
-          "atomic instructions. May generate more efficient code, but may not "
-          "respect rounding and denormal modes, and may give incorrect results "
-          "for certain memory destinations. (AMDGPU only)">,
-  NegFlag<SetFalse>>;
+def munsafe_fp_atomics : Flag<["-"], "munsafe-fp-atomics">,
+  Visibility<[ClangOption, CC1Option]>, Alias<fatomic_ignore_denormal_mode>;
+def mno_unsafe_fp_atomics : Flag<["-"], "mno-unsafe-fp-atomics">,
+  Visibility<[ClangOption]>, Alias<fno_atomic_ignore_denormal_mode>;

 def faltivec : Flag<["-"], "faltivec">, Group<f_Group>;
 def fno_altivec : Flag<["-"], "fno-altivec">, Group<f_Group>;
