@@ -5442,6 +5442,155 @@ third argument, can only occur at file scope.
a = b[i] * c[i] + e;
}

+ Extensions for controlling atomic code generation
+ =================================================
+
+ The ``[[clang::atomic]]`` statement attribute enables users to control how
+ atomic operations are lowered in LLVM IR by conveying additional metadata to
+ the backend. The primary goal is to allow users to specify certain options,
+ like whether atomic operations may be performed on specific types of memory or
+ whether to ignore denormal mode correctness in floating-point operations,
+ without affecting the correctness of code that does not rely on these behaviors.
+
+ In LLVM, lowering of atomic operations (e.g., ``atomicrmw``) can differ based
+ on the target's capabilities. Some backends support native atomic instructions
+ only for certain operation types or alignments, or only in specific memory
+ regions. Likewise, floating-point atomic instructions may or may not respect
+ IEEE denormal requirements. When the user is unconcerned about denormal-mode
+ compliance (for performance reasons) or knows that certain atomic operations
+ will not be performed on a particular type of memory, extra hints are needed to
+ tell the backend how to proceed.
+
+ A classic example is an architecture where floating-point atomic add does not
+ fully conform to IEEE denormal-mode handling. A user who does not mind ignoring
+ that aspect would still prefer to emit the faster hardware atomic instruction
+ rather than a fallback or CAS loop. Similarly, on certain GPUs (e.g., AMDGPU),
+ memory accessed via PCIe may only support a subset of atomic operations. To ensure
+ correct and efficient lowering, the compiler must know whether the user wants to
+ allow atomic operations on that type of memory.
+
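To make the "fallback or CAS loop" mentioned above concrete, here is a minimal sketch of a floating-point atomic add emulated with compare-and-swap in portable C++. The helper name `atomic_add_cas` is hypothetical and purely illustrative. It is not the code any backend emits, only the general shape of the slower path that a native atomic instruction would avoid.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not part of Clang or LLVM): emulates an atomic
// floating-point add with a compare-and-swap retry loop.
float atomic_add_cas(std::atomic<float>& target, float value) {
    float expected = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak stores the value it observed into
    // `expected`, so `expected + value` is recomputed on every retry.
    while (!target.compare_exchange_weak(expected, expected + value,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed)) {
        // Retry until no other thread modified `target` in between.
    }
    return expected;  // The value held just before this add took effect.
}
```

Under contention the loop may retry many times, which is one reason a user might prefer a native instruction even when it relaxes denormal handling.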
+ The allowed atomic attribute values are ``remote_memory``, ``fine_grained_memory``,
+ and ``ignore_denormal_mode``, each optionally prefixed with ``no_``. The meanings
+ are as follows:
+
+ - ``remote_memory`` means atomic operations may be performed on remote memory.
+   Prefixing with ``no_`` (i.e. ``no_remote_memory``) indicates that atomic
+   operations should not be performed on remote memory.
+ - ``fine_grained_memory`` means atomic operations may be performed on fine-grained
+   memory. Prefixing with ``no_`` (i.e. ``no_fine_grained_memory``) indicates that
+   atomic operations should not be performed on fine-grained memory.
+ - ``ignore_denormal_mode`` means that atomic operations are allowed to ignore
+   correctness for denormal mode in floating-point operations, potentially improving
+   performance on architectures that handle denormals inefficiently. The negated form,
+   ``no_ignore_denormal_mode``, enforces strict denormal mode correctness.
+
+ Within the same atomic attribute, duplicate and conflicting values are accepted,
+ and the last of any conflicting values wins. Multiple atomic attributes are allowed
+ on the same compound statement, and the last atomic attribute wins.
+
+ Without any atomic metadata, LLVM IR defaults to conservative settings for
+ correctness: atomic operations are assumed to possibly access remote memory and
+ fine-grained memory, and to require denormal mode correctness (i.e. the
+ equivalent of ``remote_memory``, ``fine_grained_memory``, and
+ ``no_ignore_denormal_mode``).
+
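The last-wins rule within a single attribute and the conservative defaults described above can be modeled in a few lines of ordinary C++. This is only an illustrative model: the `AtomicOptions` struct and `apply_attribute` function are hypothetical names invented for this sketch and say nothing about how Clang implements the attribute internally.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the three options. The initial values mirror the
// conservative defaults described in the text.
struct AtomicOptions {
    bool remote_memory = true;
    bool fine_grained_memory = true;
    bool ignore_denormal_mode = false;
};

// Applies the values of ONE attribute in source order, so a later value
// overrides an earlier conflicting one. Options that are not mentioned keep
// whatever `current` already holds.
AtomicOptions apply_attribute(AtomicOptions current,
                              const std::vector<std::string>& values) {
    for (std::string name : values) {
        bool enable = true;
        if (name.compare(0, 3, "no_") == 0) {  // strip the negating prefix
            enable = false;
            name.erase(0, 3);
        }
        if (name == "remote_memory") {
            current.remote_memory = enable;
        } else if (name == "fine_grained_memory") {
            current.fine_grained_memory = enable;
        } else if (name == "ignore_denormal_mode") {
            current.ignore_denormal_mode = enable;
        }
        // Unknown names are silently skipped in this model only.
    }
    return current;
}
```

For instance, applying `{"remote_memory", "no_remote_memory", "ignore_denormal_mode"}` to the defaults leaves `remote_memory` disabled (the later value wins), `fine_grained_memory` at its default, and `ignore_denormal_mode` enabled.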
+ The attribute may be applied only to a compound statement and looks like:
+
+ .. code-block:: c++
+
+   [[clang::atomic(remote_memory, fine_grained_memory, ignore_denormal_mode)]]
+   {
+     // Atomic instructions in this block carry extra metadata reflecting
+     // these user-specified options.
+   }
+
+ You can provide one or more of these options, each optionally prefixed with
+ ``no_`` to negate that option. Any unspecified option is inherited from the
+ global defaults, which come from compiler flags or the target's built-in defaults.
+
+ A set of compiler options sets the global defaults for these atomic-lowering
+ options. The command-line format is:
+
+ .. code-block:: console
+
+   $ clang -fatomic-remote-memory -fno-atomic-fine-grained-memory -fatomic-ignore-denormal-mode file.cpp
+
+ Each option has a corresponding flag:
+ ``-fatomic-remote-memory`` / ``-fno-atomic-remote-memory``,
+ ``-fatomic-fine-grained-memory`` / ``-fno-atomic-fine-grained-memory``,
+ and ``-fatomic-ignore-denormal-mode`` / ``-fno-atomic-ignore-denormal-mode``.
+
+ Code using the ``[[clang::atomic]]`` attribute can then selectively override
+ the command-line defaults on a per-block basis. For instance:
+
+ .. code-block:: c++
+
+   // Suppose the global defaults assume:
+   //   remote_memory, fine_grained_memory, and no_ignore_denormal_mode
+   // (for conservative correctness)
+
+   void example() {
+     // Locally override the settings: disable remote_memory and enable
+     // fine_grained_memory.
+     [[clang::atomic(no_remote_memory, fine_grained_memory)]]
+     {
+       // In this block:
+       //   - Atomic operations are not performed on remote memory.
+       //   - Atomic operations are performed on fine-grained memory.
+       //   - The setting for denormal mode remains as the global default
+       //     (typically no_ignore_denormal_mode, enforcing strict denormal mode correctness).
+       // ...
+     }
+   }
+
+ Function bodies are not compound statements, so this will not work:
+
+ .. code-block:: c++
+
+   void func() [[clang::atomic(remote_memory)]] { // Wrong: applies to function type
+   }
+
+ Use the attribute on a compound statement within the function:
+
+ .. code-block:: c++
+
+   void func() {
+     [[clang::atomic(remote_memory)]]
+     {
+       // Atomic operations in this block carry the specified metadata.
+     }
+   }
+
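The placement rule above can be tried with a real atomic operation inside the annotated block. The following is a sketch under several assumptions: the `ATOMIC_HINT` macro and the `add_with_hints` function are hypothetical names, the feature test assumes `__has_cpp_attribute(clang::atomic)` reports the attribute on compilers that implement it, and `__atomic_fetch_add` is the GCC/Clang builtin. The builtin is called directly so that the atomic operation sits lexically inside the annotated compound statement rather than inside a library function.

```cpp
#include <cassert>

// Expand to the attribute only when the compiler reports support for it;
// otherwise expand to nothing so the code still builds (without any hints).
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::atomic)
#    define ATOMIC_HINT(...) [[clang::atomic(__VA_ARGS__)]]
#  endif
#endif
#ifndef ATOMIC_HINT
#  define ATOMIC_HINT(...)
#endif

// Hypothetical example function: adds `delta` to `counter` atomically and
// returns the previous value.
int add_with_hints(int& counter, int delta) {
    int previous = 0;
    ATOMIC_HINT(no_remote_memory, fine_grained_memory)
    {
        // The attribute sits on this compound statement, not on the function.
        previous = __atomic_fetch_add(&counter, delta, __ATOMIC_RELAXED);
    }
    return previous;
}
```

The program's observable behavior is the same with or without the attribute. Only the lowering chosen by a backend that understands the metadata may differ.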
+ The ``[[clang::atomic]]`` attribute affects only the code generation of atomic
+ instructions within the annotated compound statement. Clang attaches target-specific
+ metadata to those atomic instructions in the emitted LLVM IR to guide backend lowering.
+ This metadata is fixed at the Clang code generation phase and is not modified by later
+ LLVM passes (such as function inlining).
+
+ For example, consider:
+
+ .. code-block:: cpp
+
+   inline void func() {
+     [[clang::atomic(remote_memory)]]
+     {
+       // Atomic instructions lowered with metadata.
+     }
+   }
+
+   void foo() {
+     [[clang::atomic(no_remote_memory)]]
+     {
+       func(); // Inlined by LLVM, but the metadata from 'func()' remains unchanged.
+     }
+   }
+
+ Although current usage focuses on AMDGPU, the mechanism is general. Other
+ backends can ignore or implement their own responses to these flags if desired.
+ If a target does not understand or enforce these hints, the IR remains valid,
+ and the resulting program is still correct (although potentially less optimized
+ for that user's needs).
+
Specifying an attribute for multiple declarations (#pragma clang attribute)
===========================================================================