[AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates #134016

Open
wants to merge 48 commits into
base: main
91eeaf0
Add the functional identity and feature queries.
AlexVlx Apr 2, 2025
8bf1168
Fix format.
AlexVlx Apr 2, 2025
3421292
Fix broken patch merge.
AlexVlx Apr 2, 2025
539c7e6
Add release notes.
AlexVlx Apr 2, 2025
5926b9f
(Hopefully) Final format fix.
AlexVlx Apr 2, 2025
4381d93
Remove stray space.
AlexVlx Apr 2, 2025
d18f64e
Remove unused header, fix borked test.
AlexVlx Apr 2, 2025
7880ff4
Stars everywhere.
AlexVlx Apr 2, 2025
719dfde
Fix format without line break.
AlexVlx Apr 2, 2025
36b69b4
Add host tests.
AlexVlx Apr 2, 2025
e327e15
Fit code examples within 80-char limit.
AlexVlx Apr 2, 2025
d35efc5
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx Apr 14, 2025
5dee670
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx Apr 15, 2025
888a080
Fix tests.
AlexVlx Apr 16, 2025
e35ac62
Fix test.
AlexVlx Apr 16, 2025
6c41ed2
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx Apr 16, 2025
8683148
Merge branch 'zcfs' of https://github.com/AlexVlx/llvm-project; branc…
AlexVlx Apr 22, 2025
a9b3e85
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx Apr 24, 2025
468a517
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx Apr 29, 2025
1b8b57e
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 2, 2025
18b4af2
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 5, 2025
a8bca2f
Re-work implementation to return a target specific type.
AlexVlx May 6, 2025
716cc1f
Fix formatting.
AlexVlx May 6, 2025
79035a9
Delete spurious whitespace.
AlexVlx May 6, 2025
6945c2e
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 7, 2025
9a7e250
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 7, 2025
0f04dbc
Handle jumps into controlled sequences.
AlexVlx May 7, 2025
39a9d55
Fix formatting.
AlexVlx May 7, 2025
3fe116e
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 8, 2025
49c862a
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 12, 2025
f293f39
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 16, 2025
251476d
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 16, 2025
ebde49b
Start incorporating review feedback.
AlexVlx May 16, 2025
4bdd30e
Less `auto`.
AlexVlx May 16, 2025
a1b4a11
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 20, 2025
18841c1
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 22, 2025
76848d5
Print out valid AMDGCN processor identifiers.
AlexVlx May 22, 2025
e1bfdf3
Use boolean type for the predicate, even though it should never get e…
AlexVlx May 22, 2025
4f65468
Register pass early.
AlexVlx May 22, 2025
e940d42
Clarify builtins are also available in C.
AlexVlx May 22, 2025
46adb74
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx May 26, 2025
ca9521d
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx Jun 2, 2025
11dd570
Try to fix potentially erroneous indentation in note.
AlexVlx Jun 2, 2025
fab0d14
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx Jun 2, 2025
611ec0d
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx Jun 2, 2025
03b029f
Add test for returning a predicate.
AlexVlx Jun 2, 2025
32be1c0
Merge branch 'main' of https://github.com/llvm/llvm-project into zcfs
AlexVlx Jun 2, 2025
012f74d
Fix formatting.
AlexVlx Jun 2, 2025
112 changes: 112 additions & 0 deletions clang/docs/LanguageExtensions.rst
@@ -4966,6 +4966,118 @@ If no address spaces names are provided, all address spaces are fenced.
__builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local")
__builtin_amdgcn_fence(__ATOMIC_SEQ_CST, "workgroup", "local", "global")

__builtin_amdgcn_processor_is and __builtin_amdgcn_is_invocable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``__builtin_amdgcn_processor_is`` and ``__builtin_amdgcn_is_invocable`` provide
a functional mechanism for programmatically querying:

* the identity of the current target processor;
* the capability of the current target processor to invoke a particular builtin.

**Syntax**:

.. code-block:: c

  __amdgpu_feature_predicate_t __builtin_amdgcn_processor_is(const char*);
  __amdgpu_feature_predicate_t __builtin_amdgcn_is_invocable(builtin_name);

**Example of use**:

.. code-block:: c++

  if (__builtin_amdgcn_processor_is("gfx1201") ||
      __builtin_amdgcn_is_invocable(__builtin_amdgcn_s_sleep_var))
    __builtin_amdgcn_s_sleep_var(x);

  if (!__builtin_amdgcn_processor_is("gfx906"))
    __builtin_amdgcn_s_wait_event_export_ready();
  else if (__builtin_amdgcn_processor_is("gfx1010") ||
           __builtin_amdgcn_processor_is("gfx1101"))
    __builtin_amdgcn_s_ttracedata_imm(1);

  while (__builtin_amdgcn_processor_is("gfx1101")) *p += x;

  do {
    break;
  } while (__builtin_amdgcn_processor_is("gfx1010"));

  for (; __builtin_amdgcn_processor_is("gfx1201"); ++*p) break;

Collaborator:

same question about inf. loop here.

Contributor Author:

I don't think this'd be an infinite loop ever, it's either 0 trips or one trip, if the predicate is true we just break. Otherwise, the discussion from the previous inf loop applies.

  if (__builtin_amdgcn_is_invocable(__builtin_amdgcn_s_wait_event_export_ready))
    __builtin_amdgcn_s_wait_event_export_ready();
  else if (__builtin_amdgcn_is_invocable(__builtin_amdgcn_s_ttracedata_imm))
    __builtin_amdgcn_s_ttracedata_imm(1);

  do {
    break;
  } while (
      __builtin_amdgcn_is_invocable(__builtin_amdgcn_global_load_tr_b64_i32));

  for (; __builtin_amdgcn_is_invocable(__builtin_amdgcn_permlane64); ++*p)
    break;

**Description**:

The builtins return a value of type ``__amdgpu_feature_predicate_t``, which is a
target-specific type that behaves as if its C++ definition were the following:

Comment on lines +5022 to +5023

Collaborator:

Does this builtin work in C? If so, the docs should be updated to make it clear that this behavior applies to C as well as C++ and explain what it means in a bit more detail (presume that C users have no idea how C++ idioms work).

Contributor Author:

Yes, it does work in C, although I'm not convinced it will see significant use. That being said, there's no good reason to make it C++ only. I have tried to add an explanation for that context / provide a fleshed out example, please do let me know if it's more or less aligned with what you had in mind. Thanks!

.. code-block:: c++

  struct __amdgpu_feature_predicate_t {
    __amdgpu_feature_predicate_t() = delete;
    __amdgpu_feature_predicate_t(const __amdgpu_feature_predicate_t&) = delete;
    __amdgpu_feature_predicate_t(__amdgpu_feature_predicate_t&&) = delete;

    explicit
    operator bool() const noexcept;
  };
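
The consequences of that definition can be checked with an ordinary class. The sketch below is only a model: `feature_predicate_model` is a hypothetical stand-in, since the real `__amdgpu_feature_predicate_t` is a builtin type that user code cannot declare. It uses standard type traits to show which operations the deleted constructors and the `explicit` conversion rule out.

```cpp
#include <type_traits>

// Hypothetical stand-in for __amdgpu_feature_predicate_t (assumption: the
// real builtin type behaves like this class, as the documentation states).
struct feature_predicate_model {
  feature_predicate_model() = delete;
  feature_predicate_model(const feature_predicate_model&) = delete;
  feature_predicate_model(feature_predicate_model&&) = delete;

  explicit operator bool() const noexcept;  // declared only, never called here
};

// No default, copy or move construction: values cannot be stored or passed.
static_assert(
    !std::is_default_constructible<feature_predicate_model>::value, "");
static_assert(!std::is_copy_constructible<feature_predicate_model>::value, "");
static_assert(!std::is_move_constructible<feature_predicate_model>::value, "");

// `explicit` rejects implicit conversion such as `bool b = pred;` ...
static_assert(!std::is_convertible<feature_predicate_model, bool>::value, "");
// ... while direct initialisation, and therefore contextual conversion in
// `if`, `while`, `for` and `?:`, remains available.
static_assert(std::is_constructible<bool, feature_predicate_model>::value, "");
```

All five assertions hold for the model, which matches the intent that a predicate is only meaningful as the condition of a control structure.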

The builtins can be used in C as well, wherein the
``__amdgpu_feature_predicate_t`` type behaves as an opaque, forward declared
type with conditional automated conversion to ``_Bool`` when used as the
predicate argument to a control structure:

.. code-block:: c

  struct __amdgpu_feature_predicate_t ret(); // Error
  void arg(struct __amdgpu_feature_predicate_t); // Error
  void local() {
    struct __amdgpu_feature_predicate_t x; // Error
    struct __amdgpu_feature_predicate_t y =
        __builtin_amdgcn_processor_is("gfx900"); // Error
  }
  void valid_use() {
    _Bool x = (_Bool)__builtin_amdgcn_processor_is("gfx900"); // OK
    if (__builtin_amdgcn_processor_is("gfx900")) // Implicit cast to _Bool
      return;
    for (; __builtin_amdgcn_processor_is("gfx900");) // Implicit cast to _Bool
      break;
    while (__builtin_amdgcn_processor_is("gfx900")) // Implicit cast to _Bool
      break;
    do {
      break;
    } while (__builtin_amdgcn_processor_is("gfx900")); // Implicit cast to _Bool

    __builtin_amdgcn_processor_is("gfx900") ? x : !x;
  }

The boolean interpretation of the predicate values returned by the builtins:

* for ``__builtin_amdgcn_processor_is``, indicates whether the current target
  matches the argument; the argument MUST be a string literal and a valid
  AMDGPU target
* for ``__builtin_amdgcn_is_invocable``, indicates whether the builtin function
  passed as the argument can be invoked by the current target; the argument
  MUST be either a generic or AMDGPU specific builtin name

When invoked while compiling for a concrete target, the builtins are evaluated
early by Clang, never produce any CodeGen effects, and have no observable
side-effects in IR. Conversely, when compiling for AMDGCN flavoured SPIR-V,
which is an abstract target, a series of predicate values is implicitly
created. These predicates get resolved when finalizing the compilation process
for a concrete target, and shall reflect the latter's identity and features.
Thus, it is possible to author high-level code, in e.g. HIP, that is target
adaptive in a dynamic fashion, contrary to macro based mechanisms.
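
The deferred resolution described above can be pictured with a small host-side model. This is a sketch under stated assumptions, not the pass from this PR: `ConcreteTarget` and `resolve_predicates` are hypothetical names, and the only detail taken from the patch is the naming pattern of the predicate globals (`llvm.amdgcn.is.<processor>` and `llvm.amdgcn.has.<features>`), which appears in the CodeGen change. The feature string is treated as a single name for simplicity.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical description of the concrete target chosen at finalisation.
struct ConcreteTarget {
  std::string processor;           // e.g. "gfx1201"
  std::set<std::string> features;  // illustrative feature names
};

// Map every pending predicate name to the boolean it takes for `target`.
std::map<std::string, bool> resolve_predicates(
    const std::set<std::string>& pending, const ConcreteTarget& target) {
  const std::string is_prefix = "llvm.amdgcn.is.";
  const std::string has_prefix = "llvm.amdgcn.has.";
  std::map<std::string, bool> resolved;
  for (const std::string& name : pending) {
    bool value = false;  // unknown names conservatively resolve to false
    if (name.compare(0, is_prefix.size(), is_prefix) == 0)
      value = name.substr(is_prefix.size()) == target.processor;
    else if (name.compare(0, has_prefix.size(), has_prefix) == 0)
      value = target.features.count(name.substr(has_prefix.size())) != 0;
    resolved[name] = value;
  }
  return resolved;
}

// Usage: the same abstract module resolves differently per concrete target.
inline void check_resolution() {
  const std::set<std::string> pending = {"llvm.amdgcn.is.gfx1201",
                                         "llvm.amdgcn.is.gfx906"};
  std::map<std::string, bool> r =
      resolve_predicates(pending, ConcreteTarget{"gfx906", {}});
  assert(r["llvm.amdgcn.is.gfx906"] && !r["llvm.amdgcn.is.gfx1201"]);
}
```

The point of the model is the contrast with macros: the set of pending names is fixed when the abstract code is produced, while the values are only chosen once a concrete target is known.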

ARM/AArch64 Language Extensions
-------------------------------
4 changes: 4 additions & 0 deletions clang/docs/ReleaseNotes.rst
@@ -838,6 +838,10 @@ AMDGPU Support
^^^^^^^^^^^^^^

- Bump the default code object version to 6. ROCm 6.3 is required to run any program compiled with COV6.
- Introduced a new target specific builtin ``__builtin_amdgcn_processor_is``,
  a late / deferred query for the current target processor.
- Introduced a new target specific builtin ``__builtin_amdgcn_is_invocable``,
  which enables fine-grained, per-builtin feature availability queries.

NVPTX Support
^^^^^^^^^^^^^^
Expand Down
8 changes: 8 additions & 0 deletions clang/include/clang/Basic/AMDGPUTypes.def
@@ -20,10 +20,18 @@
AMDGPU_TYPE(Name, Id, SingletonId, Width, Align)
#endif

#ifndef AMDGPU_FEATURE_PREDICATE_TYPE
#define AMDGPU_FEATURE_PREDICATE_TYPE(Name, Id, SingletonId, Width, Align) \
AMDGPU_TYPE(Name, Id, SingletonId, Width, Align)
#endif

AMDGPU_OPAQUE_PTR_TYPE("__amdgpu_buffer_rsrc_t", AMDGPUBufferRsrc, AMDGPUBufferRsrcTy, 128, 128, 8)

AMDGPU_NAMED_BARRIER_TYPE("__amdgpu_named_workgroup_barrier_t", AMDGPUNamedWorkgroupBarrier, AMDGPUNamedWorkgroupBarrierTy, 128, 32, 0)

AMDGPU_FEATURE_PREDICATE_TYPE("__amdgpu_feature_predicate_t", AMDGPUFeaturePredicate, AMDGPUFeaturePredicateTy, 1, 1)

#undef AMDGPU_TYPE
#undef AMDGPU_OPAQUE_PTR_TYPE
#undef AMDGPU_NAMED_BARRIER_TYPE
#undef AMDGPU_FEATURE_PREDICATE_TYPE
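
The `#ifndef` default added to the `.def` file follows the usual X-macro pattern: consumers that do not care about the new category see it as a plain `AMDGPU_TYPE`, while consumers that do (the debug-info and type-conversion changes further down) define `AMDGPU_FEATURE_PREDICATE_TYPE` before including the file. A single-file sketch of the idea, with toy names (`TOY_TYPE_LIST`, `ToyKind`, `is_predicate_kind`) that are assumptions made for illustration:

```cpp
#include <cassert>

// Stand-in for the body of a .def file. Toy entries, inlined as a macro so
// the sketch stays in one file; Clang would #include the .def file instead.
#define TOY_TYPE_LIST                      \
  TOY_TYPE("toy_buffer_t", ToyBuffer, 128) \
  TOY_PREDICATE_TYPE("toy_predicate_t", ToyPredicate, 1)

// Consumer 1: only defines the generic macro, so the specialised macro falls
// back to it, mirroring the #ifndef default in the .def file.
#define TOY_TYPE(Name, Id, Width) Id,
#ifndef TOY_PREDICATE_TYPE
#define TOY_PREDICATE_TYPE(Name, Id, Width) TOY_TYPE(Name, Id, Width)
#endif
enum class ToyKind { TOY_TYPE_LIST Count };
#undef TOY_TYPE
#undef TOY_PREDICATE_TYPE

// Consumer 2: overrides the specialised macro to single out predicates.
#define TOY_TYPE(Name, Id, Width) case ToyKind::Id: return false;
#define TOY_PREDICATE_TYPE(Name, Id, Width) case ToyKind::Id: return true;
inline bool is_predicate_kind(ToyKind k) {
  switch (k) {
    TOY_TYPE_LIST
    default: return false;
  }
}
#undef TOY_TYPE
#undef TOY_PREDICATE_TYPE

// Usage: both consumers saw the same list, but treated one entry differently.
inline void check_toy_kinds() {
  assert(is_predicate_kind(ToyKind::ToyPredicate));
  assert(!is_predicate_kind(ToyKind::ToyBuffer));
}
```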
1 change: 1 addition & 0 deletions clang/include/clang/Basic/Builtins.def
@@ -34,6 +34,7 @@
// Q -> target builtin type, followed by a character to distinguish the builtin type
// Qa -> AArch64 svcount_t builtin type.
// Qb -> AMDGPU __amdgpu_buffer_rsrc_t builtin type.
// Qc -> AMDGPU __amdgpu_feature_predicate_t builtin type.
// E -> ext_vector, followed by the number of elements and the base type.
// X -> _Complex, followed by the base type.
// Y -> ptrdiff_t
5 changes: 5 additions & 0 deletions clang/include/clang/Basic/BuiltinsAMDGPU.def
@@ -350,6 +350,11 @@ BUILTIN(__builtin_amdgcn_endpgm, "v", "nr")
BUILTIN(__builtin_amdgcn_get_fpenv, "WUi", "n")
BUILTIN(__builtin_amdgcn_set_fpenv, "vWUi", "n")

// These are special FE only builtins intended for forwarding the requirements
// to the ME.
BUILTIN(__builtin_amdgcn_processor_is, "QccC*", "nctu")
BUILTIN(__builtin_amdgcn_is_invocable, "Qc", "nctu")

//===----------------------------------------------------------------------===//
// R600-NI only builtins.
//===----------------------------------------------------------------------===//
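
For readers unfamiliar with the signature strings, `"QccC*"` reads as: return type `Qc` (the new predicate type), then one parameter `cC*`. The toy decoder below illustrates that reading. It is a sketch covering only a handful of letters and assumes the usual meanings from the comment block at the top of `Builtins.def` (`v` for void, `c` for char, `i` for int, suffix `C` for const, suffix `*` for pointer); `decode_one` and `decode_signature` are hypothetical helpers, not Clang functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Decode one type starting at `pos`, advancing `pos` past it.
// Toy subset: v, c, i, Qb, Qc, plus the suffixes C (const) and * (pointer).
std::string decode_one(const std::string& sig, std::size_t& pos) {
  std::string type;
  const char c = sig.at(pos++);
  switch (c) {
    case 'v': type = "void"; break;
    case 'c': type = "char"; break;
    case 'i': type = "int"; break;
    case 'Q': {
      const char q = sig.at(pos++);  // letter selecting the target type
      if (q == 'b') type = "__amdgpu_buffer_rsrc_t";
      else if (q == 'c') type = "__amdgpu_feature_predicate_t";
      else type = "<unsupported target type>";
      break;
    }
    default: type = "<unsupported>"; break;
  }
  // Suffix modifiers apply, left to right, to the type built so far.
  while (pos < sig.size() && (sig[pos] == 'C' || sig[pos] == '*')) {
    if (sig[pos] == '*') type += " *";
    else if (type.back() == '*') type += " const";  // const pointer
    else type = "const " + type;                     // const pointee
    ++pos;
  }
  return type;
}

// First element is the return type, the rest are parameter types.
std::vector<std::string> decode_signature(const std::string& sig) {
  std::vector<std::string> types;
  std::size_t pos = 0;
  while (pos < sig.size()) types.push_back(decode_one(sig, pos));
  return types;
}

// Usage: the signature of __builtin_amdgcn_processor_is from this hunk.
inline void check_decoder() {
  const std::vector<std::string> t = decode_signature("QccC*");
  assert(t.size() == 2 && t[1] == "const char *");
}
```

Note that `__builtin_amdgcn_is_invocable` is declared with just `"Qc"`: its argument is a builtin name rather than a typed value, so no parameter type is encoded.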
23 changes: 22 additions & 1 deletion clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -12476,7 +12476,7 @@ def warn_zero_as_null_pointer_constant : Warning<
InGroup<DiagGroup<"zero-as-null-pointer-constant">>, DefaultIgnore;

 def warn_not_eliding_copy_on_return : Warning<
-  "not eliding copy on return">,
+  "not eliding copy on return">,
   InGroup<DiagGroup<"nrvo">>, DefaultIgnore;

def err_nullability_cs_multilevel : Error<
@@ -13385,4 +13385,25 @@ def err_acc_device_type_multiple_archs
// AMDGCN builtins diagnostics
def err_amdgcn_load_lds_size_invalid_value : Error<"invalid size value">;
def note_amdgcn_load_lds_size_valid_value : Note<"size must be %select{1, 2, or 4|1, 2, 4, 12 or 16}0">;
def err_amdgcn_processor_is_arg_not_literal
: Error<"the argument to __builtin_amdgcn_processor_is must be a string "
"literal">;
def err_amdgcn_processor_is_arg_invalid_value
: Error<"the argument to __builtin_amdgcn_processor_is must be a valid "
"AMDGCN processor identifier; '%0' is not valid">;
def note_amdgcn_processor_is_valid_options
: Note<"valid AMDGCN processor identifiers are: %0">;
def err_amdgcn_is_invocable_arg_invalid_value
: Error<"the argument to __builtin_amdgcn_is_invocable must be either a "
"target agnostic builtin or an AMDGCN target specific builtin; `%0`"
" is not valid">;
def err_amdgcn_predicate_type_is_not_constructible
: Error<"%0 has type __amdgpu_feature_predicate_t, which is not"
" constructible">;
def err_amdgcn_predicate_type_needs_explicit_bool_cast
: Error<"%0 must be explicitly cast to %1; however, please note that this "
"is almost always an error and that it prevents the effective "
"guarding of target dependent code, and thus should be avoided">;
def note_amdgcn_protected_by_predicate : Note<"jump enters statement controlled"
" by AMDGPU feature predicate">;
} // end of sema component.
9 changes: 9 additions & 0 deletions clang/include/clang/Sema/SemaAMDGPU.h
@@ -15,12 +15,16 @@

#include "clang/AST/ASTFwd.h"
#include "clang/Sema/SemaBase.h"
#include "llvm/ADT/SmallPtrSet.h"

namespace clang {
class AttributeCommonInfo;
class Expr;
class ParsedAttr;

class SemaAMDGPU : public SemaBase {
llvm::SmallPtrSet<Expr *, 32> ExpandedPredicates;

public:
SemaAMDGPU(Sema &S);

@@ -64,6 +68,11 @@ class SemaAMDGPU : public SemaBase {
void handleAMDGPUNumVGPRAttr(Decl *D, const ParsedAttr &AL);
void handleAMDGPUMaxNumWorkGroupsAttr(Decl *D, const ParsedAttr &AL);
void handleAMDGPUFlatWorkGroupSizeAttr(Decl *D, const ParsedAttr &AL);

/// Expand a valid use of the feature identification builtins into its
/// corresponding sequence of instructions.
Expr *ExpandAMDGPUPredicateBI(CallExpr *CE);

Collaborator:

Suggested change
- Expr *ExpandAMDGPUPredicateBI(CallExpr *CE);
+ Expr *ExpandAMDGPUPredicateBuiltIn(CallExpr *CE);

bool IsPredicate(Expr *E) const;
};
} // namespace clang

11 changes: 10 additions & 1 deletion clang/lib/AST/ASTContext.cpp
@@ -1476,7 +1476,12 @@ void ASTContext::InitBuiltinTypes(const TargetInfo &Target,
}

 if (Target.getTriple().isAMDGPU() ||
-    (AuxTarget && AuxTarget->getTriple().isAMDGPU())) {
+    (Target.getTriple().isSPIRV() &&
+     Target.getTriple().getVendor() == llvm::Triple::AMD) ||
+    (AuxTarget &&
+     (AuxTarget->getTriple().isAMDGPU() ||
+      ((AuxTarget->getTriple().isSPIRV() &&
+        AuxTarget->getTriple().getVendor() == llvm::Triple::AMD))))) {
#define AMDGPU_TYPE(Name, Id, SingletonId, Width, Align) \
InitBuiltinType(SingletonId, BuiltinType::Id);
#include "clang/Basic/AMDGPUTypes.def"
@@ -12507,6 +12512,10 @@ static QualType DecodeTypeFromStr(const char *&Str, const ASTContext &Context,
Type = Context.AMDGPUBufferRsrcTy;
break;
}
case 'c': {
Type = Context.AMDGPUFeaturePredicateTy;
break;
}
default:
llvm_unreachable("Unexpected target builtin type");
}
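
The widened condition in `InitBuiltinTypes` applies the same test to the main target and to the auxiliary target: "is AMDGPU, or is SPIR-V with the AMD vendor". One way to read it is as a helper applied twice. The sketch below uses a toy `ToyTriple` struct in place of `llvm::Triple`, and the helper names are hypothetical; it is an illustration of the logic, not a proposed change to the patch.

```cpp
#include <cassert>

// Toy stand-in for the llvm::Triple queries used in the condition.
struct ToyTriple {
  bool is_amdgpu;
  bool is_spirv;
  bool vendor_is_amd;
};

// True for AMDGPU itself and for AMD-flavoured SPIR-V.
inline bool needs_amdgpu_types(const ToyTriple& t) {
  return t.is_amdgpu || (t.is_spirv && t.vendor_is_amd);
}

// Mirrors the if-condition: check the main target, then the optional
// auxiliary target (`aux` may be null).
inline bool should_init_amdgpu_types(const ToyTriple& target,
                                     const ToyTriple* aux) {
  return needs_amdgpu_types(target) ||
         (aux != nullptr && needs_amdgpu_types(*aux));
}

// Usage: a host compile with an AMD SPIR-V auxiliary target still qualifies.
inline void check_type_init() {
  const ToyTriple host{false, false, false};
  const ToyTriple spirv_amd{false, true, true};
  assert(should_init_amdgpu_types(host, &spirv_amd));
  assert(!should_init_amdgpu_types(host, nullptr));
}
```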
9 changes: 9 additions & 0 deletions clang/lib/Basic/Targets/SPIR.cpp
@@ -181,3 +181,12 @@ void SPIRV64AMDGCNTargetInfo::setAuxTarget(const TargetInfo *Aux) {
Float128Format = DoubleFormat;
}
}

bool SPIRV64AMDGCNTargetInfo::isValidCPUName(StringRef CPU) const {
  return AMDGPUTI.isValidCPUName(CPU);
}

void SPIRV64AMDGCNTargetInfo::fillValidCPUList(
    SmallVectorImpl<StringRef> &Values) const {
  return AMDGPUTI.fillValidCPUList(Values);
}
5 changes: 5 additions & 0 deletions clang/lib/Basic/Targets/SPIR.h
@@ -449,6 +449,11 @@ class LLVM_LIBRARY_VISIBILITY SPIRV64AMDGCNTargetInfo final
}

bool hasInt128Type() const override { return TargetInfo::hasInt128Type(); }

// This is only needed for validating arguments passed to
// __builtin_amdgcn_processor_is
bool isValidCPUName(StringRef Name) const override;
void fillValidCPUList(SmallVectorImpl<StringRef> &Values) const override;
};

} // namespace targets
7 changes: 7 additions & 0 deletions clang/lib/CodeGen/CGDebugInfo.cpp
@@ -1022,6 +1022,13 @@ llvm::DIType *CGDebugInfo::CreateType(const BuiltinType *BT) {
DBuilder.createBasicType(Name, Width, llvm::dwarf::DW_ATE_unsigned); \
return SingletonId; \
}
#define AMDGPU_FEATURE_PREDICATE_TYPE(Name, Id, SingletonId, Width, Align) \
case BuiltinType::Id: { \
if (!SingletonId) \
SingletonId = \
DBuilder.createBasicType(Name, Width, llvm::dwarf::DW_ATE_boolean); \
return SingletonId; \
}
#include "clang/Basic/AMDGPUTypes.def"
case BuiltinType::UChar:
case BuiltinType::Char_U:
4 changes: 4 additions & 0 deletions clang/lib/CodeGen/CGExprScalar.cpp
@@ -982,6 +982,10 @@ Value *ScalarExprEmitter::EmitConversionToBool(Value *Src, QualType SrcType) {
if (const MemberPointerType *MPT = dyn_cast<MemberPointerType>(SrcType))
return CGF.CGM.getCXXABI().EmitMemberPointerIsNotNull(CGF, Src, MPT);

// The conversion is a NOP, and will be done when CodeGening the builtin.
if (SrcType == CGF.getContext().AMDGPUFeaturePredicateTy)
return Src;

assert((SrcType->isIntegerType() || isa<llvm::PointerType>(Src->getType())) &&
"Unknown scalar type to convert");

3 changes: 3 additions & 0 deletions clang/lib/CodeGen/CodeGenTypes.cpp
@@ -580,6 +580,9 @@ llvm::Type *CodeGenTypes::ConvertType(QualType T) {
case BuiltinType::Id: \
return llvm::TargetExtType::get(getLLVMContext(), "amdgcn.named.barrier", \
{}, {Scope});
#define AMDGPU_FEATURE_PREDICATE_TYPE(Name, Id, SingletonId, Width, Align) \
case BuiltinType::Id: \
return ConvertType(getContext().getLogicalOperationType());
#include "clang/Basic/AMDGPUTypes.def"
#define HLSL_INTANGIBLE_TYPE(Name, Id, SingletonId) case BuiltinType::Id:
#include "clang/Basic/HLSLIntangibleTypes.def"
29 changes: 29 additions & 0 deletions clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
@@ -295,6 +295,18 @@ void CodeGenFunction::AddAMDGPUFenceAddressSpaceMMRA(llvm::Instruction *Inst,
Inst->setMetadata(LLVMContext::MD_mmra, MMRAMetadata::getMD(Ctx, MMRAs));
}

static Value *GetOrInsertAMDGPUPredicate(CodeGenFunction &CGF, Twine Name) {
  auto PTy = IntegerType::getInt1Ty(CGF.getLLVMContext());

  auto *P = cast<GlobalVariable>(
      CGF.CGM.getModule().getOrInsertGlobal(Name.str(), PTy));
  P->setConstant(true);
  P->setExternallyInitialized(true);

  return CGF.Builder.CreateLoad(
      RawAddress(P, PTy, CharUnits::One(), KnownNonNull));
}
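
The helper above relies on get-or-insert semantics: every query for the same processor or feature string reuses a single named global, so a module ends up with one predicate per distinct name no matter how many call sites exist. A minimal host-side model of that behaviour follows; `ToyModule`, `ToyGlobal` and the function names are hypothetical stand-ins for the LLVM classes, and only the `llvm.amdgcn.is.` prefix is taken from the patch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a global variable and the flags the helper sets on it.
struct ToyGlobal {
  bool is_constant = false;
  bool externally_initialized = false;
};

// Toy stand-in for a module's table of named globals.
struct ToyModule {
  std::map<std::string, ToyGlobal> globals;
};

// Return the predicate global called `name`, creating it on first use.
ToyGlobal& get_or_insert_predicate(ToyModule& m, const std::string& name) {
  ToyGlobal& g = m.globals[name];  // operator[] inserts a default if absent
  g.is_constant = true;            // value never changes once resolved
  g.externally_initialized = true; // value is supplied later, not here
  return g;
}

// Name used for a processor identity query, following the patch's pattern.
std::string processor_predicate_name(const std::string& proc) {
  return "llvm.amdgcn.is." + proc;
}

// Usage: two queries for the same processor share one global.
inline void check_get_or_insert() {
  ToyModule m;
  ToyGlobal& a = get_or_insert_predicate(m, processor_predicate_name("gfx900"));
  ToyGlobal& b = get_or_insert_predicate(m, processor_predicate_name("gfx900"));
  assert(&a == &b && m.globals.size() == 1);
}
```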

Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
const CallExpr *E) {
llvm::AtomicOrdering AO = llvm::AtomicOrdering::SequentiallyConsistent;
@@ -601,6 +613,23 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
llvm::Value *Env = EmitScalarExpr(E->getArg(0));
return Builder.CreateCall(F, {Env});
}
  case AMDGPU::BI__builtin_amdgcn_processor_is: {
    assert(CGM.getTriple().isSPIRV() &&
           "__builtin_amdgcn_processor_is should never reach CodeGen for "
           "concrete targets!");
    StringRef Proc = cast<clang::StringLiteral>(E->getArg(0))->getString();
    return GetOrInsertAMDGPUPredicate(*this, "llvm.amdgcn.is." + Proc);
  }
  case AMDGPU::BI__builtin_amdgcn_is_invocable: {
    assert(CGM.getTriple().isSPIRV() &&
           "__builtin_amdgcn_is_invocable should never reach CodeGen for "
           "concrete targets!");
    auto *FD = cast<FunctionDecl>(
        cast<DeclRefExpr>(E->getArg(0))->getReferencedDeclOfCallee());
    StringRef RF =
        getContext().BuiltinInfo.getRequiredFeatures(FD->getBuiltinID());
    return GetOrInsertAMDGPUPredicate(*this, "llvm.amdgcn.has." + RF);
  }
case AMDGPU::BI__builtin_amdgcn_read_exec:
return EmitAMDGCNBallotForExec(*this, E, Int64Ty, Int64Ty, false);
case AMDGPU::BI__builtin_amdgcn_read_exec_lo:
7 changes: 6 additions & 1 deletion clang/lib/Sema/JumpDiagnostics.cpp
@@ -19,6 +19,7 @@
#include "clang/AST/StmtOpenACC.h"
#include "clang/AST/StmtOpenMP.h"
#include "clang/Basic/SourceLocation.h"
#include "clang/Sema/SemaAMDGPU.h"
#include "clang/Sema/SemaInternal.h"
#include "llvm/ADT/BitVector.h"
using namespace clang;
@@ -367,15 +368,19 @@ void JumpScopeChecker::BuildScopeInformation(Stmt *S,

case Stmt::IfStmtClass: {
IfStmt *IS = cast<IfStmt>(S);
bool AMDGPUPredicate = false;
 if (!(IS->isConstexpr() || IS->isConsteval() ||
-      IS->isObjCAvailabilityCheck()))
+      IS->isObjCAvailabilityCheck() ||
+      (AMDGPUPredicate = this->S.AMDGPU().IsPredicate(IS->getCond()))))
break;

unsigned Diag = diag::note_protected_by_if_available;
if (IS->isConstexpr())
Diag = diag::note_protected_by_constexpr_if;
else if (IS->isConsteval())
Diag = diag::note_protected_by_consteval_if;
else if (AMDGPUPredicate)
Diag = diag::note_amdgcn_protected_by_predicate;

if (VarDecl *Var = IS->getConditionVariable())
BuildScopeInformation(Var, ParentScope);