-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[Clang] Add __has_target_builtin macro #126324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
You can test this locally with the following command:git-clang-format --diff 196a1acc7d277d05d4b94ad7745c18bf13ea991f 2668404f9a46155a897030aca29e117f4319f9a5 --extensions cpp,h -- clang/test/Preprocessor/has_target_builtin.cpp clang/include/clang/Lex/Preprocessor.h clang/lib/Lex/PPMacroExpansion.cpp View the diff from clang-format here.diff --git a/clang/lib/Lex/PPMacroExpansion.cpp b/clang/lib/Lex/PPMacroExpansion.cpp
index 8be4d06c48..80ac767f16 100644
--- a/clang/lib/Lex/PPMacroExpansion.cpp
+++ b/clang/lib/Lex/PPMacroExpansion.cpp
@@ -1894,8 +1894,7 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
return false;
});
- } else if (II == Ident__has_cpp_attribute ||
- II == Ident__has_c_attribute) {
+ } else if (II == Ident__has_cpp_attribute || II == Ident__has_c_attribute) {
bool IsCXX = II == Ident__has_cpp_attribute;
EvaluateFeatureLikeBuiltinMacro(OS, Tok, II, *this, true,
[&](Token &Tok, bool &HasLexedNextToken) -> int {
@@ -1925,8 +1924,7 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
getLangOpts())
: 0;
});
- } else if (II == Ident__has_include ||
- II == Ident__has_include_next) {
+ } else if (II == Ident__has_include || II == Ident__has_include_next) {
// The argument to these two builtins should be a parenthesized
// file name string literal using angle brackets (<>) or
// double-quotes ("").
|
@llvm/pr-subscribers-clang Author: Nick Sarnie (sarnex) ChangesAs a follow-up to #121839, where we wanted to make Instead, introduce a new macro Full diff: https://github.com/llvm/llvm-project/pull/126324.diff 4 Files Affected:
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 973cf8f9d091c30..057ad564f970bb4 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -67,6 +67,10 @@ It can be used like this:
``__has_builtin`` should not be used to detect support for a builtin macro;
use ``#ifdef`` instead.
+ When using device offloading, a builtin is considered available if it is
+ available on either the host or the device targets.
+ Use ``__has_target_builtin`` to consider only the current target.
+
``__has_constexpr_builtin``
---------------------------
@@ -96,6 +100,35 @@ the ``<cmath>`` header file to conditionally make a function constexpr whenever
the constant evaluation of the corresponding builtin (for example,
``std::fmax`` calls ``__builtin_fmax``) is supported in Clang.
+``__has_target_builtin``
+------------------------
+
+This function-like macro takes a single identifier argument that is the name of
+a builtin function, a builtin pseudo-function (taking one or more type
+arguments), or a builtin template.
+It evaluates to 1 if the builtin is supported on the current target or 0 if not.
+The behavior is different than ``__has_builtin`` when there is an auxiliary target,
+such when offloading to a target device.
+It can be used like this:
+
+.. code-block:: c++
+
+ #ifndef __has_target_builtin // Optional of course.
+ #define __has_target_builtin(x) 0 // Compatibility with non-clang compilers.
+ #endif
+
+ ...
+ #if __has_target_builtin(__builtin_trap)
+ __builtin_trap();
+ #else
+ abort();
+ #endif
+ ...
+
+.. note::
+ ``__has_target_builtin`` should not be used to detect support for a builtin macro;
+ use ``#ifdef`` instead.
+
.. _langext-__has_feature-__has_extension:
``__has_feature`` and ``__has_extension``
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index 2bf4d1a16699430..240fe28aba93e33 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -174,6 +174,7 @@ class Preprocessor {
IdentifierInfo *Ident__has_extension; // __has_extension
IdentifierInfo *Ident__has_builtin; // __has_builtin
IdentifierInfo *Ident__has_constexpr_builtin; // __has_constexpr_builtin
+ IdentifierInfo *Ident__has_target_builtin; // __has_target_builtin
IdentifierInfo *Ident__has_attribute; // __has_attribute
IdentifierInfo *Ident__has_embed; // __has_embed
IdentifierInfo *Ident__has_include; // __has_include
diff --git a/clang/lib/Lex/PPMacroExpansion.cpp b/clang/lib/Lex/PPMacroExpansion.cpp
index 347c13da0ad215a..23a693b105fca3a 100644
--- a/clang/lib/Lex/PPMacroExpansion.cpp
+++ b/clang/lib/Lex/PPMacroExpansion.cpp
@@ -357,6 +357,7 @@ void Preprocessor::RegisterBuiltinMacros() {
Ident__has_builtin = RegisterBuiltinMacro("__has_builtin");
Ident__has_constexpr_builtin =
RegisterBuiltinMacro("__has_constexpr_builtin");
+ Ident__has_target_builtin = RegisterBuiltinMacro("__has_target_builtin");
Ident__has_attribute = RegisterBuiltinMacro("__has_attribute");
if (!getLangOpts().CPlusPlus)
Ident__has_c_attribute = RegisterBuiltinMacro("__has_c_attribute");
@@ -1797,55 +1798,62 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
diag::err_feature_check_malformed);
return II && HasExtension(*this, II->getName());
});
- } else if (II == Ident__has_builtin) {
- EvaluateFeatureLikeBuiltinMacro(OS, Tok, II, *this, false,
- [this](Token &Tok, bool &HasLexedNextToken) -> int {
- IdentifierInfo *II = ExpectFeatureIdentifierInfo(Tok, *this,
- diag::err_feature_check_malformed);
- if (!II)
- return false;
- else if (II->getBuiltinID() != 0) {
- switch (II->getBuiltinID()) {
- case Builtin::BI__builtin_cpu_is:
- return getTargetInfo().supportsCpuIs();
- case Builtin::BI__builtin_cpu_init:
- return getTargetInfo().supportsCpuInit();
- case Builtin::BI__builtin_cpu_supports:
- return getTargetInfo().supportsCpuSupports();
- case Builtin::BI__builtin_operator_new:
- case Builtin::BI__builtin_operator_delete:
- // denotes date of behavior change to support calling arbitrary
- // usual allocation and deallocation functions. Required by libc++
- return 201802;
- default:
- return Builtin::evaluateRequiredTargetFeatures(
- getBuiltinInfo().getRequiredFeatures(II->getBuiltinID()),
- getTargetInfo().getTargetOpts().FeatureMap);
+ } else if (II == Ident__has_builtin || II == Ident__has_target_builtin) {
+ bool IsHasTargetBuiltin = II == Ident__has_target_builtin;
+ EvaluateFeatureLikeBuiltinMacro(
+ OS, Tok, II, *this, false,
+ [this, IsHasTargetBuiltin](Token &Tok, bool &HasLexedNextToken) -> int {
+ IdentifierInfo *II = ExpectFeatureIdentifierInfo(
+ Tok, *this, diag::err_feature_check_malformed);
+ if (!II)
+ return false;
+ auto BuiltinID = II->getBuiltinID();
+ if (BuiltinID != 0) {
+ switch (BuiltinID) {
+ case Builtin::BI__builtin_cpu_is:
+ return getTargetInfo().supportsCpuIs();
+ case Builtin::BI__builtin_cpu_init:
+ return getTargetInfo().supportsCpuInit();
+ case Builtin::BI__builtin_cpu_supports:
+ return getTargetInfo().supportsCpuSupports();
+ case Builtin::BI__builtin_operator_new:
+ case Builtin::BI__builtin_operator_delete:
+ // denotes date of behavior change to support calling arbitrary
+ // usual allocation and deallocation functions. Required by libc++
+ return 201802;
+ default:
+ // __has_target_builtin should return false for aux builtins.
+ if (IsHasTargetBuiltin &&
+ getBuiltinInfo().isAuxBuiltinID(BuiltinID))
+ return false;
+ return Builtin::evaluateRequiredTargetFeatures(
+ getBuiltinInfo().getRequiredFeatures(BuiltinID),
+ getTargetInfo().getTargetOpts().FeatureMap);
+ }
+ return true;
+ } else if (IsBuiltinTrait(Tok)) {
+ return true;
+ } else if (II->getTokenID() != tok::identifier &&
+ II->getName().starts_with("__builtin_")) {
+ return true;
+ } else {
+ return llvm::StringSwitch<bool>(II->getName())
+ // Report builtin templates as being builtins.
+ .Case("__make_integer_seq", getLangOpts().CPlusPlus)
+ .Case("__type_pack_element", getLangOpts().CPlusPlus)
+ .Case("__builtin_common_type", getLangOpts().CPlusPlus)
+ // Likewise for some builtin preprocessor macros.
+ // FIXME: This is inconsistent; we usually suggest detecting
+ // builtin macros via #ifdef. Don't add more cases here.
+ .Case("__is_target_arch", true)
+ .Case("__is_target_vendor", true)
+ .Case("__is_target_os", true)
+ .Case("__is_target_environment", true)
+ .Case("__is_target_variant_os", true)
+ .Case("__is_target_variant_environment", true)
+ .Default(false);
}
- return true;
- } else if (IsBuiltinTrait(Tok)) {
- return true;
- } else if (II->getTokenID() != tok::identifier &&
- II->getName().starts_with("__builtin_")) {
- return true;
- } else {
- return llvm::StringSwitch<bool>(II->getName())
- // Report builtin templates as being builtins.
- .Case("__make_integer_seq", getLangOpts().CPlusPlus)
- .Case("__type_pack_element", getLangOpts().CPlusPlus)
- .Case("__builtin_common_type", getLangOpts().CPlusPlus)
- // Likewise for some builtin preprocessor macros.
- // FIXME: This is inconsistent; we usually suggest detecting
- // builtin macros via #ifdef. Don't add more cases here.
- .Case("__is_target_arch", true)
- .Case("__is_target_vendor", true)
- .Case("__is_target_os", true)
- .Case("__is_target_environment", true)
- .Case("__is_target_variant_os", true)
- .Case("__is_target_variant_environment", true)
- .Default(false);
- }
- });
+ });
} else if (II == Ident__has_constexpr_builtin) {
EvaluateFeatureLikeBuiltinMacro(
OS, Tok, II, *this, false,
@@ -1886,8 +1894,7 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
return false;
});
- } else if (II == Ident__has_cpp_attribute ||
- II == Ident__has_c_attribute) {
+ } else if (II == Ident__has_cpp_attribute || II == Ident__has_c_attribute) {
bool IsCXX = II == Ident__has_cpp_attribute;
EvaluateFeatureLikeBuiltinMacro(OS, Tok, II, *this, true,
[&](Token &Tok, bool &HasLexedNextToken) -> int {
@@ -1917,8 +1924,7 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
getLangOpts())
: 0;
});
- } else if (II == Ident__has_include ||
- II == Ident__has_include_next) {
+ } else if (II == Ident__has_include || II == Ident__has_include_next) {
// The argument to these two builtins should be a parenthesized
// file name string literal using angle brackets (<>) or
// double-quotes ("").
diff --git a/clang/test/Preprocessor/has_target_builtin.cpp b/clang/test/Preprocessor/has_target_builtin.cpp
new file mode 100644
index 000000000000000..64b2d7e1b35d9ef
--- /dev/null
+++ b/clang/test/Preprocessor/has_target_builtin.cpp
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -fopenmp -triple=spirv64 -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
+
+// RUN: %clang_cc1 -fopenmp -triple=nvptx64 -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
+
+// RUN: %clang_cc1 -fopenmp -triple=amdgcn-amd-amdhsa -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
+
+// RUN: %clang_cc1 -fopenmp -triple=aarch64 -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
+
+// CHECK: GOOD
+#if __has_target_builtin(__builtin_ia32_pause)
+ BAD
+#else
+ GOOD
+#endif
|
@@ -357,6 +357,7 @@ void Preprocessor::RegisterBuiltinMacros() { | |||
Ident__has_builtin = RegisterBuiltinMacro("__has_builtin"); | |||
Ident__has_constexpr_builtin = | |||
RegisterBuiltinMacro("__has_constexpr_builtin"); | |||
Ident__has_target_builtin = RegisterBuiltinMacro("__has_target_builtin"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may want to define this macro for offloading languages only. The reason is that non-offloading languages do not need this macro but if they start to use this macro then it will break again in offloading languages like __has_builtin
did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the usage would commonly be
#if defined(__has_target_builtin) && __has_target_builtin(foo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My fear is that some C++ library headers start to use this macro __has_target_builtin
in place of __has_builtin
, and we cannot modify such headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will do this. I can't find a good way to detect offloading languages in general here, so I'm just going to check for CUDA/HIP/SYCLDevice/OpenMPDevice, let me know if there's some common logic I can rely on that I missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in latest commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My fear is that some C++ library headers start to use this macro
__has_target_builtin
in place of__has_builtin
, and we cannot modify such headers.
IMO, now that we do document semantics of __has_target_builtin()
, its misuse on the library side will be their problem to fix. The problem with __has_builtin()
was that it was never intended to handle heterogeneous compilation, and that's what created the issue when CUDA/HIP made builtins from both host and device visible to the compiler, but not all of them codegen-able. __has_target_builtin()
clearly states what to expect. Sure, it's possible to misuse it, but having it available unconditionally will make it much less cumbersome to use in the headers shared between CUDA and C++, and that's a fairly common use case.
I'd prefer to have __has_target_builtin()
generally available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, __has_target_builtin()
is probably identical to __has_builtin
on non-offloading related things. It's up to them if they keep it portable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made it unconditional again in the latest commit
i kept the code example in langref because probably we still want to recommend only using this for offloading targets, even though it will work on non-offloading targets. let met know if you disagree.
need release note |
Feedback from #126324 Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
…26571) Feedback from llvm/llvm-project#126324 Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
clang/docs/LanguageExtensions.rst
Outdated
``__has_target_builtin`` should not be used to detect support for a builtin macro; | ||
use ``#ifdef`` instead. | ||
|
||
``__has_target_built`` is only defined for offloading targets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``__has_target_built`` is only defined for offloading targets. | |
``__has_target_builtin`` is only defined for offloading targets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow thats embarrassing, thanks
clang/docs/LanguageExtensions.rst
Outdated
arguments), or a builtin template. | ||
It evaluates to 1 if the builtin is supported on the current target or 0 if not. | ||
The behavior is different than ``__has_builtin`` when there is an auxiliary target, | ||
such when offloading to a target device. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
such when offloading to a target device. | |
such as when offloading to a target device. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cant english today, thanks
clang/docs/LanguageExtensions.rst
Outdated
#ifndef __has_target_builtin // Optional of course. | ||
#define __has_target_builtin(x) 0 // Compatibility with non-clang compilers. | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be more helpful to do something like ifdef CUDA ... else __has_builtin
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hopefully the latest commit has the use case youre looking for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we make it available to C++, we'd better document the following invalid usage which originally leads to this extension:
#if !__has_target_builtin(__wfi)
static __inline__ void __attribute__((__always_inline__, __nodebug__)) __wfi(void) {
__builtin_arm_wfi();
}
#endif
we should emphasize that a C++ header may be used by offloading languages, and in offloading language, the same source is compiled for host and device target separately. A builtin not available for the current target does not justify defining the builtin for both host and device targets. In this case, better to use __has_builtin(__wfi)
since it makes sure the condition is true for both hosts and device targets so that the code won't break when used in offloading languages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have somewhat conflicting requirements:
- On C++ side, writers do not care about offloading (and we can't force them to). They only have
__has_builtin()
and it does what they need -- if the given builtin exists it will be compileable. - On offloading side, we want C++ headers to work out of the box for the host side. Ideally with the host and device compilations seeing the same code after preprocessing, and that's where we get into this problem. We can't tell whether the original C++ code needs
__has_builtin()
(works well enough for most uses inside of host function bodies) or if it needs__has_target_builtin()
(e.g. when it's used inside a lambda or constexpr function which is implicitly HD, and we do need to generate code for it).
I'm not sure we can find a universal solution. That said, __has_target_builtin()
gives us some flexibility on the offloading side. C++ side should stick with __has_builtin()
. __has_target_builtin()
should only be used when offloading comes into the picture, but it includes the possibility that it will be used in the headers shared with C++ and therefore the builtin itself should be available there.
clang/docs/LanguageExtensions.rst
Outdated
This function-like macro takes a single identifier argument that is the name of | ||
a builtin function, a builtin pseudo-function (taking one or more type | ||
arguments), or a builtin template. | ||
It evaluates to 1 if the builtin is supported on the current target or 0 if not. | ||
The behavior is different than ``__has_builtin`` when there is an auxiliary target, | ||
such when offloading to a target device. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rephrase it to be more specific in terms of what the difference is rather than when it occurs.
__has_builtin()
and __has_target_builtin()
behave identically for normal C++ compilations.
For heterogeneous compilations that see source code intended for more than one target
__has_builtin()
returns true if the builtin is known to the compiler (i.e. it's available via one of the targets), but makes no promises whether it's available on the current target. We can parse it, but not necessarily codegen it.__has_target_builtin()
returns true if the builtin can actually be codegen'ed for the current target.
__has_target_builtin()
is, effectively, functional superset of CUDA's __CUDA_ARCH__
-- it allows distinguishing both host and target architectures. It has to be treated with similar caution so it does not break consistency of the TU source code seen by the compiler across sub-compilations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, i like the way you worded it so i'll use most of this verbatim
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only some tiny nits from me
clang/lib/Lex/PPMacroExpansion.cpp
Outdated
IdentifierInfo *II = ExpectFeatureIdentifierInfo( | ||
Tok, *this, diag::err_feature_check_malformed); | ||
if (!II) | ||
return false; | ||
else if (II->getBuiltinID() != 0) { | ||
switch (II->getBuiltinID()) { | ||
auto BuiltinID = II->getBuiltinID(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please spell out the type explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in latest commit, thx
// CHECK-NOTOFFLOAD: DOESNT | ||
#ifdef __has_target_builtin | ||
HAS | ||
#if __has_target_builtin(__builtin_ia32_pause) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add test coverage for when the target does have the builtin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in latest commit, thx
|
||
// CHECK-NOTOFFLOAD: DOESNT | ||
#ifdef __has_target_builtin | ||
HAS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should probably check for HAS
explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the macro is unconditionally available in the latest commit so i removed the checking for the macro being defined
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Closest is @Artem-B, there's also a few NVIDIA people floating around like @gonzalobg. |
Setting up offloading for GCC was a nightmare, but it looks like the behavior is the same as clang (available builtins are the union of host + offload). I tried this example:
with flags:
and i got
|
Ouch! Thank you for doing that investigation! @pinskia -- do you happen to know if this is intentional? GCC's documentation says "recognizes" which could mean this is intended, but I wasn't 100% sure. Do you think GCC has an appetite for changing the behavior or perhaps deprecating the interface and adding |
What is the definition of current target? Is it the current subtarget? Or is it returning true even when NOT using I am not defintely the wrong person to ask about offloading and |
Maybe the problem with the name I am having is the word |
That's actually a really good idea, thank you! But I think users will still have the very reasonable question of why there's a |
I submitted https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118882 to ask the other GCC folks about what they think is a good idea because I have no other thoughts on this; I just think something needs to be done. |
my $.02 For the classic compilation, |
Feedback from llvm#126324 Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Sometimes I wonder if we should have a single way to detect if we're doing any kind of offloading so we can guard stuff like this. |
@@ -1819,8 +1822,12 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) { | |||
// usual allocation and deallocation functions. Required by libc++ | |||
return 201802; | |||
default: | |||
// __has_target_builtin should return false for aux builtins. | |||
if (IsHasTargetBuiltin && | |||
getBuiltinInfo().isAuxBuiltinID(BuiltinID)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic doesn't seem right to me. What happens if this builtin is supported on BOTH the host and device here? Shouldn't we still return 'true'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. The function name is isAuxBuiltinID
but the implementation of it makes it so that the behavior is like isOnlyAuxBuiltinID
. The implementation of the function is:
/// Return true if builtin ID belongs to AuxTarget.
bool isAuxBuiltinID(unsigned ID) const {
return ID >= (Builtin::FirstTSBuiltin + NumTargetBuiltins);
}
When we register builtins, we do it like this:
TargetShards = Target.getTargetBuiltins();
for (const auto &Shard : TargetShards)
NumTargetBuiltins += Shard.Infos.size();
if (AuxTarget) {
AuxTargetShards = AuxTarget->getTargetBuiltins();
for (const auto &Shard : AuxTargetShards)
NumAuxTargetBuiltins += Shard.Infos.size();
}
}
So we register all the target builtins before the aux target builtins, and we only consider something an aux builtin if it was registered specifically as an aux builtin, so I think the logic does what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! Neat! Can you document that somewhere on this? Just extending the comment here will be really helpful to the next person along.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, opened PR here and added you as a reviewer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! Thanks! I meant here as well though, it was REALLY jarring to me and I suspect next one here will ahve the same problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah sure, will update this pr as well
@AaronBallman What do you recommend for next steps here? Should we wait until GCC makes a decision in this issue? |
I'd vote for fixing the CUDA on Arm case that failed in the meantime, then make a decision as to whether or not we should go back to |
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Clarify behavior of the function when the builtin is also supported on the main target. Based on feedback from #126324 --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Clarify behavior of the function when the builtin is also supported on the main target. Based on feedback from llvm/llvm-project#126324 --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
I don't think we need to wait for GCC to make a decision, but we can wait for a bit for them to consider the issue, and should make sure we coordinate closely with GCC if we make a decision on our own.
+1, if we can get away with it. As for next steps, I think we need a broader community discussion on this, so I would recommend an RFC proposing an approach. I don't know whether that's changing the behavior of |
I'm not a fan of deprecating |
+1 to that. |
I'll try to fix the CUDA Arm case and then draft an RFC. This turned out to be quite the change :P |
PR for ARM CUDA #128222 |
Feedback from llvm#126324 Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
ARM PR looks kinda stalled, hopefully can write the RFC this week |
So far RFC is going towards changing |
As a follow-up to #121839, where we wanted to make
__has_builtin
return false for aux builtins, but that broke existing code.Instead, introduce a new macro
__has_target_builtin
(name open to suggestions) that only considers builtins for the current target.