-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][CUDA] Define __SYCL_CUDA_ARCH__ instead of __CUDA_ARCH__ for SYCL #8257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][CUDA] Define __SYCL_CUDA_ARCH__ instead of __CUDA_ARCH__ for SYCL #8257
Conversation
Otherwise LGTM! |
I have a question. Is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FE changes LGTM
Hi @zjin-lcf we don't want to add too many of these arch checkers in DPC++ if we can help it since they are a little hacky. We need SYCL_CUDA_ARCH because some of the headers use it and it reduces the amount of code we compile. Nevertheless we can get this kind of functionality with AMD HIP backend by declaring an extern int that will be resolved at llvm-link time: https://github.com/intel/llvm/blob/sycl/libclc/amdgcn-amdhsa/libspirv/atomic/atomic_add.cl#L13 We can then check the arch value like this: https://github.com/intel/llvm/blob/sycl/libclc/amdgcn-amdhsa/libspirv/atomic/atomic_helpers.h#L14 |
@GeorgeWeb Thanks for the PR. I can verify, this was addressed uxlfoundation/oneMath#257 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FE changes look OK to me
__CUDA_ARCH__
defined when compiling for CUDA backend since sycl-nightly/20221129
#7722
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FE changes look okay to me.
Hi @bso-intel , @intel/llvm-gatekeepers Can we get some help with this PR. It would unblock several issues mentioned above. Thanks |
@abagusetty, code review from code owners is missing. Please, ping @intel/llvm-reviewers-runtime team. |
@intel/llvm-reviewers-runtime Could you please help with the review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a reasonable solution! Thanks!
This PR addresses an issue where if we use
__CUDA_ARCH__
causes intrinsics not to be defined in the CUDA include files.__CUDA_ARCH__
with__SYCL_CUDA_ARCH__
for SYCL devicesycl-macro.cpp
test to check the appropriate macro.As far as I could find the original issue was introduced from PR #6524 for enabling the bfloat16 support moving it from the experimental extension, and it breaks some codebases with CUDA interop calls.
Current reports include github issues #7722, #8133 and oneapi-src/oneMKL#257.
For that reason we define a unique
__SYCL_CUDA_ARCH__
macro and use it instead for SYCL device targets and leave__CUDA_ARCH__
as before for CUDA targets.