[SYCL][CUDA] Define __SYCL_CUDA_ARCH__ instead of __CUDA_ARCH__ for SYCL #8257

Merged (2 commits) on Feb 21, 2023

Conversation

@GeorgeWeb (Contributor) commented Feb 8, 2023

This PR addresses an issue where our use of __CUDA_ARCH__ causes intrinsics not to be defined in the CUDA include files.

  • Replace __CUDA_ARCH__ with __SYCL_CUDA_ARCH__ for SYCL device
  • Update the sycl-macro.cpp test to check the appropriate macro.

As far as I could find, the original issue was introduced by PR #6524, which enabled bfloat16 support by moving it out of the experimental extension, and it breaks some codebases that make CUDA interop calls.
Current reports include GitHub issues #7722, #8133, and oneapi-src/oneMKL#257.

For that reason, we define a unique __SYCL_CUDA_ARCH__ macro and use it for SYCL device targets, leaving __CUDA_ARCH__ as before for CUDA targets.
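To illustrate the failure mode and the fix, here is a minimal sketch of the kind of guard involved (placeholder guards only, not the actual header diff):

```c++
// Sketch of the problem: when the SYCL device pass for NVPTX predefines
// __CUDA_ARCH__, CUDA's interop headers assume an nvcc/clang-cuda device
// compilation and guard out declarations the user's code still needs.
//
// SYCL headers previously keyed bfloat16 support off __CUDA_ARCH__:
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 800)
// ... sm_80+ bfloat16 implementation ...
#endif

// After this PR they key off the SYCL-specific macro instead, and
// __CUDA_ARCH__ is no longer predefined for the SYCL device pass:
#if defined(__SYCL_CUDA_ARCH__) && (__SYCL_CUDA_ARCH__ >= 800)
// ... sm_80+ bfloat16 implementation ...
#endif
```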

@GeorgeWeb GeorgeWeb marked this pull request as ready for review February 8, 2023 18:39
@GeorgeWeb GeorgeWeb requested review from a team as code owners February 8, 2023 18:39
@GeorgeWeb GeorgeWeb requested a review from bso-intel February 8, 2023 18:39
@hdelan (Contributor) commented Feb 8, 2023

Otherwise LGTM!

@zjin-lcf (Contributor) commented Feb 8, 2023

I have a question: is __SYCL_HIP_ARCH__ needed?

@elizabethandrews (Contributor) left a comment

FE changes LGTM

@hdelan (Contributor) commented Feb 9, 2023

Hi @zjin-lcf, we don't want to add too many of these arch checkers to DPC++ if we can help it, since they are a little hacky. We need __SYCL_CUDA_ARCH__ because some of the headers use it, and it reduces the amount of code we compile. Nevertheless, we can get this kind of functionality with the AMD HIP backend by declaring an extern int that is resolved at llvm-link time:

https://github.com/intel/llvm/blob/sycl/libclc/amdgcn-amdhsa/libspirv/atomic/atomic_add.cl#L13

We can then check the arch value like this:

https://github.com/intel/llvm/blob/sycl/libclc/amdgcn-amdhsa/libspirv/atomic/atomic_helpers.h#L14
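For reference, the pattern in those files looks roughly like this (a hand-written sketch: the __oclc_ISA_version symbol and the 9010 threshold follow the linked libclc sources, while the function and the fallback logic are illustrative):

```c++
// The symbol is only declared here; llvm-link later resolves it against a
// per-GPU device library that defines the concrete ISA version number.
extern const int __oclc_ISA_version;

void atomic_add_f64(double *addr, double val) {
  if (__oclc_ISA_version >= 9010) {
    // gfx90a and newer: use the native double-precision atomic add
  } else {
    // older architectures: fall back to a compare-and-swap loop
  }
}
```

Because the check is against a link-time constant, dead branches are trivially eliminated after llvm-link, giving per-architecture code without a per-architecture preprocessor macro.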

@abagusetty (Contributor) commented

@GeorgeWeb Thanks for the PR. I can verify that this addresses uxlfoundation/oneMath#257.

@smanna12 (Contributor) left a comment

FE changes look OK to me

@premanandrao (Contributor) left a comment

FE changes look okay to me.

@abagusetty (Contributor) commented

Hi @bso-intel, @intel/llvm-gatekeepers, can we get some help with this PR? It would unblock several of the issues mentioned above. Thanks!

@bader (Contributor) commented Feb 20, 2023

@abagusetty, code review from the code owners is missing. Please ping the @intel/llvm-reviewers-runtime team.

@abagusetty (Contributor) commented

@intel/llvm-reviewers-runtime Could you please help with the review?

@steffenlarsen (Contributor) left a comment

Looks like a reasonable solution! Thanks!
