Skip to content

[SYCL][OpenCL] Ban AMD OpenCL platform (#5825) #6878

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 24, 2023

Conversation

al42and
Copy link
Contributor

@al42and al42and commented Sep 26, 2022

Compiling kernels for it is not supported, device info sometimes throws, and there is the HIP backend for those wanting to use an AMD GPU.

$ sycl-ls  # before
[opencl:gpu:0] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics 770 [0x4680] 3.0 [22.35.24055]
[opencl:gpu:1] AMD Accelerated Parallel Processing, gfx1032 2.0 [3452.0 (HSA1.1,LC)]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics 770 [0x4680] 1.3 [1.3.24055]
[ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 0.0 [CUDA 11.7]
[ext_oneapi_hip:gpu:0] AMD HIP BACKEND, gfx1032 0.0 [HIP 50221.15]

$ sycl-ls  # after
[opencl:gpu:0] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics 770 [0x4680] 3.0 [22.35.24055]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics 770 [0x4680] 1.3 [1.3.24055]
[ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 0.0 [CUDA 11.7]
[ext_oneapi_hip:gpu:0] AMD HIP BACKEND, gfx1032 0.0 [HIP 50221.15]

Closes #5825

@al42and al42and requested a review from a team as a code owner September 26, 2022 16:29
@al42and al42and requested a review from v-klochkov September 26, 2022 16:29
Copy link
Contributor

@v-klochkov v-klochkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me, but I may not know some plans regarding supported platforms, etc.
@pvchupin can you please assign someone from your team to review this PR?

@v-klochkov v-klochkov requested a review from a team September 29, 2022 02:48
@al42and al42and changed the title [SYCL][OpenCL] Blacklist AMD platform (#5825) [SYCL][OpenCL] Ban AMD OpenCL platform (#5825) Sep 29, 2022
@steffenlarsen
Copy link
Contributor

If OpenCL for AMD GPUs is being problematic, I am okay with giving it the same treatment as we have done for NVIDIA GPUs. @AerialMantis - Is this something you have tested and/or had problems with before?

@al42and
Copy link
Contributor Author

al42and commented Mar 24, 2023

May I bump this? OpenCL backend still does not support AMD GPUs as far as I can tell, and listing them among available devices is just confusing.

@bader bader requested a review from npmiller March 24, 2023 15:24
Copy link
Contributor

@npmiller npmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as far as I'm aware this is also not working, like the Nvidia one, I'm happy to ban it as well for now.

@bader
Copy link
Contributor

bader commented Mar 24, 2023

@al42and, could you update your branch, please? I see that it was tested in September last year, I'd like to make sure it still passes the tests.
@steffenlarsen, @v-klochkov, could you formally approve the patch, please?

On comment from my side, I think we should discard devices based on features they do not support (e.g. required SPIR-V extension) instead of relying on platform name. This suggestion is better implement in follow-up patches.

Compiling kernels for it is not supported, device info sometimes throws,
and there is HIP backend.

Closes intel#5825
@al42and al42and force-pushed the blacklist-amd-opencl branch from 0b28502 to d6dac11 Compare March 24, 2023 17:05
@al42and
Copy link
Contributor Author

al42and commented Mar 24, 2023

@al42and, could you update your branch, please? I see that it was tested in September last year, I'd like to make sure it still passes the tests.

Rebased on top of the current sycl branch.

@al42and al42and temporarily deployed to aws March 24, 2023 18:00 — with GitHub Actions Inactive
@al42and al42and temporarily deployed to aws March 24, 2023 19:16 — with GitHub Actions Inactive
@al42and
Copy link
Contributor Author

al42and commented Mar 24, 2023

Failed Tests (1):
    SYCL :: XPTI/kernel/basic.cpp

There are other failing GitHub Actions with the same error, so I guess that's not this PRs fault

@bader
Copy link
Contributor

bader commented Mar 24, 2023

Failed Tests (1):
    SYCL :: XPTI/kernel/basic.cpp

There are other failing GitHub Actions with the same error, so I guess that's not this PRs fault

This issue was caused by 18899cc, which was reverted in 01511a3.

@bader bader merged commit f61a136 into intel:sycl Mar 24, 2023
@bader
Copy link
Contributor

bader commented Mar 24, 2023

On comment from my side, I think we should discard devices based on features they do not support (e.g. required SPIR-V extension) instead of relying on platform name. This suggestion is better implement in follow-up patches.

In addition to that it can be useful to list discarded devices with --verbose flag with a clear mark + a comment about why were discarded.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ROCm OpenCL] device::get_info<device::sub_group_sizes> throws Native API failed
5 participants