Skip to content

[SYCL] Enable AMD GPU support. #3795

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Jun 24, 2021
Merged

[SYCL] Enable AMD GPU support. #3795

merged 23 commits into from
Jun 24, 2021

Conversation

malixian
Copy link
Contributor

@malixian malixian commented May 21, 2021

Enable AMG GPU for DPC++. To support this feature, We did two parts of development, namely the compilation tool chain and the runtime. We have implemented rocm-plugin refer to cuda-plugin. Many test cases in this project have passed, but there are still some problems which can be roughly divided into three areas:
First, some errors occurred at link time. like all-pairs-disance-sycl case occurred error: undefined hidden symbol: __spirv_ControlBarrier will appear during the lld link period due to use barrier(access::fence_space::local_space). Similarly, the undefined__spirv_SubgroupShuffleINTEL error occurs when the cl::sycl::atomic keyword is used at that time.
Second, some errors occurred at runime. the program will core dump when calling hipMemcpyDtoHAsync API due to allocating memory size is too large. But cuda does not have this problem.
Finally, calculation accuracy problem. Currently, the calculation of float type kernel functions is inaccurate, and there is no problem with int type testing. We haven't figured out where the problem is.
We will keep track of the above issues.

@bader bader changed the title enable amd gpu [SYCL] Enable AMD GPU support. May 21, 2021
@AGindinson AGindinson requested a review from AlexeySachkov May 21, 2021 08:05
@@ -141,6 +150,7 @@ def main():
parser.add_argument("-t", "--build-type",
metavar="BUILD_TYPE", default="Release", help="build type: Debug, Release")
parser.add_argument("--cuda", action='store_true', help="switch from OpenCL to CUDA")
parser.add_argument("--rocm", action='store_true', help="swith from OpenCL to ROCM")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@malixian, one note.
As of today, we don't have AMD GPU HW in our CI system, so we won't be able to validate that other contributions do not break ROCM support.
I think we should decide ASAP, how ROCM support is going to be verified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed a problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for plugin we can use hip<->cuda source code compatibility

@AGindinson AGindinson requested a review from pvchupin May 21, 2021 11:52
Copy link
Contributor

@alexbatashev alexbatashev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking changes are not allowed. Please, fix backend enum.

@pvchupin
Copy link
Contributor

@malixian, thanks for contribution! Please update documentation, https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md at least.

@malixian malixian closed this May 25, 2021
@malixian
Copy link
Contributor Author

@malixian, thanks for contribution! Please update documentation, https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md at least.

All right.

@malixian
Copy link
Contributor Author

We update the code and GetStartedGuide.md document for AMDGPU.

@AGindinson AGindinson dismissed stale reviews from themself via 7e38d88 June 24, 2021 05:43
AGindinson
AGindinson previously approved these changes Jun 24, 2021
AGindinson
AGindinson previously approved these changes Jun 24, 2021
@bader bader dismissed stale reviews from AGindinson via d3e2775 June 24, 2021 06:31
AGindinson
AGindinson previously approved these changes Jun 24, 2021
alexbatashev
alexbatashev previously approved these changes Jun 24, 2021
Copy link
Contributor

@AGindinson AGindinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per discussion with @bader, adding & committing 3 suggestions (tested locally) to:

  1. Apply the Driver LIT patch correctly to pass the checks
  2. Remove __SYCL_AMDGCN__ macro due to the similar efforts for __SYCL_NVPTX__ in [SYCL] Remove the __SYCL_NVPTX__ macro #3977

@AGindinson AGindinson dismissed stale reviews from alexbatashev and themself via e3ee3c9 June 24, 2021 07:58
@AGindinson AGindinson requested a review from bader June 24, 2021 07:58
Fixed comments.
@bader bader merged commit ec61222 into intel:sycl Jun 24, 2021
@JonChesterfield
Copy link
Contributor

If barrier(access::fence_space::local_space) maps onto the llvm fence instruction, you might be interested in the intrinsic __builtin_amdgcn_fence, https://reviews.llvm.org/D75917

@alexbatashev
Copy link
Contributor

If barrier(access::fence_space::local_space) maps onto the llvm fence instruction, you might be interested in the intrinsic __builtin_amdgcn_fence, https://reviews.llvm.org/D75917

__builtin_amdgcn_s_barrier is probably a better fit for barrier. The global problem, however, is that a lot of things are missing here.

@bader bader added the hip Issues related to execution on HIP backend. label Aug 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hip Issues related to execution on HIP backend.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Is it possible for dpc++ to support amd rocm?