Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][DOC][CUDA][HIP] Update getStartedGuide.md #10669

Merged
merged 12 commits into from
Aug 28, 2023

Conversation

JackAKirk
Copy link
Contributor

Some instructions for building/running/using HIP and CUDA backends was out of date. This PR updates them.

Note that I think there is scope for further improvement of this doc generally. This PR makes improvements only wrt the CUDA/HIP backends.

JackAKirk added 2 commits August 2, 2023 17:54
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk JackAKirk marked this pull request as ready for review August 3, 2023 14:01
@JackAKirk JackAKirk requested a review from a team as a code owner August 3, 2023 14:01
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The content looks good to me. Mostly formatting suggestions.

Comment on lines 8 to 9
- [Getting Started with oneAPI DPC++](#getting-started-with-oneapi-dpc)
- [Table of contents](#table-of-contents)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if these two items are useful. I guess we intentionally skipped them.

What is the reason for changing * with -?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I've removed them.

I didn't notice the * -> - change, I think the lint did this. I've changed it back.

Comment on lines 196 to 206
The CUDA backend should work on Windows or Linux operating systems with any
GPU compatible with SM 50 or above. The default SM for the NVIDIA CUDA backend
is 5.0. Users can specify lower values, but some features may not be supported.
GPU with compute capability (SM version) sm_50 or above. The default
SM version for the NVIDIA CUDA backend is sm_50. Users of sm_3X devices can
attempt to specify the target architecture [ahead of time](#aot-target-architectures),
provided that they use a 11.X CUDA Runtime version, but some features may
not be supported. The CUDA backend has been tested with
different Ubuntu linux distributions and a selection of supported CUDA toolkit versions
and GPUs. The backend is tested by a relevant device/toolkit prior to a oneapi plugin release.
Go to the plugin release [pages](https://developer.codeplay.com/products/oneapi/nvidia/)
for further details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The CUDA backend should work on Windows or Linux operating systems with any
GPU compatible with SM 50 or above. The default SM for the NVIDIA CUDA backend
is 5.0. Users can specify lower values, but some features may not be supported.
GPU with compute capability (SM version) sm_50 or above. The default
SM version for the NVIDIA CUDA backend is sm_50. Users of sm_3X devices can
attempt to specify the target architecture [ahead of time](#aot-target-architectures),
provided that they use a 11.X CUDA Runtime version, but some features may
not be supported. The CUDA backend has been tested with
different Ubuntu linux distributions and a selection of supported CUDA toolkit versions
and GPUs. The backend is tested by a relevant device/toolkit prior to a oneapi plugin release.
Go to the plugin release [pages](https://developer.codeplay.com/products/oneapi/nvidia/)
for further details.
The CUDA backend should work on Windows or Linux operating systems with any GPU
with compute capability (SM version) sm_50 or above. The default SM version for
the NVIDIA CUDA backend is sm_50. Users of sm_3X devices can attempt to specify
the target architecture [ahead of time](#aot-target-architectures), provided
that they use a 11.X CUDA Runtime version, but some features may not be
supported. The CUDA backend has been tested with different Ubuntu Linux
distributions and a selection of supported CUDA toolkit versions and GPUs. The
backend is tested by a relevant device/toolkit prior to a oneAPI plugin release.
Go to the plugin release
[pages](https://developer.codeplay.com/products/oneapi/nvidia/) for further
details.
  1. Use 80-char line limit.
  2. linux -> Linux
  3. oneapi -> oneAPI

sycl/doc/GetStartedGuide.md Outdated Show resolved Hide resolved
sycl/doc/GetStartedGuide.md Outdated Show resolved Hide resolved
sycl/doc/GetStartedGuide.md Outdated Show resolved Hide resolved
sycl/doc/GetStartedGuide.md Outdated Show resolved Hide resolved
sycl/doc/GetStartedGuide.md Outdated Show resolved Hide resolved
JackAKirk and others added 8 commits August 21, 2023 15:04
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
Use 80-char line limit.
    linux -> Linux
    oneapi -> oneAPI

Co-authored-by: Alexey Bader <alexey.bader@intel.com>
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
Use 80-char line limit.
    linux -> Linux
    oneapi -> oneAPI

Co-authored-by: Alexey Bader <alexey.bader@intel.com>
- -> *

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
    linux -> Linux
    oneapi -> oneAPI

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@@ -179,7 +179,7 @@ CUDA backend has Windows support; Windows Subsystem for Linux (WSL) is not
needed to build and run the CUDA backend.

Enabling this flag requires an installation of at least
[CUDA 10.2](https://developer.nvidia.com/cuda-10.2-download-archive) on
[CUDA 11.5](https://developer.nvidia.com/cuda-11-5-0-download-archive) on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this to 11.0? Our plugins are aimed to target 11.0 so would be good if we had the same support in open source

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah is it definitely working now for 11.0?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it work for e.g. 10.2 or even earlier? Ideally we can support as many cuda versions as possible. It's been pointed out many times that clusters don't update their versions for long periods e.g. #10858 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as long as we keep intermittently checking that CUDA 10.2 is working we can keep this support. But the problem is that we don't have any CI for CUDA 10.2 whereas we do have CI for CUDA 11.0 (to test the releases internally)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are version checks for above/below 10.2 here:

// CU_POINTER_ATTRIBUTE_RANGE_START_ADDR was introduced in CUDA 10.2

Maybe with the right checks it < 10.2 could work fine. Or maybe it already works.

Copy link
Contributor

@hdelan hdelan Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that it already works. But we don't have CI to check every commit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intel/llvm CI started with 11.4 version.
We switched to 11.7 due to build issues of SYCL-CTS. #6545
Today we validate each commit with 12.1 version, but I unfortunately, I didn't record the reason for uplift from 11.7.

I would say that it already works. But we don't have CI to check every commit

Do we need to use 10.2 in CI instead of 12.1?
IIRC, all previous uplifts were agreed with @AerialMantis.

Copy link
Contributor Author

@JackAKirk JackAKirk Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that from the next release we are meant to be supporting 12.1, so it makes sense the CI tests that as the priority. It would be nice to keep track of whether older releases still work, so that systems that don't regularly update their toolkit can still stay supported; but I think that is more on codeplay to test from time to time, and when introducing new features that might break older versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dug through some older commits and I think that versions older than 10.2 should work for appropriate devices.
We can't easily test it since we don't have a system where 10.x or older versions are installed.
I've updated the doc accordingly.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk JackAKirk requested a review from hdelan August 25, 2023 14:02
@JackAKirk
Copy link
Contributor Author

@intel/llvm-gatekeepers could you please merge this?

@dm-vodopyanov dm-vodopyanov merged commit 3cec71f into intel:sycl Aug 28, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants