This repository was archived by the owner on Mar 28, 2023. It is now read-only.

[SYCL] Fix tests using device version #1019

Open
wants to merge 12 commits into
base: intel
36 changes: 0 additions & 36 deletions SYCL/Basic/info_ocl_version.cpp

This file was deleted.

9 changes: 4 additions & 5 deletions SYCL/GroupAlgorithm/SYCL2020/support.h
@@ -11,11 +11,10 @@ bool isSupportedDevice(device D) {

   if (PlatformName.find("OpenCL") != std::string::npos) {
     std::string Version = D.get_info<info::device::version>();
-    size_t Offset = Version.find("OpenCL");
-    if (Offset == std::string::npos)
-      return false;
-    Version = Version.substr(Offset + 7, 3);
-    if (Version >= std::string("2.0"))
+
+    // Group collectives are mandatory in OpenCL 2.0 but optional in 3.0.
+    Version = Version.substr(7, 3);
+    if (Version >= "2.0" && Version < "3.0")
       return true;
   }

9 changes: 4 additions & 5 deletions SYCL/GroupAlgorithm/support.h
@@ -15,11 +15,10 @@ bool isSupportedDevice(device D) {

   if (PlatformName.find("OpenCL") != std::string::npos) {
     std::string Version = D.get_info<sycl::info::device::version>();
-    size_t Offset = Version.find("OpenCL");
-    if (Offset == std::string::npos)
-      return false;
-    Version = Version.substr(Offset + 7, 3);
-    if (Version >= std::string("2.0"))
+
+    // Group collectives are mandatory in OpenCL 2.0 but optional in 3.0.
+    Version = Version.substr(7, 3);
+    if (Version >= "2.0" && Version < "3.0")
Author

@Pennycook @steffenlarsen I had to update these as well, but it looks like these tests were never running on OpenCL.

The OpenCL plugin was trimming the version, so it always returned just the number without "OpenCL" in front. I believe the old check would therefore always hit the Offset == std::string::npos condition and return false.

As I understand it, the group collectives are mandatory starting in OpenCL 2.0 but optional again in OpenCL 3.0, and without using interop I don't think we have a way to know whether the implementation supports them. I also think these are mandatory in SYCL 2020, so I'm not too sure how we should handle it.

So I think this check is slightly better than before, but we may need to revisit it later on. At the very least it should fix the issues showing up in the CI for the CPU OpenCL device, which is 3.0 and seemingly doesn't work with the group collectives right now.
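
For illustration only, here is a rough sketch of the same range check done on parsed numbers instead of a fixed substr(7, 3) slice. The helper names (readNumber, parseOpenCLVersion, hasMandatoryGroupCollectives) and the sample version strings are hypothetical and not part of this PR; the only thing taken from the PR is the "OpenCL <major>.<minor> ..." format and the 2.0 <= version < 3.0 rule.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Reads a run of decimal digits starting at Pos and advances Pos past them.
// Returns false if there is no digit at Pos.
static bool readNumber(const std::string &S, std::size_t &Pos, int &Out) {
  const std::size_t Start = Pos;
  int Value = 0;
  while (Pos < S.size() && std::isdigit(static_cast<unsigned char>(S[Pos]))) {
    Value = Value * 10 + (S[Pos] - '0');
    ++Pos;
  }
  if (Pos == Start)
    return false;
  Out = Value;
  return true;
}

// Hypothetical helper (not in the PR): parses "OpenCL <major>.<minor> ...".
// Returns false when the "OpenCL " prefix or either number is missing.
bool parseOpenCLVersion(const std::string &V, int &Major, int &Minor) {
  const std::string Prefix = "OpenCL ";
  if (V.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  std::size_t Pos = Prefix.size();
  if (!readNumber(V, Pos, Major))
    return false;
  if (Pos >= V.size() || V[Pos] != '.')
    return false;
  ++Pos;
  return readNumber(V, Pos, Minor);
}

// Mirrors the rule in the diff: accept 2.x only, since group collectives are
// mandatory in OpenCL 2.0 but optional again in 3.0.
bool hasMandatoryGroupCollectives(const std::string &DeviceVersion) {
  int Major = 0, Minor = 0;
  return parseOpenCLVersion(DeviceVersion, Major, Minor) && Major == 2;
}

// Usage example with made-up version strings.
void usageExample() {
  assert(hasMandatoryGroupCollectives("OpenCL 2.1 (Build 0)"));
  assert(!hasMandatoryGroupCollectives("OpenCL 3.0 (Build 0)"));
  // A trimmed string without the prefix is rejected instead of misparsed.
  assert(!hasMandatoryGroupCollectives("3.0"));
}
```

Parsing the numbers avoids relying on lexicographic string comparison, which only orders versions correctly while every component is a single digit.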

Group algorithms and sub-groups are mandatory in SYCL.

I think we should somehow be checking for these missing features, so we can give a warning/error when somebody tries to use a backend that's missing certain things. @bashbaug might have some ideas about how to query things in this case. Using OpenCL interop might be the way to go.

       return true;
   }

13 changes: 12 additions & 1 deletion SYCL/SubGroup/helper.hpp
@@ -169,5 +169,16 @@ bool core_sg_supported(const device &Device) {
   auto Vec = Device.get_info<info::device::extensions>();
   if (std::find(Vec.begin(), Vec.end(), "cl_khr_subgroups") != std::end(Vec))
     return true;
-  return Device.get_info<info::device::version>() >= "2.1";
+
+  if (Device.get_backend() == sycl::backend::opencl) {

Why is this not internal to the OpenCL plugin?

Author

Because the OpenCL backend defines info::device::version as just passing through the whole version string from OpenCL, see:

I mean that the plugin would check the native version and report whether the "cl_khr_subgroups" extension is supported.
Similar to here: https://github.com/intel/llvm/blob/9008a5d28110f0fb847907ea4c8d2d5fe7af702b/sycl/plugins/opencl/pi_opencl.cpp#L609

Author

Oh, I see what you mean. We could maybe do that, but I'm not sure it's quite correct. Looking into it, I believe cl_khr_subgroups is now core in SYCL 2020, so all devices should support it, and the only thing we could check in theory is whether the sub-group size is 1. So maybe we could remove or simplify this a lot, but I'm not sure all the plugins already implement this correctly, so it would need some testing.

Could we leave it as-is in this PR that's just changing the version number, and I'll look into updating that in a follow-up?
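
As a rough sketch of the "sub-group size is 1" idea, the check could look like the following. It assumes the caller passes in the list that SYCL 2020 exposes through the info::device::sub_group_sizes query; the function name hasNonTrivialSubGroups is hypothetical, and nothing here has been checked against what the individual plugins actually report.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not in the PR): returns true if at least one of the
// reported sub-group sizes is larger than 1, i.e. sub-groups are more than a
// degenerate one-work-item group.
bool hasNonTrivialSubGroups(const std::vector<std::size_t> &Sizes) {
  return std::any_of(Sizes.begin(), Sizes.end(),
                     [](std::size_t Size) { return Size > 1; });
}

// Usage example with made-up size lists.
void subGroupUsageExample() {
  assert(hasNonTrivialSubGroups({8, 16, 32}));
  assert(!hasNonTrivialSubGroups({1}));
}
```

An empty list is treated as "no usable sub-groups" here, which is a design choice of this sketch rather than something the discussion settles.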

ok, sure

+    // Extract the numerical version from the version string. OpenCL version
+    // strings have the format "OpenCL <major>.<minor> <vendor specific data>".
+    std::string ver = Device.get_info<info::device::version>().substr(7, 3);
+
+    // cl_khr_subgroups was core in OpenCL 2.1 and 2.2, but went back to
+    // optional in 3.0
+    return ver >= "2.1" && ver < "3.0";
+  }

+  return false;

The change to the device version stuff looks good to me, but I'm a bit confused -- do we currently return false for all the non-OpenCL backends? Or do the NVIDIA and AMD backends report support for cl_khr_subgroups in their extensions list?

Author

Yeah, I think the initial intent was for backends supporting it to return cl_khr_subgroups in their extension string. I believe the extra version check for OpenCL is there because this extension became core in OpenCL 2.1, so devices stopped reporting the extension despite supporting it.

In theory we should probably report it from the Nvidia and AMD backends, but I don't think we currently do. It's not ideal: because this is a runtime check, the tests mostly look like they're working even when they're disabled.
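
To make the decision described above concrete, here is a standalone sketch of the logic in core_sg_supported with the device queries replaced by plain parameters. The function name, the parameters, and the sample strings are hypothetical; the rules themselves (extension listed, or OpenCL backend with version in [2.1, 3.0)) are the ones in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for core_sg_supported (not the PR's code): the
// extension list, backend, and version string are passed in directly.
bool coreSubGroupsReported(bool IsOpenCLBackend,
                           const std::vector<std::string> &Extensions,
                           const std::string &Version) {
  // Any backend may advertise support through its extension list.
  if (std::find(Extensions.begin(), Extensions.end(), "cl_khr_subgroups") !=
      Extensions.end())
    return true;

  // The version fallback only makes sense for OpenCL version strings.
  if (!IsOpenCLBackend)
    return false;

  // Guard the fixed-offset slice so a short string cannot make substr throw.
  if (Version.size() < 10)
    return false;

  // "OpenCL <major>.<minor> ...": characters 7..9 hold "<major>.<minor>".
  const std::string Ver = Version.substr(7, 3);
  return Ver >= "2.1" && Ver < "3.0";
}

// Usage example with made-up inputs.
void coreSubGroupsUsageExample() {
  assert(coreSubGroupsReported(true, {}, "OpenCL 2.1 vendor"));
  assert(!coreSubGroupsReported(true, {}, "OpenCL 3.0 vendor"));
  // A non-OpenCL backend that lists no extension ends up reported as
  // unsupported, which is how tests can be skipped without anyone noticing.
  assert(!coreSubGroupsReported(false, {}, "CUDA 11.4"));
}
```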

Ok, that's pretty confusing. Thanks for explaining it.

I think long-term we should look to remove this check entirely. Sub-groups are a core feature of SYCL 2020 and should work everywhere. But I'm happy for that to be done as part of a separate PR.

}