-
Notifications
You must be signed in to change notification settings - Fork 131
[SYCL] Fix tests using device version #1019
base: intel
Are you sure you want to change the base?
Conversation
We still need to test |
So For the other plugins the specific format, if any will need to be defined in the backend specs, there is one being worked on for the CUDA backend but it's not yet finalized and for the other backends they either don't have a backend spec, or don't define the format of this property yet so I'm not sure we can properly test this at this point, beyond what |
@npmiller : This appears stale. Do you want to update or close this? |
The device version property can now be defined differently for different plugins so this test is only valid for the OpenCL plugin.
Good catch, I forgot about this, I've rebased it and the associated intel/llvm PR: |
@@ -169,5 +169,14 @@ bool core_sg_supported(const device &Device) { | |||
auto Vec = Device.get_info<info::device::extensions>(); | |||
if (std::find(Vec.begin(), Vec.end(), "cl_khr_subgroups") != std::end(Vec)) | |||
return true; | |||
return Device.get_info<info::device::version>() >= "2.1"; | |||
|
|||
if (Device.get_backend() == sycl::backend::opencl) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this not internal to opencl plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the OpenCL backend defines info::device::version
as just passing through the whole version string from OpenCL, see:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that plugin would check native version and return if "cl_khr_subgroups" extension is supported or not.
Similar to here: https://github.com/intel/llvm/blob/9008a5d28110f0fb847907ea4c8d2d5fe7af702b/sycl/plugins/opencl/pi_opencl.cpp#L609
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see what you mean, we could maybe do that, but I'm not sure it's quite correct, looking into it, I believe cl_khr_subgroups
is now core in SYCL2020, so all devices should support it, and the only thing we could check in theory is if the sub-group size is 1
, so maybe we could remove/simplify this a lot, but I'm not sure all the plugins already implement this correctly so it would need some testing.
Could we leave it as-is in this PR that's just changing the version number, and I'll look into updating that in a follow-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, sure
@Pennycook - Friendly ping. |
return ver.substr(7, 3) >= "2.1"; | ||
} | ||
|
||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change to the device version stuff looks good to me, but I'm a bit confused -- do we currently return false for all the non-OpenCL backends? Or do the NVIDIA and AMD backends report support for cl_khr_subgroups in their extensions list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think the initial intent was for backends supporting it to return cl_khr_subgroups
in their extension string. And I believe the extra version check for OpenCL is that this extension became core in OpenCL 2.1, so the devices stopped reporting the extension despite supporting it.
In theory we should probably report it from the Nvidia and AMD backends but I don't think we currently do. It's not ideal because since this is a runtime check it mostly looks like the tests are working even if they're disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that's pretty confusing. Thanks for explaining it.
I think long-term we should look to remove this check entirely. Sub-groups are a core feature of SYCL 2020 and should work everywhere. But I'm happy for that to be done as part of a separate PR.
|
||
// Group collectives are mandatory in OpenCL 2.0 but optional in 3.0. | ||
Version = Version.substr(7, 3); | ||
if (Version >= "2.0" && Version < "3.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Pennycook @steffenlarsen I had to update these as well, but it looks like these tests were never running on OpenCL.
Since the OpenCL plugin was trimming the version it always returned just the number without OpenCL
in front, so I believe it would always get caught by the Offset == std::string::npos
condition and exit.
As I understand it, the group collectives are mandatory starting in OpenCL 2.0 but they're optional again in OpenCL 3.0, and without using interop I don't think we have a way to know if the implementation supports them. But also I think these are mandatory in SYCL2020 so I'm not too sure how we should handle it.
So I think this check is slightly better than before, but we may need to revisit this later on, at the very least it should fix the issues showing up in the CI for the CPU OpenCL which is 3.0 and seemingly doesn't work with the group collectives right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Group algorithms and sub-groups are mandatory in SYCL.
I think we should somehow be checking for these missing features, so we can give a warning/error when somebody tries to use a backend that's missing certain things. @bashbaug might have some ideas about how to query things in this case. Using OpenCL interop might be the way to go.
The device version property can now be defined differently for different
backends, and for OpenCL it is now deferred to the OpenCL property so the formatting is not guaranteed by the SYCL spec.
see:
This PR the test failures in intel/llvm#5509