-
Notifications
You must be signed in to change notification settings - Fork 730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][CUDA][HIP] Fix PI version reporting #5509
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This added information is definitely more informative than previously, but I wonder if it would be more in line with PI_DEVICE_INFO_VERSION
to report the compute capabilities of the CUDA device instead, i.e. cuDeviceGetAttribute
with CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR
and CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR
. See CL_DEVICE_VERSION
in clGetDeviceInfo
.
I don't know if HIP has a similar info query.
@npmiller there was a fail for OCL CPU job, which was caused by machine misconfiugration. I fixed that and restarted the job for you. Should be fine now. |
What the device version is meant to be is pretty confusing to me, it seems that it could be backend specific. For
But it does mention in that section that the meaning of any property can be redefined in the backend spec, and in the OpenCL backend spec it says the following:
Which also sounds a bit contradicting but suggests that it should be the highest OpenCL version reported there. And in fact AMD OpenCL devices report In OpenCL the same version property is defined as For level zero it's set to the LevelZero API version which is seemingly different from the LevelZero driver version. So all of that to say I'm really not sure what's the best thing to report here, I think the PI version makes quite a bit of sense when compared to the OpenCL backend, but it doesn't seem very helpful, compute capabilities would make a lot of sense but it doesn't seem like any other backend uses the device version that way. What do you think? As a side note, HIP does have similar compute capabilities version numbers but discourages using them, and advises to use the feature flags instead (AMD HIP API 4.5, section 5.39.2.6):
|
@npmiller - Thank you for the thorough analysis! I agree, the wording and intent of the info query is a little vague. I see what you mean, that it is hard to draw a comparison between the compute capability and the API versions reported by the other plugins. I am not opposed to reporting the PI version here, but my thinking is that reporting the compute capability gives the user more device-specific information. |
Additionally PI API version is available other ways, and reporting native device's API version is not. |
I see, yeah that makes sense it would be really helpful. I think it should be fine to just put the useful data in this field, I've updated it so that it gives the following output: CUDA plugin and HIP Nvidia plugin:
HIP AMD plugin:
It's going to lead to a bit of duplication on AMD with #5508 but I don't think that's a big deal. I had to update the function that was returning the device version string as all the plugins were going through the OpenCL path which trimmed the string we reported in the plugins down to just the numerical version. |
So after further discussions and reviewing the current SYCL specification, I've decided to file a ticket with the SYCL specification to seek clarifications on this: So I'll put this PR on hold until we get some clarifications on what this property is meant to be and what we're allowed to use it for. |
The SYCL specification PR has now been merged so now the device version is backend defined, so we should be fine to report the compute capabilities and define it as such in the CUDA backend specification. The CUDA backend specification is not finalized yet so this may need to be tweaked later on but it should be fine for now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Updated to remove the extra processing on OpenCL version strings to extract just the version number, the full version can now be returned directly, deferring the formatting to the OpenCL spec. This is a change in behavior but the previous behavior was incorrect as it should have been the SYCL version returned not the OpenCL version. And is reflected in the spec changes of KhronosGroup/SYCL-Docs#231 |
/verify with intel/llvm-test-suite#1019 |
5935c4c
to
b2bca2c
Compare
b2bca2c
to
fbc2c8e
Compare
All tests failures are expected, and are fixed by: And for one of the AMD failure by: |
/verify with intel/llvm-test-suite#1019 |
ping @steffenlarsen @smaslov-intel Sorry I keep forgetting about this PR, are we good to merge this and the matching tests PR: intel/llvm-test-suite#1019 ? Or does it need more reviews and/or rebasing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! @npmiller - Could you please push a merge commit to make sure it works with tip.
/verify with intel/llvm-test-suite#1019 |
1 similar comment
/verify with intel/llvm-test-suite#1019 |
@npmiller - It looks like there are new failing tests. Could you please address these? |
/verify with intel/llvm-test-suite#1019 |
/verify with intel/llvm-test-suite#1019 |
3 similar comments
/verify with intel/llvm-test-suite#1019 |
/verify with intel/llvm-test-suite#1019 |
/verify with intel/llvm-test-suite#1019 |
@npmiller, please, update your branch and move tests from intel/llvm-test-suite#1019 to sycl/test-e2e. |
Report the actual PI version rather than `0.0`.
Use the Compute Capability for the device version for Nvidia GPUs, and use the architecture for AMD GPUs.
See discussions on: intel/llvm-test-suite#1019
2c036d8
to
02dc606
Compare
Report the actual PI version rather than
0.0
.