-
Notifications
You must be signed in to change notification settings - Fork 798
[SYCL][CUDA] Fix get_native interop for device #6649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for the fix! Could you please add a test as well? |
I've updated the test for the experimental interface so it covers both the legacy and experimental CUDA interop interfaces. |
Non-blocking comment: Tagging @smaslov-intel and @bader for more opinions. |
There is already sycl/include/sycl/ext/oneapi/experimental/backend/cuda.hpp from where the specialization is being moved from. I don't feel strongly but think it'd be good to keep everything backend-specific isolated in their respectful support headers. @gmlueck what do you think from SYCL spec perspective? |
I agree. The spec says that an application must include the backend interop header in order to use the backend interop APIs. Therefore, it's better to declare the CUDA interop APIs in the CUDA backend interop header. |
So to clarify a bit, currently we have the two following headers regarding CUDA interop:
The first one implements the backend interop in the way described by the CUDA backend specification in the following PR KhronosGroup/SYCL-Docs#197, in the long term the hope is that once this specification makes it into the main SYCL spec, this experimental header would be moved to Currently these projects are not using this experimental header, but are using older "legacy" CUDA interop. This older CUDA interop interface was mostly implemented using the generic interop template functions so didn't really make use of the target specific interop header, in fact this And the second header listed above, is currently empty and marked as deprecated, for two reasons, first because it wasn't needed for the CUDA interop to work, and secondly because this header would in theory be reserved for a backend specified in the main SYCL specification, so in theory this file should have been in
So circling back to this patch, technically the proper solution here would be to create a new Alternatively, and that's what this patch is doing, we can keep the specialization in the common code, which will work for projects using this "legacy" CUDA interop interface, and we'll eventually clean it up when the official CUDA backend specification is finalized (which maybe should be noted in a I don't really mind doing either option so let me know which one you'd prefer, but be aware that moving it out of the common code will be breaking for other projects. |
So the "legacy" CUDA interop is currently undocumented? I assume we will want to deprecate and remove that legacy version once we promote the experimental API to the supported header at |
That's correct. Currently the only thing that conflicts between the two interfaces is the specialization of the types defined in the SYCL specification, these are already in separate type traits headers and currently users can switch between the two by using the
The "legacy" interface doesn't need anything else, except for the device So when we deprecate the legacy interface, with the way things are organized currently it should be fairly straightforward to keep it around and guard it with a macro similar as above. |
We should not require the macro to enable the "real" interface once it is available. The best outcome is that users disambiguate the version of |
I agree, this would be ideal, the current headers aren't setup for it but I could probably re-factor them to make that work. However, the current "legacy" interface doesn't require users to include any headers, so that would still be a breaking change for them.
Yes, that's what I meant, currently the macro is for the "real" interface but once it's ratified we could reverse it to have the macro for the "legacy" interface, and yes that is also a breaking change. Essentially with the way the "legacy" interface is setup we will need to do a breaking change to nicely deprecate it and support both. And so my initial point is essentially that maybe it makes sense to hold off on that breaking change until we're ready to deprecate the "legacy" interface and do both at once, that way we can encourage users to update at the same time. |
@npmiller - Precommit fails to build it. Could you please investigate why? |
21a1e11
I'll keep an eye on the run but the pre-commit should be fixed, the issue was having the CUDA template specialization when the CUDA backend traits are not included (meaning CUDA plugin disabled), and only seemed to show up on older gcc 7.x. So I simply guarded the template specialization with the same macros as the backend traits include. I've also added similar macro guard for template specializations for OpenCL and LevelZero that are also in this file. In local testing these would cause the same issues with older gcc but I suspect the CI always has these two plugins enabled so it would never trip the issue. |
Failure in OCL x64 is known and should be fixed with intel/llvm-test-suite#1228. I have re-approved, so let's see what Precommit says. |
Precommit has been appeased. @v-klochkov | @smaslov-intel | @gmlueck - It looks like the ongoing discussion was initially considered non-blocking. Is that still the case? If so I think this is ready to be merged, but maybe the discussion should be moved to a discussion thread? |
Yes, that makes sense to me. |
From my side, it is still non-blocking comment. It was added to raise a question and pay attention on that. |
This patch fixes: #6635
In #6483, the implementation of
get_native
for device for the CUDA plugin was mistakenly moved to the experimental interface header, and so it was no longer available for the regular interface, causing build issues.For the CUDA plugin there is currently two interfaces for the CUDA interop, the "legacy" one which is used by projects such as oneMKL and oneDNN, and the "experimental" one, defined in the
sycl/ext/oneapi/experimental/backend/cuda.hpp
header which implements the interop as described in the CUDA backend specification proposed here: KhronosGroup/SYCL-Docs#197