Skip to content

[SYCL][Bindless] Update and add support for SPV_INTEL_bindless_image extension new revision #13753

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

DBDuncan
Copy link
Contributor

Add support to emit instructions that convert handles to images, samplers and sampled images

… extension new revision

Add support to emit instructions that convert handles to images, samplers and sampled images
@DBDuncan
Copy link
Contributor Author

There will be a corresponding SPIRV-LLVM-Translator pull request soon.

@DBDuncan
Copy link
Contributor Author

There will be a corresponding SPIRV-LLVM-Translator pull request soon.

Now up here: KhronosGroup/SPIRV-LLVM-Translator#2559

Needs to be merged first and then eventually pulled into DPC++. Then two tests in this PR should pass.

@DBDuncan DBDuncan marked this pull request as ready for review May 13, 2024 12:31
@DBDuncan DBDuncan requested review from a team as code owners May 13, 2024 12:31
@DBDuncan DBDuncan requested a review from aelovikov-intel May 13, 2024 12:31
// Image types used for generating SPIR-V
#ifdef __SYCL_DEVICE_ONLY__
template <int NDims>
using OCLImageTyRead =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this part of the specification or should it be moved into the detail namespace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah it should be in the detail namespace. Moved it.

return __invoke__ImageReadLod<DataT>(imageHandle.raw_handle, coords, level);
return __invoke__ImageReadLod<DataT>(
CONVERT_HANDLE_TO_SAMPLED_IMAGE(imageHandle.raw_handle,
OCLImageTyRead<coordSize>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't you put OCLImageTyRead inside CONVERT_HANDLE_TO_SAMPLED_IMAGE? Or better yet inside __invoke__ImageReadLod...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree that putting OCLImageTyRead inside CONVERT_HANDLE_TO_SAMPLED_IMAGE is better. But I am not sure about putting it inside __invoke__ImageReadLod. That would require also putting sampled_opencl_image_type inside __invoke__ImageReadLod

Because these functions are used to also create the PTX instructions for the CUDA backend I feel putting it inside __invoke__ImageReadLod would complicate things. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved OCLImageTyRead into CONVERT_HANDLE_TO_SAMPLED_IMAGE for now unless something different is decided.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, this file is outside SYCL RT codeownership, so the decision is ultimately yours :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave it as is for now. Potentially revisit this later.

@@ -230,6 +230,18 @@ template <typename SampledType, typename TempRetT, typename TempArgT>
extern __DPCPP_SYCL_EXTERNAL TempRetT __spirv_ImageSampleCubemap(SampledType,
TempArgT);

template <typename RetT>
extern __DPCPP_SYCL_EXTERNAL RetT
__spirv_ConvertHandleToImageINTEL(unsigned long);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsigned long is 32bit on windows?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use following code?

template <class RetT, class HandleT>
extern __DPCPP_SYCL_EXTERNAL RetT __spirv_ConvertHandleToImageINTEL(HandleT);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Added.

…side 'CONVERT_HANDLE_TO_SAMPLED_IMAGE' as sampled images only ever use read image type
Comment on lines 829 to 838
#if defined(__NVPTX__)
#define FETCH_UNSAMPLED_IMAGE(DataT, raw_handle, coords) \
__invoke__ImageFetch<DataT>(raw_handle, coords)
#elif defined(__SPIR__)
#define FETCH_UNSAMPLED_IMAGE(DataT, raw_handle, coords) \
__invoke__ImageRead<DataT>(raw_handle, coords)
#else
#define FETCH_UNSAMPLED_IMAGE(DataT, raw_handle, coords) \
__invoke__ImageFetch<DataT>(raw_handle, coords)
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these all the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite. In the CUDA backend, __invoke__ImageFetch is used for a basic unsampled image read. When this is encountered by llvm-spirv, it crashes as it tries to emit, OpImageFetch, which is not implemented in llvm-spirv. But for a basic unsampled image read, we want to emit OpImageRead instead. So __invoke__ImageRead is used when compiling to SPIR-V.

Interestingly, and slightly annoyingly, OpImageFetch must always return a vector of four components. But that is prob not a big deal and is something to think about much later when adding support for fetching data from sampled images.

Also, turns out, llvm-spirv does not have an assert to check if an instruction is fully implemented before trying to call Inst->init(); at line 185 of SPIRVInstruction.h and jumping to random memory. That was a bit annoying.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed one of the redundant branches.

@DBDuncan DBDuncan temporarily deployed to WindowsCILock May 14, 2024 15:09 — with GitHub Actions Inactive
@@ -10394,7 +10394,8 @@ static void getOtherSPIRVTransOpts(Compilation &C,
",+SPV_INTEL_fpga_argument_interfaces"
",+SPV_INTEL_fpga_invocation_pipelining_attributes"
",+SPV_INTEL_fpga_latency_control"
",+SPV_INTEL_task_sequence";
",+SPV_INTEL_task_sequence"
",+SPV_INTEL_bindless_images";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also update clang/test/Driver/sycl-spirv-ext.c to cover your new entry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed that. Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@DBDuncan
Copy link
Contributor Author

Friendly ping @aelovikov-intel

Copy link
Contributor

@aelovikov-intel aelovikov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments about tests could be resolved in a different PR.


// Macros are required because it is not legal for a function to return
// a variable of type 'opencl_image_type'.
#if defined(__SPIR__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd personally merge the #ifs into a single one providing three macros at once. Can be ignored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, actually after combining them they did look better. Done.

Comment on lines +97 to +101
int main() {

sycl::device dev;
sycl::queue q(dev);
auto ctxt = q.get_context();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need any of this for this test. One can use SYCL_EXTERNAL and only run device compilation to produce much compact IR. See https://github.com/intel/llvm/blob/sycl/sycl/test/check_device_code/vector/vector_as.cpp for an example.

try {
q.submit([&](sycl::handler &cgh) {
cgh.parallel_for<image_addition>(1, [=](sycl::id<1> id) {
#ifdef __SYCL_DEVICE_ONLY__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can become troublesome if customers have to structure their code like this. In the worst case it can lead to different lambda captures between host/device.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Customers do not need to structure their code like this. I was just intending to invoke the new __spirv_ builtins manually to verify they worked. The guards are needed as opencl image types do not compile on host.

There is the other test, sycl/test/extensions/bindless_images.cpp which does also check that the SPIR-V instructions are emitted correctly, but it does not check that __spirv_ConvertHandleToSamplerINTEL works correctly as it is currently not used in the SYCL bindless extension. So I wanted to keep this to ensure all of the new builtins were covered.

@DBDuncan
Copy link
Contributor Author

DBDuncan commented May 22, 2024

Comments about tests could be resolved in a different PR.

Good idea. I will double check with my team. Should be fine.

@DBDuncan
Copy link
Contributor Author

Comments about tests could be resolved in a different PR.

Good idea. I will double check with my team. Should be fine.

@aelovikov-intel On your comment that I can use SYCL_EXTERNAL like in vector_as.cpp, we think it is best to leave it as is as we don't think it is worth it at the moment to update to use SYCL_EXTERNAL and all the filecheck checks.

If ok for you, can you approve and I will move forward to getting this merged.

MrSidims pushed a commit to KhronosGroup/SPIRV-LLVM-Translator that referenced this pull request May 24, 2024
The updated SPV_INTEL_bindless_images spec can be seen in this PR: intel/llvm#13753

(cherry picked from commit e9a9a3f)
@DBDuncan
Copy link
Contributor Author

@intel/llvm-gatekeepers Should be ready to merge. There are some failing tests but all of them look unrelated to the changes here.

@steffenlarsen
Copy link
Contributor

The issue causing no_sycl_hpp_in_e2e_tests.cpp have been addressed by #13889.

@steffenlarsen steffenlarsen merged commit 83db85f into intel:sycl May 24, 2024
11 of 14 checks passed
KanclerzPiotr pushed a commit to KanclerzPiotr/SPIRV-LLVM-Translator that referenced this pull request Feb 7, 2025
…ion (KhronosGroup#2559)

The updated SPV_INTEL_bindless_images spec can be seen in this PR: intel/llvm#13753
KanclerzPiotr pushed a commit to KanclerzPiotr/SPIRV-LLVM-Translator that referenced this pull request Feb 10, 2025
…ion (KhronosGroup#2559)

The updated SPV_INTEL_bindless_images spec can be seen in this PR: intel/llvm#13753
KanclerzPiotr added a commit to KanclerzPiotr/SPIRV-LLVM-Translator that referenced this pull request Feb 10, 2025
The updated SPV_INTEL_bindless_images spec can be seen in this PR: intel/llvm#13753

Original PR KhronosGroup#2559
KanclerzPiotr added a commit to KanclerzPiotr/SPIRV-LLVM-Translator that referenced this pull request Feb 10, 2025
…ion (KhronosGroup#2559)

The updated SPV_INTEL_bindless_images spec can be seen in this PR: intel/llvm#13753
MrSidims pushed a commit to KhronosGroup/SPIRV-LLVM-Translator that referenced this pull request Feb 10, 2025
…ion (#2559) (#3002)

The updated SPV_INTEL_bindless_images spec can be seen in this PR:
intel/llvm#13753
MrSidims pushed a commit to KhronosGroup/SPIRV-LLVM-Translator that referenced this pull request Feb 10, 2025
…ion (#2559) (#3008)

The updated SPV_INTEL_bindless_images spec can be seen in this PR:
intel/llvm#13753
KanclerzPiotr pushed a commit to KanclerzPiotr/SPIRV-LLVM-Translator that referenced this pull request Feb 11, 2025
…#2559)

The updated SPV_INTEL_bindless_images spec can be seen in this PR: intel/llvm#13753
KanclerzPiotr added a commit to KanclerzPiotr/SPIRV-LLVM-Translator that referenced this pull request Feb 11, 2025
…on (KhronosGroup#2559)

The updated SPV_INTEL_bindless_images spec can be seen in this PR: intel/llvm#13753
MrSidims pushed a commit to KhronosGroup/SPIRV-LLVM-Translator that referenced this pull request Feb 11, 2025
…ion (#2559) (#3009)

The updated SPV_INTEL_bindless_images spec can be seen in this PR:
intel/llvm#13753
KanclerzPiotr pushed a commit to KanclerzPiotr/SPIRV-LLVM-Translator that referenced this pull request Feb 13, 2025
…#2559)

The updated SPV_INTEL_bindless_images spec can be seen in this PR: intel/llvm#13753
KanclerzPiotr added a commit to KanclerzPiotr/SPIRV-LLVM-Translator that referenced this pull request Feb 13, 2025
…on (KhronosGroup#2559)

The updated SPV_INTEL_bindless_images spec can be seen in this PR: intel/llvm#13753
MrSidims pushed a commit to KhronosGroup/SPIRV-LLVM-Translator that referenced this pull request Feb 13, 2025
…ion (#2559) (#3016)

The updated SPV_INTEL_bindless_images spec can be seen in this PR:
intel/llvm#13753
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants