[SYCL] Extend broadcast to TriviallyCopyable types #2160

Merged
merged 6 commits into from
Jul 31, 2020

Pennycook
Contributor

Uses the TriviallyCopyable shuffle approach for broadcasts.

Note that this works for both work-groups and sub-groups, because
OpGroupBroadcast is defined as supporting both groups in SPIR-V.

Signed-off-by: John Pennycook <john.pennycook@intel.com>


Closes #1885. It has become clear while implementing this functionality that some refactoring of spirv.hpp and group_algorithms.hpp would be worthwhile, since there is a lot of duplicated code handling cases that are essentially identical except for the type of cast. I think such refactoring is beyond the scope of this PR, as the requirements for exactly which types should be supported and which casts are necessary are still evolving.
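
The description names the TriviallyCopyable approach without spelling out its mechanics. Below is a minimal host-side sketch, assuming the idea is to copy an object's bytes into fixed-width integer chunks, broadcast each chunk with a primitive that only understands integers, and reassemble the bytes on the receiving side. The names `native_broadcast_u32` and `generic_broadcast` are hypothetical, and a group is simulated as a vector holding one value per lane. This is plain C++, not the code in spirv.hpp or group_algorithms.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a native broadcast that only supports 32-bit
// integers: returns the word held by lane `src`, as every lane would see it.
inline std::uint32_t native_broadcast_u32(
    const std::vector<std::uint32_t>& lane_words, std::size_t src) {
  assert(src < lane_words.size());
  return lane_words[src];
}

// Broadcast any trivially copyable T by moving it chunk by chunk through the
// integer-only primitive. Assumption of this sketch: T is also default
// constructible, so a result object can be created before the bytes arrive.
template <typename T>
T generic_broadcast(const std::vector<T>& lane_values, std::size_t src) {
  static_assert(std::is_trivially_copyable<T>::value,
                "byte-wise copies are only valid for trivially copyable types");
  constexpr std::size_t chunk = sizeof(std::uint32_t);
  constexpr std::size_t n_chunks = (sizeof(T) + chunk - 1) / chunk;

  unsigned char result_bytes[sizeof(T)];
  for (std::size_t c = 0; c < n_chunks; ++c) {
    const std::size_t offset = c * chunk;
    // The last chunk may be shorter than a full word.
    const std::size_t remaining = sizeof(T) - offset;
    const std::size_t len = remaining < chunk ? remaining : chunk;

    // Each lane packs chunk c of its own value into a zero-initialised word.
    std::vector<std::uint32_t> words(lane_values.size(), 0);
    for (std::size_t lane = 0; lane < lane_values.size(); ++lane) {
      const unsigned char* bytes =
          reinterpret_cast<const unsigned char*>(&lane_values[lane]);
      std::memcpy(&words[lane], bytes + offset, len);
    }

    // Broadcast the word, then unpack it into the result buffer.
    const std::uint32_t word = native_broadcast_u32(words, src);
    std::memcpy(result_bytes + offset, &word, len);
  }

  T result;
  std::memcpy(&result, result_bytes, sizeof(T));
  return result;
}
```

Because only bytes cross the primitive, the same path serves pointers (the case raised in #1885), floating-point values, and small structs, without a dedicated cast for each type.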

@Pennycook Pennycook marked this pull request as ready for review July 22, 2020 21:27
@Pennycook Pennycook requested review from AlexeySachkov and a team as code owners July 22, 2020 21:27
@Pennycook Pennycook requested a review from vladimirlaz July 22, 2020 21:27
@Pennycook Pennycook added the enhancement and spec extension labels Jul 22, 2020
@Pennycook
Contributor Author

Resolved merge conflicts.

@Pennycook
Contributor Author

@vladimirlaz, @AlexeySachkov: I think this one is ready for review now.

AlexeySachkov
AlexeySachkov previously approved these changes Jul 30, 2020
Contributor

@AlexeySachkov AlexeySachkov left a comment

Looks good to me

One rather minor comment about helper function naming

Functionality is not specific to shuffles.

Signed-off-by: John Pennycook <john.pennycook@intel.com>
Resolve build failures from level0 changes.

Signed-off-by: John Pennycook <john.pennycook@intel.com>
Contributor

@vladimirlaz vladimirlaz left a comment

LGTM

@bader bader merged commit df6d715 into intel:sycl Jul 31, 2020
jsji pushed a commit that referenced this pull request Oct 5, 2023
…2160)

In some cases, we will see IR like the following:

@__spirv_BuiltInGlobalInvocationId = external dso_local local_unnamed_addr addrspace(1) constant <3 x i64>, align 32

...

%0 = load <6 x i32>, ptr addrspace(1) @__spirv_BuiltInGlobalInvocationId, align 32
%1 = extractelement <6 x i32> %0, i64 0
Note the global type and load type are different. Change the handling of vector loads from vector globals to reconstruct the global vector type and then bitcast to the load type.
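
As a rough illustration of why the two types can be reconciled with a bitcast, the sketch below models the same situation at the byte level in C++: a global declared as three 64-bit elements is read with its declared type and then reinterpreted as six 32-bit elements of the same total size. The names `GlobalVec`, `LoadVec`, `load_as`, and `extract_element` are hypothetical; this is not the translator's code, only a model of the "reconstruct, then bitcast" idea.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using GlobalVec = std::array<std::uint64_t, 3>;  // models <3 x i64>
using LoadVec = std::array<std::uint32_t, 6>;    // models <6 x i32>

// Read the global with its declared type, then reinterpret the same bytes
// as the type the load instruction asked for.
inline LoadVec load_as(const GlobalVec& global) {
  static_assert(3 * sizeof(std::uint64_t) == 6 * sizeof(std::uint32_t),
                "a bitcast requires both vector types to have the same size");
  const GlobalVec loaded = global;  // load using the global's own type
  LoadVec out;
  std::memcpy(out.data(), loaded.data(), 6 * sizeof(std::uint32_t));
  return out;
}

// Counterpart of extractelement on the reinterpreted vector.
inline std::uint32_t extract_element(const LoadVec& v, std::size_t i) {
  assert(i < v.size());
  return v[i];
}
```

Which half of each 64-bit element ends up at an even index depends on the target's byte order, so the sketch makes no claim about individual element values.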

Thanks to @jcranmer-intel for helping me find the simplest solution.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@c7ec63a
Chenyang-L pushed a commit that referenced this pull request Feb 18, 2025
Query out and use local size set in program IL in CL adapter.
Labels
enhancement New feature or request spec extension All issues/PRs related to extensions specifications
Development

Successfully merging this pull request may close these issues.

support subgroup broadcast for pointers
4 participants