[SYCL] Extend broadcast to TriviallyCopyable types #2160
Uses the TriviallyCopyable shuffle approach for broadcasts. Note that this works for both work-groups and sub-groups, because OpGroupBroadcast is defined as supporting both groups in SPIR-V. Signed-off-by: John Pennycook <john.pennycook@intel.com>
Resolved merge conflicts.
@vladimirlaz, @AlexeySachkov: I think this one is ready for review now.
Looks good to me
One rather minor comment about helper function naming
Functionality is not specific to shuffles. Signed-off-by: John Pennycook <john.pennycook@intel.com>
Resolve build failures from level0 changes. Signed-off-by: John Pennycook <john.pennycook@intel.com>
LGTM
…2160)
In some cases, we will see IR with the following:
    @__spirv_BuiltInGlobalInvocationId = external dso_local local_unnamed_addr addrspace(1) constant <3 x i64>, align 32
    ...
    %0 = load <6 x i32>, ptr addrspace(1) @__spirv_BuiltInGlobalInvocationId, align 32
    %1 = extractelement <6 x i32> %0, i64 0
Note the global type and load type are different. Change the handling of vector loads from vector globals to reconstruct the global vector type and then bitcast to the load type.
Thanks to @jcranmer-intel for helping me find the simplest solution.
Original commit: KhronosGroup/SPIRV-LLVM-Translator@c7ec63a
Query out and use local size set in program IL in CL adapter.
Uses the TriviallyCopyable shuffle approach for broadcasts.
Note that this works for both work-groups and sub-groups, because
OpGroupBroadcast is defined as supporting both groups in SPIR-V.
Signed-off-by: John Pennycook <john.pennycook@intel.com>
Closes #1885.

It's become clear while implementing this functionality that some refactoring of `spirv.hpp` and `group_algorithms.hpp` would be nice, since there is a lot of code duplication to handle cases that are essentially identical except for different types of cast. I think such refactoring is beyond the scope of this PR, as the requirements of exactly which types should be supported and what casts are necessary are still evolving.
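
For readers unfamiliar with the idea, here is a rough host-side sketch of how a broadcast can be extended to any TriviallyCopyable type: copy the object into fixed-width integer chunks, broadcast each chunk with a scalar primitive, and copy the bytes back. This is not the code in this PR. `native_broadcast`, `generic_broadcast`, `Sample`, and `broadcast_demo_ok` are hypothetical names, the 32-bit chunk width is an assumption, and a `std::vector` with one element per work-item stands in for a real work-group or sub-group.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a native scalar broadcast (roughly what
// OpGroupBroadcast offers for a 32-bit integer). per_item[i] is the chunk
// held by simulated work-item i; every item receives the source's chunk.
inline std::uint32_t native_broadcast(const std::vector<std::uint32_t>& per_item,
                                      std::size_t source) {
  return per_item[source];
}

// Simulated broadcast of an arbitrary TriviallyCopyable T.
// items[i] is the value held by work-item i; the result holds what each
// work-item sees after the broadcast.
template <typename T>
std::vector<T> generic_broadcast(const std::vector<T>& items, std::size_t source) {
  static_assert(std::is_trivially_copyable<T>::value,
                "byte-wise broadcast is only valid for TriviallyCopyable types");
  assert(source < items.size());

  std::vector<T> results = items;  // storage that is overwritten chunk by chunk
  std::vector<std::uint32_t> chunks(items.size());

  for (std::size_t offset = 0; offset < sizeof(T); offset += sizeof(std::uint32_t)) {
    // The final chunk may be shorter than 32 bits.
    const std::size_t n = std::min(sizeof(std::uint32_t), sizeof(T) - offset);

    // Each work-item packs its own bytes into an integer chunk.
    for (std::size_t i = 0; i < items.size(); ++i) {
      std::uint32_t chunk = 0;
      std::memcpy(&chunk,
                  reinterpret_cast<const unsigned char*>(&items[i]) + offset, n);
      chunks[i] = chunk;
    }

    // One scalar broadcast per chunk.
    const std::uint32_t value = native_broadcast(chunks, source);

    // Each work-item unpacks the received bytes into its result object.
    for (std::size_t i = 0; i < results.size(); ++i) {
      std::memcpy(reinterpret_cast<unsigned char*>(&results[i]) + offset, &value, n);
    }
  }
  return results;
}

// Illustrative user-defined type; not taken from the PR.
struct Sample {
  float x;
  std::int16_t y;
  char tag;
};

// Returns true if every simulated work-item ends up with item 2's value.
inline bool broadcast_demo_ok() {
  const std::vector<Sample> items = {
      {1.0f, 10, 'a'}, {2.0f, 20, 'b'}, {3.0f, 30, 'c'}, {4.0f, 40, 'd'}};
  const std::vector<Sample> out = generic_broadcast(items, 2);
  for (const Sample& s : out) {
    if (s.x != 3.0f || s.y != 30 || s.tag != 'c') return false;
  }
  return out.size() == items.size();
}
```

Because the scalar primitive only ever sees integers, the same loop would serve work-groups and sub-groups alike, which matches the note above about OpGroupBroadcast. The casts needed for each supported type are the part the description says is still evolving.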