-
Notifications
You must be signed in to change notification settings - Fork 786
[SYCL] Add sycl_ext_named_sub_group_sizes kernel properties #12335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -95,9 +95,6 @@ struct property_value<work_group_size_hint_key, | |||
template <uint32_t Size> | |||
struct property_value<sub_group_size_key, | |||
std::integral_constant<uint32_t, Size>> { | |||
static_assert(Size != 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that we need it if we want to re-use existing sub_group_size
property, but at the same time we are now allowing our users to set sub-group size to zero, which could be viewed as a user experience setback.
I don't think that it is of Critical or High importance, though, because sycl_ext_oneapi_kernel_properties
does not exactly says that the argument can't be zero and SYCL 2020 spec doesn't even say that it can't be a negative value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative is to do similar to automatic
and simply give it value max - 1
, then specialize it to set the value in the generated IR to 0. That way we get to keep this limitation, although it means another magic value the users should not explicitly use. Hopefully there's never a use-case for a 4294967295 or 4294967294 size sub-group. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think this would be good to address. Using 0 as a magic value seems more likely to be hit by users than max and max-1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in 3dc8731. I'll need to update the handling in translator, as 0 was treated as the special value to signal adding the named sub group size execution mode to the SPIRV module, but now that should be 4294967295 (= -1).
q.parallel_for<class Kernel2>(ndr, P2, [=](auto id) {}); | ||
} | ||
|
||
// CHECK: ![[SGSizeAttr]] = !{i32 0} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// CHECK: ![[SGSizeAttr]] = !{i32 0} | |
// CHECK: ![[SGSizeAttr]] = !{i32 0} | |
|
||
struct named_sub_group_size { | ||
static constexpr uint32_t primary = 0; | ||
static constexpr uint32_t automatic = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-1
for an unsigned integer is an interesting value. Why did you go with that one? Also generally interesting to have this be a struct rather than an enum, but changing that would require an extension change, which could be done separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but I thought there was kind of no choice with how the spec for the extension is defined right now.
My thought process is that since the spec defines the named sub group size properties as values of the sub_group_size
property, if, for example, we launch a kernel with the property sub_group_size_automatic
, this is the same as passing it sub_group_size<automatic>
. so automatic
(or primary
) cannot equal the common sub group sizes like 1
, 2
, 4
, etc. without interfering with the regular behavior of the sub_group_size
property. so I assigned primary to 0
(would not be a sensible sub group size) and automatic to -1
(as an unsigned integer, would be an unrealisticly sized sub group size).
Or did you just mean using a "negative" value for an unsigned value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I partially misunderstood the extension. I think it's fine, though I would personally prefer an unsigned value for this. Not blocking.
Spec: intel/llvm#11301 More accurately, this PR adds support for the named subgroup related features of SPV_INTEL_subgroup_requirements to support implementation of sycl_ext_named_sub_group_sizes (also see intel/llvm#12335). The features related to subgroup lane mapping are not added yet.
Spec: #11301 More accurately, this PR adds support for the named subgroup related features of SPV_INTEL_subgroup_requirements to support implementation of sycl_ext_named_sub_group_sizes (also see #12335). The features related to subgroup lane mapping are not added yet. Original commit: KhronosGroup/SPIRV-LLVM-Translator@43acfef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days. |
This pull request was closed because it has been stalled for 30 days with no activity. |
This PR adds the kernel properties required for sycl_ext_named_sub_group_sizes. When
sub_group_size_primary
is attached to a kernel, the kernel has its!intel_reqd_sub_group_size
contain the value -1. Whensub_group_size_automatic
is attached to a kernel, the kernel does not receive any!intel_reqd_sub_group_size
metadata. From an optional kernel features/sycl-post-link
perspective, these behaviors are aligned with the specification, allowingsub_group_size_automatic
marked kernels to be bundled with kernels with no attachedsub_group_size
property, while kernels marked withsub_group_size_primary
are bundled separately with respect to other kernels marked with a required sub group size.