-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][Doc] Add reduction properties extension #15213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
d349a86
[SYCL][Doc] Add reduction properties extension
Pennycook 3e6b79a
Bump SYCL 2020 revision
Pennycook 9cbcd38
Fix namespaces in usage example
Pennycook de8c5a7
Clarify deterministic behavior
Pennycook a33abc4
Clarify conditions that must hold for determinism
Pennycook 05d0cb4
Clarify determinism requirements further
Pennycook File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
246 changes: 246 additions & 0 deletions
246
sycl/doc/extensions/proposed/sycl_ext_oneapi_reduction_properties.asciidoc
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,246 @@ | ||
= sycl_ext_oneapi_reduction_properties | ||
|
||
:source-highlighter: coderay | ||
:coderay-linenums-mode: table | ||
|
||
// This section needs to be after the document title. | ||
:doctype: book | ||
:toc2: | ||
:toc: left | ||
:encoding: utf-8 | ||
:lang: en | ||
:dpcpp: pass:[DPC++] | ||
:endnote: —{nbsp}end{nbsp}note | ||
|
||
// Set the default source code type in this document to C++, | ||
// for syntax highlighting purposes. This is needed because | ||
// docbook uses c++ and html5 uses cpp. | ||
:language: {basebackend@docbook:c++:cpp} | ||
|
||
|
||
== Notice | ||
|
||
[%hardbreaks] | ||
Copyright (C) 2024 Intel Corporation. All rights reserved. | ||
|
||
Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks | ||
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by | ||
permission by Khronos. | ||
|
||
|
||
== Contact | ||
|
||
To report problems with this extension, please open a new issue at: | ||
|
||
https://github.com/intel/llvm/issues | ||
|
||
|
||
== Dependencies | ||
|
||
This extension is written against the SYCL 2020 revision 9 specification. All | ||
references below to the "core SYCL specification" or to section numbers in the | ||
SYCL specification refer to that revision. | ||
|
||
This extension also depends on the following other SYCL extensions: | ||
|
||
* link:../experimental/sycl_ext_oneapi_properties.asciidoc[ | ||
sycl_ext_oneapi_properties] | ||
|
||
|
||
== Status | ||
|
||
This is a proposed extension specification, intended to gather community | ||
feedback. Interfaces defined in this specification may not be implemented yet | ||
or may be in a preliminary state. The specification itself may also change in | ||
incompatible ways before it is finalized. *Shipping software products should | ||
not rely on APIs defined in this specification.* | ||
|
||
|
||
== Overview | ||
|
||
In order to maximize portability across different device types, the SYCL 2020 | ||
`reduction` interface gives implementers a significant amount of freedom in | ||
selecting the correct reduction algorithm to use for different types. | ||
|
||
In the majority of cases, a developer can trust an implementation to choose the | ||
best algorithm; however, there are situations in which a user may wish to | ||
constrain algorithm selection (e.g., to ensure run-to-run reproducibility). | ||
This extension introduces new compile-time properties for the `reduction` | ||
interface that enable developers to provide such constraints. | ||
|
||
|
||
== Specification | ||
|
||
=== Feature test macro | ||
|
||
This extension provides a feature-test macro as described in the core SYCL | ||
specification. An implementation supporting this extension must predefine the | ||
macro `SYCL_EXT_ONEAPI_REDUCTION_PROPERTIES` to one of the values defined in | ||
the table below. Applications can test for the existence of this macro to | ||
determine if the implementation supports this feature, or applications can test | ||
the macro's value to determine which of the extension's features the | ||
implementation supports. | ||
|
||
[%header,cols="1,5"] | ||
|=== | ||
|Value | ||
|Description | ||
|
||
|1 | ||
|Initial version of this extension. | ||
|=== | ||
|
||
=== `reduction` overload | ||
|
||
New `reduction` overloads are introduced to allow developers to attach | ||
compile-time properties to a reduction object. | ||
|
||
Each new overload has the same behavior as its corresponding definition in the | ||
SYCL 2020 specification unless the definition of a property passed in via the | ||
final `sycl::ext::oneapi::experimental::properties` parameter says otherwise. | ||
|
||
[source,c++] | ||
---- | ||
namespace sycl { | ||
|
||
template <typename BufferT, typename BinaryOperation, typename PropertyList> | ||
__unspecified__ reduction(BufferT vars, handler& cgh, BinaryOperation combiner, | ||
PropertyList properties); | ||
|
||
template <typename T, typename BinaryOperation, typename PropertyList> | ||
__unspecified__ reduction(T* var, BinaryOperation combiner, | ||
PropertyList properties); | ||
|
||
template <typename T, typename Extent, typename BinaryOperation, typename PropertyList> | ||
__unspecified__ reduction(span<T, Extent> vars, BinaryOperation combiner, | ||
PropertyList properties); | ||
|
||
template <typename BufferT, typename BinaryOperation, typename PropertyList> | ||
__unspecified__ | ||
reduction(BufferT vars, handler& cgh, const BufferT::value_type& identity, | ||
BinaryOperation combiner, PropertyList properties); | ||
|
||
template <typename T, typename BinaryOperation, typename PropertyList> | ||
__unspecified__ reduction(T* var, const T& identity, BinaryOperation combiner, | ||
PropertyList properties); | ||
|
||
template <typename T, typename Extent, typename BinaryOperation, typename PropertyList> | ||
__unspecified__ reduction(span<T, Extent> vars, const T& identity, | ||
BinaryOperation combiner, | ||
PropertyList properties); | ||
|
||
} | ||
---- | ||
|
||
=== Reduction properties | ||
|
||
New `reduction` properties are introduced to allow developers to constrain | ||
reduction algorithm selection based on desired behavior(s). Compile-time | ||
properties corresponding to existing runtime properties are also introduced to | ||
ensure that all information can be passed via a single property list. | ||
|
||
If a reduction kernel is submitted to a device that cannot satisfy the | ||
request for specific reduction behavior(s), the implementation must throw an | ||
`exception` with the `errc::feature_not_supported` error code. | ||
|
||
[source,c++] | ||
---- | ||
namespace sycl::ext::oneapi { | ||
|
||
struct deterministic_key { | ||
using value_t = property_value<deterministic_key>; | ||
}; | ||
inline constexpr deterministic_key::value_t deterministic; | ||
|
||
struct initialize_to_identity_key { | ||
using value_t = property_value<initialize_to_identity_key>; | ||
}; | ||
inline constexpr initialize_to_identity_key::value_t initialize_to_identity; | ||
|
||
} | ||
---- | ||
|
||
|=== | ||
|Property|Description | ||
|
||
|`deterministic` | ||
a|When two reductions both have this property, they are guaranteed to produce | ||
the same result when all of the following conditions hold: | ||
|
||
* Both reductions run on the same device. | ||
* Both reductions are invoked with the same launch configuration (i.e., `range` | ||
or `nd_range`). | ||
* The same values are contributed to each reduction. | ||
* The work-items in each reduction contribute those values in the same pattern | ||
and the same order. For example, if the first reduction contributes values | ||
_V1_, _V2_, and _V3_ (in that order) from a work-item with linear index _i_; | ||
then the second reduction must also contribute values _V1_, _V2_, and _V3_ | ||
(in that order) from the work-item with linear index _i_. | ||
|
||
[_Note:_ Work-items may contribute different values to a reduction because of | ||
other potential sources of non-determinism, such as calls to group algorithms, | ||
use of atomic operations, etc. _{endnote}_] | ||
|
||
|`initialize_to_identity` | ||
|Adds the same requirement as | ||
`sycl::property::reduction::initialize_to_identity`. | ||
|
||
|=== | ||
|
||
|
||
=== Usage example | ||
|
||
[source,c++] | ||
---- | ||
using syclex = sycl::ext::oneapi::experimental; | ||
|
||
float sum(sycl::queue q, float* input, size_t N) { | ||
|
||
float result = 0; | ||
{ | ||
sycl::buffer<float> buf{&result, 1}; | ||
|
||
q.submit([&](sycl::handler& h) { | ||
auto reduction = sycl::reduction(buf, h, sycl::plus<>(), syclex::properties(syclex::deterministic)); | ||
h.parallel_for(N, reduction, [=](size_t i, auto& reducer) { | ||
reducer += input[i]; | ||
}); | ||
} | ||
} | ||
return result; | ||
|
||
} | ||
|
||
... | ||
|
||
float x = sum(q, array, 1024); | ||
float y = sum(q, array, 1024); | ||
|
||
// NB: determinism guarantees bitwise reproducible reductions for floats | ||
assert(sycl::bit_cast<unsigned int>(x) == sycl::bit_cast<unsigned int>(y)); | ||
---- | ||
|
||
|
||
== Implementation notes | ||
|
||
This non-normative section provides information about one possible | ||
implementation of this extension. It is not part of the specification of the | ||
extension's API. | ||
|
||
Since SYCL implementations must support arbitrary types, we anticipate that | ||
many implementations will already have appropriate reduction variants available | ||
that satisfy the constraints imposed by these new properties. Implementing | ||
support for these new constraints may therefore be as straightforward as | ||
providing a new overload of `sycl::reduction` that overrides the algorithm | ||
selection process. | ||
|
||
The steps necessary to guarantee deterministic results are type-dependent. For | ||
integers and built-in combination operators, all implementations should be | ||
deterministic by default. For floating-point numbers and/or custom combination | ||
operators, it becomes necessary to ensure that the intermediate results from | ||
each work-item are always combined in the same order. | ||
|
||
|
||
== Issues | ||
|
||
None. |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.