-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][Doc] Add SYCL_INTEL_FPGA_data_flow_pipes_properties extension #5839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,280 @@ | ||||||
= SYCL_INTEL_FPGA_data_flow_pipes_properties | ||||||
|
||||||
:source-highlighter: coderay | ||||||
:coderay-linenums-mode: table | ||||||
|
||||||
// This section needs to be after the document title. | ||||||
:doctype: book | ||||||
:toc2: | ||||||
:toc: left | ||||||
:encoding: utf-8 | ||||||
:lang: en | ||||||
|
||||||
:blank: pass:[ +] | ||||||
|
||||||
// Set the default source code type in this document to C++, | ||||||
// for syntax highlighting purposes. This is needed because | ||||||
// docbook uses c++ and html5 uses cpp. | ||||||
:language: {basebackend@docbook:c++:cpp} | ||||||
|
||||||
// This is necessary for asciidoc, but not for asciidoctor | ||||||
:cpp: C++ | ||||||
|
||||||
== Introduction | ||||||
IMPORTANT: This specification is a draft. | ||||||
|
||||||
NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are | ||||||
trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. | ||||||
used by permission by Khronos. | ||||||
|
||||||
NOTE: This document is better viewed when rendered as html with asciidoctor. | ||||||
GitHub does not render image icons. | ||||||
|
||||||
This document describes an extension that adds compile-time constant properties | ||||||
and static member functions to pipes. | ||||||
|
||||||
== Notice | ||||||
|
||||||
Copyright (c) 2022 Intel Corporation. All rights reserved. | ||||||
|
||||||
== Status | ||||||
|
||||||
Working Draft | ||||||
|
||||||
This is a preview extension specification, intended to provide early access to | ||||||
a feature for review and community feedback. When the feature matures, this | ||||||
specification may be released as a formal extension. | ||||||
|
||||||
Because the interfaces defined by this specification are not final and are | ||||||
subject to change they are not intended to be used by shipping software | ||||||
products. | ||||||
|
||||||
== Version | ||||||
|
||||||
Built On: {docdate} + | ||||||
Revision: A | ||||||
|
||||||
== Contact | ||||||
|
||||||
Peter Colberg, Intel (peter 'dot' colberg 'at' intel 'dot' com) | ||||||
|
||||||
== Contributors | ||||||
|
||||||
Bo Lei, Intel + | ||||||
Marco Jacques, Intel + | ||||||
Joe Garvey, Intel + | ||||||
Aditi Kumaraswamy, Intel + | ||||||
Robert Ho, Intel + | ||||||
Sherry Yuan, Intel + | ||||||
Peter Colberg, Intel | ||||||
|
||||||
== Dependencies | ||||||
|
||||||
This extension is written against the SYCL 2020 specification, revision 3. | ||||||
|
||||||
It also depends on the `SYCL_INTEL_data_flow_pipes` and | ||||||
`sycl_ext_oneapi_properties` extensions. | ||||||
|
||||||
== Overview | ||||||
|
||||||
This extension introduces properties that establish differences in the | ||||||
implementation of `sycl::pipe`. These properties are FPGA specific. An example | ||||||
of the syntax can be seen below. | ||||||
|
||||||
[source,c++] | ||||||
---- | ||||||
using pipe = pipe<class some_pipe, int, decltype(properties{uses_ready<true>})>; | ||||||
---- | ||||||
|
||||||
== Feature test macro | ||||||
|
||||||
This extension provides a feature-test macro as described in the core SYCL | ||||||
specification section 6.3.3 "Feature test macros". Therefore, an implementation | ||||||
supporting this extension must predefine the macro | ||||||
`SYCL_EXT_INTEL_FPGA_PIPE_PROPERTIES` to one of the values defined in the table | ||||||
below. Applications can test for the existence of this macro to determine if | ||||||
the implementation supports this feature, or applications can test the macro's | ||||||
value to determine which of the extension's APIs the implementation supports. | ||||||
|
||||||
[%header,cols="1,5"] | ||||||
|=== | ||||||
|Value |Description | ||||||
|1 |Initial extension version. Base features are supported. | ||||||
|=== | ||||||
|
||||||
=== Pipe properties | ||||||
|
||||||
Below is a list of compile-time-constant properties which `pipe` supports. | ||||||
|
||||||
```c++ | ||||||
namespace sycl::ext::intel::experimental { | ||||||
|
||||||
struct min_capacity { | ||||||
template<int Capacity> | ||||||
using value_t = property_value<min_capacity, std::integral_constant<int, Capacity>>; | ||||||
}; | ||||||
|
||||||
struct ready_latency { | ||||||
template<int Latency> | ||||||
using value_t = property_value<ready_latency, std::integral_constant<int, Latency>>; | ||||||
}; | ||||||
|
||||||
struct bits_per_symbol { | ||||||
template<int Bits> | ||||||
using value_t = property_value<bits_per_symbol, std::integral_constant<int, Bits>>; | ||||||
}; | ||||||
|
||||||
struct uses_valid { | ||||||
template<bool Valid> | ||||||
using value_t = property_value<uses_valid, std::integral_constant<bool, Valid>>; | ||||||
}; | ||||||
|
||||||
struct uses_ready { | ||||||
template<bool Ready> | ||||||
using value_t = property_value<uses_ready, std::integral_constant<bool, Ready>>; | ||||||
}; | ||||||
|
||||||
struct in_csr { | ||||||
template<bool Csr> | ||||||
using value_t = property_value<in_csr, std::integral_constant<bool, Csr>>; | ||||||
}; | ||||||
|
||||||
Comment on lines
+137
to
+141
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we potentially use the same in_csr property defined in device global? Given the behaviour will be similar There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wouldn't that mean that the device_global extension needs to be enabled to use this property? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we could define the property in both extensions, so you could use it with either one. You would not need to enable the device global extension just to use the property. Note that there is a proposal to rename the "implement_in_csr" property like this:
Does that new spelling still make sense for pipes? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it makes sense, and it also makes this property more extensible. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think "dedicated" isn't quite appropriate for host pipes. We can and will share a single hardware pipe among multiple logical pipes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nothing comes to mind immediately (but that doesn't mean there isn't one that makes sense for both). @GarveyJoe do you have any ideas? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Adding myself There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "conduit" or "streaming" could both work. Alternatively, we could go back to it being a bool. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I could see either of those for pipe, but it's less clear to me they're descriptive of device_global (@artemrad ?) For bool, how confident are we won't eventually want another option? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should in_csr be a separate protocol? It isn't an avalon-st interface anymore. It's closer to an avalon-mm interface. Some of the protocol-dependent properties don't even make sense for an "in_csr" pipe such as ready_latency. |
||||||
struct first_symbol_in_high_order_bits { | ||||||
template<bool HighOrder> | ||||||
using value_t = property_value<first_symbol_in_high_order_bits, std::integral_constant<bool, HighOrder>>; | ||||||
}; | ||||||
|
||||||
struct protocol { | ||||||
enum class protocol_name { | ||||||
avalon, | ||||||
}; | ||||||
|
||||||
template<protocol_name Protocol> | ||||||
using value_t = property_value<protocol, std::integral_constant<protocol_name, Protocol>>; | ||||||
}; | ||||||
|
||||||
} // namespace sycl::ext::intel::experimental | ||||||
``` | ||||||
|
||||||
-- | ||||||
[options="header"] | ||||||
|==== | ||||||
| Property | Description | ||||||
|`min_capacity` | ||||||
| Valid values: Non-negative integer value. | ||||||
|
||||||
Default value: 0 | ||||||
|
||||||
User defined minimum number of words in units of data type size that the pipe | ||||||
must be able to store without any being read out. A minimum capacity is required | ||||||
in some algorithms to avoid deadlock, or for performance tuning. An | ||||||
implementation can include more capacity than this parameter, but not less. | ||||||
|
||||||
This property is not guaranteed to be respected if the pipe is an inter-kernel | ||||||
pipe. The compiler is allowed to optimize the pipe if both sides are visible. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand how this can be ignored for inter-kernel pipes. As you say, the algorithm may depend on a certain minimum capacity. In any event, isn't it always assumed that a compiler can optimize away code if it can prove that doing so has no effect? I don't think we need to say that in the spec if that's all we mean. |
||||||
|
||||||
|
||||||
|`ready_latency` | ||||||
| Valid values: Non-negative integer value. | ||||||
|
||||||
Default value: 0 | ||||||
|
||||||
The number of cycles between when the ready signal is deasserted and when the | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a looser definition than that in the avalon streaming specification. Specifically, the avalon spec also mandates that the upstream source cannot assert valid until ready_latency cycles after the downstream source has asserted ready. Is this intended? If we intend to mimic the avalon spec exactly should we change this to: "The ready_latency property of the stream as defined in the Avalon-RT specification"? |
||||||
pipe can no longer accept new inputs. | ||||||
|
||||||
This property is not guaranteed to be respected if the pipe is an inter-kernel | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we remove min_capacity from the list of properties than this applies to all properties. Perhaps we should factor it out of the table. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Regardless of whether or not we factor it out, I think we need to tweak this wording to make it clearer when the properties can be ignored. We could say something like: "On FPGA targets this property provides control over the hardware interface built for the pipe at the boundary of a device image. If both endpoints of this pipe reside in the same device_image then this property is meaningless and can be ignored. On non-FPGA targets this property is meaningless and can be ignored. " |
||||||
pipe. The compiler is allowed to optimize the pipe if both sides are visible. | ||||||
|
||||||
|`bits_per_symbol` | ||||||
| Valid values: A positive integer value that evenly divides by the data type size. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shouldn't this say:
Suggested change
My understanding is that you can have multiple symbols per element in the pipe but not multiple pipe elements per symbol. The most common case is that the bits_per_symbol is 8, i.e. one symbol is one byte. |
||||||
|
||||||
Default value: Datatype size | ||||||
|
||||||
Describes how the data is broken into symbols on the data bus. | ||||||
|
||||||
Data is broken down according to how you set the first_symbol_in_high_order_bits | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we need to mention first_symbol_in_high_order_bits here. It already has its own definition further down and mentioning it here doesn't make the definition of bits_per_symbol any more clear. If you extended this with an example that showed how the two properties interacted that might be helpful but, as is, this sentence contributes nothing. |
||||||
property. By default, data is broken down in little endian order. | ||||||
|
||||||
This property is not guaranteed to be respected if the pipe is an inter-kernel | ||||||
pipe. The compiler is allowed to optimize the pipe if both sides are visible. | ||||||
|
||||||
|`uses_valid` | ||||||
| Valid values: true or false | ||||||
|
||||||
Default value: true | ||||||
|
||||||
Controls whether a valid signal is present on the pipe interface. If false, the | ||||||
upstream source must provide valid data on every cycle that ready is asserted. | ||||||
|
||||||
This is equivalent to changing the pipe read calls to tryRead and assuming that | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There's no API named "tryRead"; the non-blocking version is also named "read". It merely has different function arguments. Also, the argument is named "success_code" not "success". Instead, let's say: "This is equivalent to changing the pipe read calls to non-blocking reads and assuming that success_code is always true." There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The current API is unclear and confusing. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||||||
success is always true. | ||||||
|
||||||
If set to false, min_capacity and ready_latency must be 0. | ||||||
|
||||||
This property is not guaranteed to be respected if the pipe is an inter-kernel | ||||||
pipe. The compiler is allowed to optimize the pipe if both sides are visible. | ||||||
|
||||||
|`uses_ready` | ||||||
| Valid values: true or false | ||||||
|
||||||
Default value: true | ||||||
|
||||||
Controls whether a ready signal is present. If false, the downstream sink must | ||||||
be able to accept data on every cycle that valid is asserted. This is | ||||||
equivalent to changing the pipe read calls to tryWrite and assuming that success | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
is always true. | ||||||
|
||||||
If set to false, ready_latency must be 0. | ||||||
|
||||||
This property is not guaranteed to be respected if the pipe is an inter-kernel | ||||||
pipe. The compiler is allowed to optimize the pipe if both sides are visible. | ||||||
|
||||||
|`in_csr` | ||||||
| Valid Values: true or false | ||||||
|
||||||
Default Value: false | ||||||
|
||||||
Controls whether the host pipe is implemented using the Control and Status register (CSR). | ||||||
|
||||||
This property is not guaranteed to be respected if the pipe is an inter-kernel | ||||||
pipe. The compiler is allowed to optimize the pipe if both sides are visible. | ||||||
|
||||||
|`first_symbol_in_high_order_bits` | ||||||
| Valid values: true or false | ||||||
|
||||||
Default value: false | ||||||
|
||||||
Specifies whether the data symbols in the pipe are in big-endian | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. By saying "the data symbols in the pipe" it sounds like we're referring to how symbols are ordered across the depth of the pipe, which is certainly not our intention. We should make it clear that this only refers to the ordering of symbols within a given word/transaction. Some alternatives are: ... the data symbols in each word written to/read from the pipe are ... |
||||||
order. | ||||||
|
||||||
This property is not guaranteed to be respected if the pipe is an inter-kernel | ||||||
pipe. The compiler is allowed to optimize the pipe if both sides are visible. | ||||||
|
||||||
|`protocol` | ||||||
| Specifies the protocol for the pipe interface. | ||||||
Currently, the only protocol supported is `avalon`. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems strange to me to have a property with only one legal value that is also the default. I understand we're trying to leave the door open for other protocols in the future, but if that happens we could add the property at that time. It also seems weird to present protocol as its own independent property that is orthogonal to the other properties when a bunch of the other properties (bits_per_symbol, first_symbol_in_high_order_bits and ready_latency) are properties of the avalon ST spec and don't necessarily apply to other interface protocols. For example, AXI doesn't seem to have an endianness property so first_symbol_in_high_order_bits doesn't make sense with that protocol. Not to mention "first_symbol_in_high_order_bits" is the name pulled directly from the avalon spec and if we were trying to be generic we would call it "endianness" instead. I see four possible solutions:
|
||||||
Other protocols may be supported in the future. | ||||||
|
||||||
|
||||||
The default protocol is `avalon`. | ||||||
|==== | ||||||
-- | ||||||
|
||||||
== Revision History | ||||||
|
||||||
[cols="5,15,15,70"] | ||||||
[grid="rows"] | ||||||
[options="header"] | ||||||
|======================================== | ||||||
|Rev|Date|Author|Changes | ||||||
|1|2022-03-18|Peter Colberg|*Initial public working draft* | ||||||
|======================================== | ||||||
|
||||||
//************************************************************************ | ||||||
//Other formatting suggestions: | ||||||
// | ||||||
//* Use *bold* text for host APIs, or [source] syntax highlighting. | ||||||
//* Use +mono+ text for device APIs, or [source] syntax highlighting. | ||||||
//* Use +mono+ text for extension names, types, or enum values. | ||||||
//* Use _italics_ for parameters. | ||||||
//************************************************************************ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding some general comments here, so the conversation can be threaded.
My question about breaking API compatibility in [SYCL][Doc] Update SYCL_INTEL_data_flow_pipes extension for FPGA host pipe support #5838 is clearly related to this PR also.
It's not clear to me why these properties should be in a separate extension. Why not add them to the existing "sycl_ext_intel_dataflow_pipes" extension?
Regardless of which extension they are in, we should define the behavior when these properties are used in a kernel that is submitted to a device that is not an FPGA. I think the two options are:
The
min_capacity
property does not seem specific to FPGA at all. Wouldn't this be relevant even if we implemented pipe on other devices?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gmlueck for the quick review!
@GarveyJoe, what are your thoughts on these and the other questions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API compatibility: I answered this in an email but I'll add it here for completeness: I think the solution is to keep min_capacity as a template parameter rather than pull it into the properties list. While this might not be how we would have written it if we were building this from scratch, this will allow us to avoid breaking existing code. min_capacity is sufficiently different from the other properties to warrant being separated from them. While the other properties merely control the interface of the pipes, min_capacity is a platform-agnostic property of the algorithm itself.
Separate extension: I'm not opposed to a single, combined extension.
Behaviour on non-FPGA targets: I prefer the 2nd option.
min_capacity: Greg is correct, min_capacity is different from the other properties. It can't be safely dropped as it is an algorithmic requirement needed to prevent deadlock. It must be respected on all targets.