-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][Doc] Add SYCL_INTEL_FPGA_data_flow_pipes_properties extension #5839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][Doc] Add SYCL_INTEL_FPGA_data_flow_pipes_properties extension #5839
Conversation
See also #5838 Signed-off-by: Peter Colberg <peter.colberg@intel.com>
f838d7a
to
b6a268b
Compare
@@ -0,0 +1,264 @@ | |||
= SYCL_INTEL_FPGA_data_flow_pipes_properties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding some general comments here, so the conversation can be threaded.
-
My question about breaking API compatibility in [SYCL][Doc] Update SYCL_INTEL_data_flow_pipes extension for FPGA host pipe support #5838 is clearly related to this PR also.
-
It's not clear to me why these properties should be in a separate extension. Why not add them to the existing "sycl_ext_intel_dataflow_pipes" extension?
-
Regardless of which extension they are in, we should define the behavior when these properties are used in a kernel that is submitted to a device that is not an FPGA. I think the two options are:
- These properties are an "optional kernel feature" as defined in section 5.7 of the core SYCL spec. In that case, submitting a kernel that uses these properties to a non-FPGA device would throw a synchronous exception at the time you submit the kernel.
- These properties are defined to be ignored when the kernel is submitted to a non-FPGA device. This is similar to what we did for the device global properties.
-
The
min_capacity
property does not seem specific to FPGA at all. Wouldn't this be relevant even if we implemented pipe on other devices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gmlueck for the quick review!
@GarveyJoe, what are your thoughts on these and the other questions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API compatibility: I answered this in an email but I'll add it here for completeness: I think the solution is to keep min_capacity as a template parameter rather than pull it into the properties list. While this might not be how we would have written it if we were building this from scratch, this will allow us to avoid breaking existing code. min_capacity is sufficiently different from the other properties to warrant being separated from them. While the other properties merely control the interface of the pipes, min_capacity is a platform-agnostic property of the algorithm itself.
Separate extension: I'm not opposed to a single, combined extension.
Behaviour on non-FPGA targets: I prefer the 2nd option.
min_capacity: Greg is correct, min_capacity is different from the other properties. It can't be safely dropped as it is an algorithmic requirement needed to prevent deadlock. It must be respected on all targets.
implementation can include more capacity than this parameter, but not less. | ||
|
||
This property is not guaranteed to be respected if the pipe is an inter-kernel | ||
pipe. The compiler is allowed to optimize the pipe if both sides are visible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how this can be ignored for inter-kernel pipes. As you say, the algorithm may depend on a certain minimum capacity.
In any event, isn't it always assumed that a compiler can optimize away code if it can prove that doing so has no effect? I don't think we need to say that in the spec if that's all we mean.
sycl/doc/extensions/proposed/sycl_ext_intel_fpga_pipes_properties.asciidoc
Outdated
Show resolved
Hide resolved
struct in_csr { | ||
template<bool Csr> | ||
using value_t = property_value<in_csr, std::integral_constant<bool, Csr>>; | ||
}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we potentially use the same in_csr property defined in device global? Given the behaviour will be similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that mean that the device_global extension needs to be enabled to use this property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could define the property in both extensions, so you could use it with either one. You would not need to enable the device global extension just to use the property.
Note that there is a proposal to rename the "implement_in_csr" property like this:
enum class hw_interface_enum : /* unspecified */ {
csr,
dedicated
};
struct hw_interface_key {
template<hw_interface_enum Interface>
using value_t =
property_value<hw_interface_key, std::integral_constant<int, Interface>>;
};
template <hw_interface_enum Interface>
inline constexpr hw_interface_key::value_t<Interface> hw_interface;
inline constexpr hw_interface_key::value_t<hw_interface_enum::csr>
hw_interface_csr;
inline constexpr hw_interface_key::value_t<hw_interface_enum::dedicated>
hw_interface_dedicated;
Does that new spelling still make sense for pipes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense, and it also makes this property more extensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "dedicated" isn't quite appropriate for host pipes. We can and will share a single hardware pipe among multiple logical pipes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing comes to mind immediately (but that doesn't mean there isn't one that makes sense for both). @GarveyJoe do you have any ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding myself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"conduit" or "streaming" could both work. Alternatively, we could go back to it being a bool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see either of those for pipe, but it's less clear to me they're descriptive of device_global (@artemrad ?) For bool, how confident are we won't eventually want another option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should in_csr be a separate protocol? It isn't an avalon-st interface anymore. It's closer to an avalon-mm interface. Some of the protocol-dependent properties don't even make sense for an "in_csr" pipe such as ready_latency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting!
A minor remark: do not use uppercase like AXI
and AVALON
in the API but just the lowercase.
Otherwise, what to do if the programmer writes:
// Use the AXI protocol in the following interface
#define AXI 1
and then the havoc happens. And we all know how FPGA programmers love low-level C and preprocessor. ;-)
FYI, there is a pull request suggesting a new guideline for naming various things - #5843. Feel free to contribute your thoughts there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting!
A minor remark: do not use uppercase like AXI
and AVALON
in the API but just the lowercase.
Otherwise, what to do if the programmer writes:
// Use the AXI protocol in the following interface
#define AXI 1
and then the havoc happens. And we all know how FPGA programmers love low-level C and preprocessor. ;-)
5e70824
to
713eb4d
Compare
Defines new properties for data flow pipes Properties defined to align with spec in [1] Existing data flow pipe defined in [2] data flow pipe will be accepting new property list, spec in [3] [1] intel#5839 [2] https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_intel_dataflow_pipes.asciidoc [3] intel#5838
What is this for: pipes expose the concept of a first in first out buffer, this FIFO construct provide link between elements of a design that are accessed through read/write/push/pop APIs. A host pipe is a pipe that links a device kernel with host program. This extension is framed from FPGA perspective. This change add required interface for the integration footer to register the `host_pipe` of a program as well as reading extended info supplied through "SYCL/host pipes" property. Info is stored in a map managed by program manager. The integration header and footer provides a mapping from the host address of each pipe variable to the unique string for that variable. This is required so that sycl runtime can query the pipe address from the given pipe name, and pass both into opencl runtime function calls. Opencl defines pipes, which are FIFO constructs that are consistent with Khronos specification. Spec link: intel#5838 List of suported properties: intel#5839 Note: it is the first change to runtime relating to host_pipe, thus the feature is not complete / fully testable. It is intended to add an interface for integration footer as well as consumer for the information sycl-post-link will be generating when future work is added.
The number of cycles between when the ready signal is deasserted and when the | ||
pipe can no longer accept new inputs. | ||
|
||
This property is not guaranteed to be respected if the pipe is an inter-kernel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove min_capacity from the list of properties than this applies to all properties. Perhaps we should factor it out of the table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regardless of whether or not we factor it out, I think we need to tweak this wording to make it clearer when the properties can be ignored. We could say something like:
"On FPGA targets this property provides control over the hardware interface built for the pipe at the boundary of a device image. If both endpoints of this pipe reside in the same device_image then this property is meaningless and can be ignored. On non-FPGA targets this property is meaningless and can be ignored. "
Controls whether a valid signal is present on the pipe interface. If false, the | ||
upstream source must provide valid data on every cycle that ready is asserted. | ||
|
||
This is equivalent to changing the pipe read calls to tryRead and assuming that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no API named "tryRead"; the non-blocking version is also named "read". It merely has different function arguments. Also, the argument is named "success_code" not "success". Instead, let's say:
"This is equivalent to changing the pipe read calls to non-blocking reads and assuming that success_code is always true."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current API is unclear and confusing.
I think try_read
and try_write
should be introduced, as suggested by #832 and the cited PR comments there.
But this is a long story, waiting for 3 years already. We have not implemented it on AMD FPGA yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
Controls whether a ready signal is present. If false, the downstream sink must | ||
be able to accept data on every cycle that valid is asserted. This is | ||
equivalent to changing the pipe read calls to tryWrite and assuming that success |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
equivalent to changing the pipe read calls to tryWrite and assuming that success | |
equivalent to changing the pipe write calls to non-blocking writes and assuming that success_code |
pipe. The compiler is allowed to optimize the pipe if both sides are visible. | ||
|
||
|`bits_per_symbol` | ||
| Valid values: A positive integer value that evenly divides by the data type size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this say:
| Valid values: A positive integer value that evenly divides by the data type size. | |
| Valid values: A positive integer value that evenly divides the data type size. |
My understanding is that you can have multiple symbols per element in the pipe but not multiple pipe elements per symbol. The most common case is that the bits_per_symbol is 8, i.e. one symbol is one byte.
|
||
Describes how the data is broken into symbols on the data bus. | ||
|
||
Data is broken down according to how you set the first_symbol_in_high_order_bits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to mention first_symbol_in_high_order_bits here. It already has its own definition further down and mentioning it here doesn't make the definition of bits_per_symbol any more clear. If you extended this with an example that showed how the two properties interacted that might be helpful but, as is, this sentence contributes nothing.
|
||
|`protocol` | ||
| Specifies the protocol for the pipe interface. | ||
Currently, the only protocol supported is `avalon`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems strange to me to have a property with only one legal value that is also the default. I understand we're trying to leave the door open for other protocols in the future, but if that happens we could add the property at that time.
It also seems weird to present protocol as its own independent property that is orthogonal to the other properties when a bunch of the other properties (bits_per_symbol, first_symbol_in_high_order_bits and ready_latency) are properties of the avalon ST spec and don't necessarily apply to other interface protocols. For example, AXI doesn't seem to have an endianness property so first_symbol_in_high_order_bits doesn't make sense with that protocol. Not to mention "first_symbol_in_high_order_bits" is the name pulled directly from the avalon spec and if we were trying to be generic we would call it "endianness" instead.
I see four possible solutions:
- Drop the protocol property completely and accept that for now all our properties are avalon-centric. This is the least work now but might require the most refactoring in the future to support other protocols.
- Make the protocol-dependent properties part of the protocol property. This way different protocols can have different sub-properties as determined by their various protocol specifications.
- Try to come up with generic names for every property that will apply to any possible future protocol. I think this is doomed to failure (see the endianness AXI interfaction).
- Keep separate properties like we have in this PR but add extra restrictions to the protocol-dependent properties that say: "this property is only valid if specified in conjunction with the protocol property"
|
||
Default value: 0 | ||
|
||
The number of cycles between when the ready signal is deasserted and when the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a looser definition than that in the avalon streaming specification. Specifically, the avalon spec also mandates that the upstream source cannot assert valid until ready_latency cycles after the downstream source has asserted ready. Is this intended? If we intend to mimic the avalon spec exactly should we change this to: "The ready_latency property of the stream as defined in the Avalon-RT specification"?
|
||
Default value: false | ||
|
||
Specifies whether the data symbols in the pipe are in big-endian |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By saying "the data symbols in the pipe" it sounds like we're referring to how symbols are ordered across the depth of the pipe, which is certainly not our intention. We should make it clear that this only refers to the ordering of symbols within a given word/transaction. Some alternatives are:
... the data symbols in each word written to/read from the pipe are ...
... the data symbols in each transaction are ...
... the data symbols in the data channel of the pipe are ...
The first_symbol_in_high_order_bits property of the stream as defined in the Avalon-ST specification.
Defines new properties for data flow pipes Properties defined to align with spec in [1] Existing data flow pipe defined in [2] data flow pipe will be accepting new property list, spec in [3] [1] intel#5839 [2] https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_intel_dataflow_pipes.asciidoc [3] intel#5838
See also #5838
Cc: @mkinsner @GarveyJoe @aditikum @rho180 @zibaiwan