Skip to content

[SYCL][Doc] Add SYCL_INTEL_FPGA_data_flow_pipes_properties extension #5839

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

pcolberg
Copy link
Contributor

@pcolberg pcolberg commented Mar 19, 2022

See also #5838

Signed-off-by: Peter Colberg <peter.colberg@intel.com>
@pcolberg pcolberg requested a review from a team as a code owner March 19, 2022 01:42
@pcolberg pcolberg marked this pull request as draft March 19, 2022 01:43
@pcolberg pcolberg force-pushed the sycl_ext_intel_fpga_data_flow_pipes_properties branch 3 times, most recently from f838d7a to b6a268b Compare March 19, 2022 16:54
@@ -0,0 +1,264 @@
= SYCL_INTEL_FPGA_data_flow_pipes_properties
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some general comments here, so the conversation can be threaded.

  • My question about breaking API compatibility in [SYCL][Doc] Update SYCL_INTEL_data_flow_pipes extension for FPGA host pipe support #5838 is clearly related to this PR also.

  • It's not clear to me why these properties should be in a separate extension. Why not add them to the existing "sycl_ext_intel_dataflow_pipes" extension?

  • Regardless of which extension they are in, we should define the behavior when these properties are used in a kernel that is submitted to a device that is not an FPGA. I think the two options are:

    • These properties are an "optional kernel feature" as defined in section 5.7 of the core SYCL spec. In that case, submitting a kernel that uses these properties to a non-FPGA device would throw a synchronous exception at the time you submit the kernel.
    • These properties are defined to be ignored when the kernel is submitted to a non-FPGA device. This is similar to what we did for the device global properties.
  • The min_capacity property does not seem specific to FPGA at all. Wouldn't this be relevant even if we implemented pipe on other devices?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gmlueck for the quick review!

@GarveyJoe, what are your thoughts on these and the other questions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API compatibility: I answered this in an email but I'll add it here for completeness: I think the solution is to keep min_capacity as a template parameter rather than pull it into the properties list. While this might not be how we would have written it if we were building this from scratch, this will allow us to avoid breaking existing code. min_capacity is sufficiently different from the other properties to warrant being separated from them. While the other properties merely control the interface of the pipes, min_capacity is a platform-agnostic property of the algorithm itself.

Separate extension: I'm not opposed to a single, combined extension.

Behaviour on non-FPGA targets: I prefer the 2nd option.

min_capacity: Greg is correct, min_capacity is different from the other properties. It can't be safely dropped as it is an algorithmic requirement needed to prevent deadlock. It must be respected on all targets.

implementation can include more capacity than this parameter, but not less.

This property is not guaranteed to be respected if the pipe is an inter-kernel
pipe. The compiler is allowed to optimize the pipe if both sides are visible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how this can be ignored for inter-kernel pipes. As you say, the algorithm may depend on a certain minimum capacity.

In any event, isn't it always assumed that a compiler can optimize away code if it can prove that doing so has no effect? I don't think we need to say that in the spec if that's all we mean.

Comment on lines +137 to +141
struct in_csr {
template<bool Csr>
using value_t = property_value<in_csr, std::integral_constant<bool, Csr>>;
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we potentially use the same in_csr property defined in device global? Given the behaviour will be similar

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't that mean that the device_global extension needs to be enabled to use this property?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could define the property in both extensions, so you could use it with either one. You would not need to enable the device global extension just to use the property.

Note that there is a proposal to rename the "implement_in_csr" property like this:

enum class hw_interface_enum : /* unspecified */ {
  csr,
  dedicated
};

struct hw_interface_key {
  template<hw_interface_enum Interface>
  using value_t =
      property_value<hw_interface_key, std::integral_constant<int, Interface>>;
};

template <hw_interface_enum Interface>
inline constexpr hw_interface_key::value_t<Interface> hw_interface;
inline constexpr hw_interface_key::value_t<hw_interface_enum::csr>
    hw_interface_csr;
inline constexpr hw_interface_key::value_t<hw_interface_enum::dedicated>
    hw_interface_dedicated;

Does that new spelling still make sense for pipes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense, and it also makes this property more extensible.

Copy link
Contributor

@rho180 rho180 Mar 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "dedicated" isn't quite appropriate for host pipes. We can and will share a single hardware pipe among multiple logical pipes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing comes to mind immediately (but that doesn't mean there isn't one that makes sense for both). @GarveyJoe do you have any ideas?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding myself

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"conduit" or "streaming" could both work. Alternatively, we could go back to it being a bool.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see either of those for pipe, but it's less clear to me they're descriptive of device_global (@artemrad ?) For bool, how confident are we won't eventually want another option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should in_csr be a separate protocol? It isn't an avalon-st interface anymore. It's closer to an avalon-mm interface. Some of the protocol-dependent properties don't even make sense for an "in_csr" pipe such as ready_latency.

Copy link
Contributor

@keryell keryell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting!
A minor remark: do not use uppercase like AXI and AVALONin the API but just the lowercase.
Otherwise, what to do if the programmer writes:

// Use the AXI protocol in the following interface
#define AXI 1

and then the havoc happens. And we all know how FPGA programmers love low-level C and preprocessor. ;-)

@bader
Copy link
Contributor

bader commented Mar 22, 2022

Interesting! A minor remark: do not use uppercase like AXI and AVALONin the API but just the lowercase. Otherwise, what to do if the programmer writes:

// Use the AXI protocol in the following interface
#define AXI 1

and then the havoc happens. And we all know how FPGA programmers love low-level C and preprocessor. ;-)

FYI, there is a pull request suggesting a new guideline for naming various things - #5843. Feel free to contribute your thoughts there.

Copy link
Contributor

@keryell keryell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting!
A minor remark: do not use uppercase like AXI and AVALONin the API but just the lowercase.
Otherwise, what to do if the programmer writes:

// Use the AXI protocol in the following interface
#define AXI 1

and then the havoc happens. And we all know how FPGA programmers love low-level C and preprocessor. ;-)

@pcolberg pcolberg force-pushed the sycl_ext_intel_fpga_data_flow_pipes_properties branch from 5e70824 to 713eb4d Compare March 22, 2022 18:56
sherry-yuan added a commit to sherry-yuan/llvm that referenced this pull request Mar 25, 2022
Defines new properties for data flow pipes

Properties defined to align with spec in [1]

Existing data flow pipe defined in [2]

data flow pipe will be accepting new property list, spec in [3]

[1] intel#5839
[2] https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_intel_dataflow_pipes.asciidoc
[3] intel#5838
@pcolberg pcolberg self-assigned this Mar 25, 2022
sherry-yuan added a commit to sherry-yuan/llvm that referenced this pull request Mar 28, 2022
What is this for: pipes expose the concept of a first in first out buffer,
 this FIFO construct provide link between elements of a design that are
accessed through read/write/push/pop APIs.
A host pipe is a pipe that links a device kernel with host program.
This extension is framed from FPGA perspective.

This change add required interface for the integration footer to register
the `host_pipe` of a program as well as reading extended info supplied
through "SYCL/host pipes" property.
Info is stored in a map managed by program manager.

The integration header and footer provides a mapping from the host address of
each pipe variable to the unique string for that variable.
This is required so that sycl runtime can query the pipe address from the
given pipe name, and pass both into opencl runtime function calls.

Opencl defines pipes, which are FIFO constructs that are consistent with Khronos specification.

Spec link: intel#5838
List of suported properties: intel#5839

Note: it is the first change to runtime relating to host_pipe,
thus the feature is not complete / fully testable.
It is intended to add an interface for integration footer as well as
consumer for the information sycl-post-link will be generating when
future work is added.
The number of cycles between when the ready signal is deasserted and when the
pipe can no longer accept new inputs.

This property is not guaranteed to be respected if the pipe is an inter-kernel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove min_capacity from the list of properties than this applies to all properties. Perhaps we should factor it out of the table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of whether or not we factor it out, I think we need to tweak this wording to make it clearer when the properties can be ignored. We could say something like:

"On FPGA targets this property provides control over the hardware interface built for the pipe at the boundary of a device image. If both endpoints of this pipe reside in the same device_image then this property is meaningless and can be ignored. On non-FPGA targets this property is meaningless and can be ignored. "

Controls whether a valid signal is present on the pipe interface. If false, the
upstream source must provide valid data on every cycle that ready is asserted.

This is equivalent to changing the pipe read calls to tryRead and assuming that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no API named "tryRead"; the non-blocking version is also named "read". It merely has different function arguments. Also, the argument is named "success_code" not "success". Instead, let's say:

"This is equivalent to changing the pipe read calls to non-blocking reads and assuming that success_code is always true."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current API is unclear and confusing.
I think try_read and try_write should be introduced, as suggested by #832 and the cited PR comments there.
But this is a long story, waiting for 3 years already. We have not implemented it on AMD FPGA yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keryell, for simplicity I don't want to combine this change with a possible change to the non-blocking API, but let's discuss this further in #832.


Controls whether a ready signal is present. If false, the downstream sink must
be able to accept data on every cycle that valid is asserted. This is
equivalent to changing the pipe read calls to tryWrite and assuming that success
Copy link
Contributor

@GarveyJoe GarveyJoe Apr 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
equivalent to changing the pipe read calls to tryWrite and assuming that success
equivalent to changing the pipe write calls to non-blocking writes and assuming that success_code

pipe. The compiler is allowed to optimize the pipe if both sides are visible.

|`bits_per_symbol`
| Valid values: A positive integer value that evenly divides by the data type size.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this say:

Suggested change
| Valid values: A positive integer value that evenly divides by the data type size.
| Valid values: A positive integer value that evenly divides the data type size.

My understanding is that you can have multiple symbols per element in the pipe but not multiple pipe elements per symbol. The most common case is that the bits_per_symbol is 8, i.e. one symbol is one byte.


Describes how the data is broken into symbols on the data bus.

Data is broken down according to how you set the first_symbol_in_high_order_bits
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to mention first_symbol_in_high_order_bits here. It already has its own definition further down and mentioning it here doesn't make the definition of bits_per_symbol any more clear. If you extended this with an example that showed how the two properties interacted that might be helpful but, as is, this sentence contributes nothing.


|`protocol`
| Specifies the protocol for the pipe interface.
Currently, the only protocol supported is `avalon`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange to me to have a property with only one legal value that is also the default. I understand we're trying to leave the door open for other protocols in the future, but if that happens we could add the property at that time.

It also seems weird to present protocol as its own independent property that is orthogonal to the other properties when a bunch of the other properties (bits_per_symbol, first_symbol_in_high_order_bits and ready_latency) are properties of the avalon ST spec and don't necessarily apply to other interface protocols. For example, AXI doesn't seem to have an endianness property so first_symbol_in_high_order_bits doesn't make sense with that protocol. Not to mention "first_symbol_in_high_order_bits" is the name pulled directly from the avalon spec and if we were trying to be generic we would call it "endianness" instead.

I see four possible solutions:

  1. Drop the protocol property completely and accept that for now all our properties are avalon-centric. This is the least work now but might require the most refactoring in the future to support other protocols.
  2. Make the protocol-dependent properties part of the protocol property. This way different protocols can have different sub-properties as determined by their various protocol specifications.
  3. Try to come up with generic names for every property that will apply to any possible future protocol. I think this is doomed to failure (see the endianness AXI interfaction).
  4. Keep separate properties like we have in this PR but add extra restrictions to the protocol-dependent properties that say: "this property is only valid if specified in conjunction with the protocol property"


Default value: 0

The number of cycles between when the ready signal is deasserted and when the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a looser definition than that in the avalon streaming specification. Specifically, the avalon spec also mandates that the upstream source cannot assert valid until ready_latency cycles after the downstream source has asserted ready. Is this intended? If we intend to mimic the avalon spec exactly should we change this to: "The ready_latency property of the stream as defined in the Avalon-RT specification"?


Default value: false

Specifies whether the data symbols in the pipe are in big-endian
Copy link
Contributor

@GarveyJoe GarveyJoe Apr 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By saying "the data symbols in the pipe" it sounds like we're referring to how symbols are ordered across the depth of the pipe, which is certainly not our intention. We should make it clear that this only refers to the ordering of symbols within a given word/transaction. Some alternatives are:

... the data symbols in each word written to/read from the pipe are ...
... the data symbols in each transaction are ...
... the data symbols in the data channel of the pipe are ...
The first_symbol_in_high_order_bits property of the stream as defined in the Avalon-ST specification.

sherry-yuan added a commit to sherry-yuan/llvm that referenced this pull request Apr 21, 2022
Defines new properties for data flow pipes

Properties defined to align with spec in [1]

Existing data flow pipe defined in [2]

data flow pipe will be accepting new property list, spec in [3]

[1] intel#5839
[2] https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_intel_dataflow_pipes.asciidoc
[3] intel#5838
@pcolberg pcolberg removed their assignment Aug 10, 2022
@github-actions github-actions bot added the Stale label Feb 7, 2023
@github-actions github-actions bot closed this Mar 9, 2023
@pcolberg pcolberg deleted the sycl_ext_intel_fpga_data_flow_pipes_properties branch March 15, 2025 03:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants