Skip to content

[SYCL][DOC] Make some refactoring for group sorting spec #4666

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 17, 2021
Merged

[SYCL][DOC] Make some refactoring for group sorting spec #4666

merged 7 commits into from
Dec 17, 2021

Conversation

andreyfe1
Copy link
Contributor

@andreyfe1 andreyfe1 commented Sep 29, 2021

No changes for interfaces. Just some clarifications to avoid misunderstanding.

Following was done:

  1. Moved sorting functions before sorters.
  2. Added a comment for bubble_sort that this is not a recommended implementation for performance.
  3. Added a comment that users shouldn’t create their own Group Helper. group_with_scratchpad is enough.
  4. Changed the status for the Spec in the README.md.
  5. Made an entire extension experimental.

Signed-off-by: Fedorov, Andrey andrey.fedorov@intel.com

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>
@andreyfe1 andreyfe1 requested a review from a team as a code owner September 29, 2021 16:03
@andreyfe1
Copy link
Contributor Author

Thanks to @gmlueck for useful offline comments.

@andreyfe1
Copy link
Contributor Author

@gmlueck, could you please help to review changes?

@gmlueck
Copy link
Contributor

gmlueck commented Nov 1, 2021

|template<typename Group, typename Ptr> void operator()(Group g, Ptr first, Ptr last);
|Implements a sorting algorithm that calls by joint_sort.
Available only if sycl::is_group_v<std::decay_t<Group>> is true.
first, last must be the same for all work-items in the group.

Sorry, GitHub does not allow me to comment on lines you didn't change in this PR. The above is the description of operator() on the sorter class. I think the last sentence about first and last isn't worded the right way. This is not a requirement on the application. Rather, it is a guarantee that the implementation makes. I think a better wording would be like this:

The application defines this function to implement a sorting algorithm for joint_sort. The implementation calls this member function once from every work item in the group, passing the same values of first and last that were passed to joint_sort. The Group template parameter is set according to the matching call to joint_sort.

For the other overload of operator():

The application defines this function to implement a sorting algorithm for sort_over_group. The implementation calls this member function once from every work item in the group, passing the work item's value. The function must return the value corresponding to this work-item's position in the sorted range. The Group template parameter is set according to the matching call to sort_over_group.

@gmlueck
Copy link
Contributor

gmlueck commented Nov 1, 2021

   template<typename T>
   static constexpr std::size_t
   memory_required(sycl::memory_scope scope, std::size_t range_size);
   template<typename T, int dimensions>
   static constexpr std::size_t
   memory_required(sycl::memory_scope scope, sycl::range<dimensions> local_range);

I was wondering about the names of these member functions from default_sorter and radix_sorter. Do we think it will be clear to users about which one to use for joint_sort vs. sort_over_group? I wonder if it would be clearer to name them memory_required_joint_sort and memory_required_sort_over_group? Or maybe joint_sort_memory_size and sort_over_group_memory_size?

@gmlueck
Copy link
Contributor

gmlueck commented Nov 1, 2021

I found the description of "Group Helper" confusing at first. I think it would help to replace this introduction:

To pass additional memory to algorithms that don’t have the Sorter parameter SYCL introduces special type: group helper. It encapsulates a group and a memory.

Group helper must have following methods:

With:

The overloads of joint_sort and sort_over_group that do not take a Sorter parameter implicitly use the default sorter. Since the default sorter requires the application to allocate some temporary memory, the application must use a GroupHelper object to communicate the location of this memory. A GroupHelper object is any object that has the following two public member functions:

I also think the definition of get_memory could be clarified. This is what you have now:

/* unspecified */ get_memory() const

Returns the memory object that represents a memory handled by the group helper object.
A type of the returned value must be the same as the type of the default_sorter 's constructor
that passes an additional memory to default_sorter.

I don't understand why the return type here is unspecified. Doesn't the return type need to be sycl::span<std::byte, Extent>? Therefore, it seems like the description could be like this:

sycl::span<std::byte, Extent> get_memory() const

Returns the temporary memory object that the default sorter can use. The Extent must be defined according to the value returned from default_sorter::memory_required.

@gmlueck
Copy link
Contributor

gmlueck commented Nov 1, 2021

One more global comment. I think we should make this entire extension "experimental" and move all of the APIs into the "ext::oneapi::experimental" namespace. My reasoning is that it is almost impossible to use this API today in a non-experimental way. All of the overloads of joint_sort and sort_over_group that do not take a Sorter parameter require the application to pass a GroupHelper. However, the only predefined Group Helper is in the experimental namespace.

Moreover, if the application calls one of the joint_sort or sort_over_group overloads that does take a Sorter parameter, it will almost certainly pass one of the predefined sorters. However, all of the predefined sorters are also in the experimental namespace.

Therefore, I think that almost every application that uses this extension will end up using experimental features, so I don't see any value in calling some parts non-experimental. I think we should just say the whole thing is experimental until we are ready to make the predefined Group Helper and the predefined sorters non-experimental.

Copy link
Contributor

@gmlueck gmlueck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have started my comments by saying that I like the new organization a lot. I think it's much clearer now that joint_sort and sort_over_group are explained first.

I have a couple final comments below.

@andreyfe1
Copy link
Contributor Author

@gmlueck, thank you for your comments!

I think the last sentence about first and last isn't worded the right way. This is not a requirement on the application. Rather, it is a guarantee that the implementation makes.

Could you please clarify the difference between "requirement on the application" and "guarantee that the implementation makes"? Here we need to say users on the Spec level that they need to pass parameters that are same for all work-items in one work-group or sub-group. Sorry, I couldn't figure out why it's "a guarantee that the implementation makes". By the way, the same wording is for others Group algorithms, e.g. joint_reduce.

I was wondering about the names of these member functions from default_sorter and radix_sorter. Do we think it will be clear to users about which one to use for joint_sort vs. sort_over_group? I wonder if it would be clearer to name them memory_required_joint_sort and memory_required_sort_over_group? Or maybe joint_sort_memory_size and sort_over_group_memory_size?

It was already discussed here Names are too long and sort is already in sorter names. In user's code it can look like following:

default_sorter<Compare>::memory_required_sort_over_group(/*...*/);

I don't understand why the return type here is unspecified. Doesn't the return type need to be sycl::span<std::byte, Extent>?

The thing is we are not sure that each group helper should contain Extent as a template parameter. It's ok for group_with_scratchpad class, but we decided not to specify it for the Group Helper family.

One more global comment. I think we should make this entire extension "experimental" and move all of the APIs into the "ext::oneapi::experimental" namespace. My reasoning is that it is almost impossible to use this API today in a non-experimental way. All of the overloads of joint_sort and sort_over_group that do not take a Sorter parameter require the application to pass a GroupHelper. However, the only predefined Group Helper is in the experimental namespace.

I see your point. It looks like users need to pass any experimental thing when they call non-experimental function. However, joint_sort and sort_over_group are already implemented and if we move functions to experimental it can be an API breaking change.

@gmlueck
Copy link
Contributor

gmlueck commented Nov 3, 2021

Here we need to say users on the Spec level that they need to pass parameters that are same for all work-items in one work-group or sub-group. Sorry, I couldn't figure out why it's "a guarantee that the implementation makes".

Maybe I'm confused, but I thought it was the library that calls the custom sorter's operator() function. Is that right? The current wording:

first, last must be the same for all work-items in the group.

makes it sound like the application needs to pass the same values of first and last to each work-item. However, the application doesn't call this function at all, so this is not a requirement on the application. Rather, we want to say that the application can assume that the library does pass the same values of first and last to each work item.

@gmlueck
Copy link
Contributor

gmlueck commented Nov 3, 2021

It was already discussed here Names are too long and sort is already in sorter names.

OK, I see there was already a discussion about this. I guess this part of the API is experimental, so we will get feedback from customers about whether it is confusing.

@gmlueck
Copy link
Contributor

gmlueck commented Nov 3, 2021

The thing is we are not sure that each group helper should contain Extent as a template parameter. It's ok for group_with_scratchpad class, but we decided not to specify it for the Group Helper family.

But the spec says this currently:

A type of the returned value must be the same as the type of the default_sorter 's constructor that passes an additional memory to default_sorter.

There is only one constructor for default_sorter and the parameter "that passes an additional memory" has the type sycl::span<std::byte, Extent>. Therefore, it seems like the spec already says that the type of the return value is sycl::span<std::byte, Extent>, but it says this in a very non-obvious way.

@gmlueck
Copy link
Contributor

gmlueck commented Nov 3, 2021

I see your point. It looks like users need to pass any experimental thing when they call non-experimental function. However, joint_sort and sort_over_group are already implemented and if we move functions to experimental it can be an API breaking change.

Hmm. Have we made a DPC++ release with this support yet?

@andreyfe1
Copy link
Contributor Author

Maybe I'm confused, but I thought it was the library that calls the custom sorter's operator() function. Is that right?

Ah, by library you mean implementation? Then, you're right.

There is only one constructor for default_sorter and the parameter "that passes an additional memory" has the type sycl::span<std::byte, Extent>. Therefore, it seems like the spec already says that the type of the return value is sycl::span<std::byte, Extent>, but it says this in a very non-obvious way.

Correct. Will think about it.

Have we made a DPC++ release with this support yet?

As far as I know, no.

@andreyfe1
Copy link
Contributor Author

@gmlueck, I've addressed most of your comments.
Regarding /* unspecified */ get_memory() const
What if we say following?

/* unspecified */ get_memory() const
Returns the temporary memory object that the default sorter can use.

I'm imaging that get_memory() for some Group Helper can return std::pair<std::byte*, size_t>. It can be wrapped to sycl::span and passed to default_sorter ctor.
Another variant is to always return sycl::span<std::byte, sycl::dynamic_extent>.
Both variants help to avoid the Extent template parameter.

@gmlueck
Copy link
Contributor

gmlueck commented Nov 9, 2021

I'm imaging that get_memory() for some Group Helper can return std::pair<std::byte*, size_t>. It can be wrapped to sycl::span and passed to default_sorter ctor.
Another variant is to always return sycl::span<std::byte, sycl::dynamic_extent>.
Both variants help to avoid the Extent template parameter.

Maybe I'm confused about how the GroupHelper works. I thought it was the implementation of joint_sort() and sort_over_group() that called the GroupHelper get_memory() function. Is that not the case?

The application doesn't control the implementation of joint_sort() or sort_over_group(). These are part of the DPC++ implementation. When we (Intel) code the implementation of joint_sort() and sort_over_group() don't we need to know what the return type is for get_memory()? The logic will be completely different if the return type is std::pair<std::byte*, size_t> vs. sycl::span<std::byte, sycl::dynamic_extent>.

@andreyfe1
Copy link
Contributor Author

andreyfe1 commented Nov 9, 2021

I thought it was the implementation of joint_sort() and sort_over_group() that called the GroupHelper get_memory() function. Is that not the case?

You're right. The best variant is to mention explicitly that we require sycl::span here as default_sorter. If we change default_sorter then we need to change the get_memory() return type.

UPD: made corresponding changes

@andreyfe1
Copy link
Contributor Author

@gmlueck, @Pennycook, @dmitriy-sobolev, changes are applied. Feel free to review

@andreyfe1
Copy link
Contributor Author

@gmlueck, @Pennycook, feel free to approve if changes are good enough to be merged.

@gmlueck
Copy link
Contributor

gmlueck commented Nov 17, 2021

@gmlueck, @Pennycook, feel free to approve if changes are good enough to be merged.

I thought you were going to add some NOTE to the spec saying which features are not implemented yet? Someone reading that spec needs some way to know which features are safe to use.

@andreyfe1
Copy link
Contributor Author

@gmlueck, I think Release Notes for oneAPI/DPC++ releases is a better place to mention what is implemented. In my opinion everything that is related to the specific release should be mentioned in Release Notes. So, users will also know which oneAPI/DPC++ Compiler release supports features form the Spec. It also helps to not mix specification and implementation in one document.

@bader bader requested a review from Pennycook December 17, 2021 08:27
@bader bader changed the title [SYCL][DOC] Some refactoring for Group sorting Spec [SYCL][DOC] Make some refactoring for group sorting spec Dec 17, 2021
@bader bader merged commit f1c3506 into intel:sycl Dec 17, 2021
bader pushed a commit that referenced this pull request Dec 22, 2021
…5169)

Changes for the spec: #4666
Test: intel/llvm-test-suite#633

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>
bader pushed a commit to intel/llvm-test-suite that referenced this pull request Dec 22, 2021
Spec changes: intel/llvm#4666
Implementation changes: intel/llvm#5169

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Mar 27, 2023
…tel/llvm-test-suite#633)

Spec changes: intel#4666
Implementation changes: intel#5169

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants