[DOC] Extend group sort algorithms with interfaces to scratchpad memory #3989

andreyfe1 · 2021-06-24T15:45:22Z

Previously it was found that we need an additional local memory for more performant implementation. The proposal fixes it.
Previous version of proposal was discussed here: #3514
Design review: #3754

Signed-off-by: Fedorov, Andrey andrey.fedorov@intel.com

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

andreyfe1 · 2021-06-24T15:50:22Z

Questions to discuss:

Here I propose additional arguments near the Group (https://github.com/intel/llvm/pull/3989/files#diff-5f58a59c39f7c062697cd495199320cc1b8f9d6823834cf61ba37fae5347feb6R295). However, there was another suggestion to use policy-like object for such purpose. See details: [SYCL] [DOC] Group sorting algorithm design review #3754 (comment) Let's think what is better?
What if memory provided by users is not enough? Here I suggest UB (https://github.com/intel/llvm/pull/3989/files#diff-5f58a59c39f7c062697cd495199320cc1b8f9d6823834cf61ba37fae5347feb6R319). Any other opinions?

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

andreyfe1 · 2021-06-28T09:45:33Z

Let's also think about names for memory helper. Can we imagine better names?
Also I suggest to think about semantics for such helpers within the default sorter. It's marked as unspecified since implementation can potentially require more information to make a decision.

Inviting @Pennycook, @gmlueck, @akukanov to the discussion of the PR

Pennycook · 2021-06-28T14:09:04Z

Here I propose additional arguments near the Group (https://github.com/intel/llvm/pull/3989/files#diff-5f58a59c39f7c062697cd495199320cc1b8f9d6823834cf61ba37fae5347feb6R295). However, there was another suggestion to use policy-like object for such purpose. See details: [SYCL] [DOC] Group sorting algorithm design review #3754 (comment) Let's think what is better?

I'm still in favor of a policy-like object, because I think it's more aligned with the executors proposal. Looking forward, I'd expect all C++ algorithms (including SYCL algorithms) to accept a single argument encapsulating all information about the resources that an algorithm can use for execution.

A policy-like object also scales better in terms of the number of overloads and cognitive burden: if it turns out that we need to pass something else to the algorithm as well (e.g. that the group has a known fixed size and that it can use some scratch space) we can modify the policy rather than adding more arguments to the algorithm.

What if memory provided by users is not enough? Here I suggest UB (https://github.com/intel/llvm/pull/3989/files#diff-5f58a59c39f7c062697cd495199320cc1b8f9d6823834cf61ba37fae5347feb6R319). Any other opinions?

UB seems reasonable to me. We could add a non-normative note saying that this may change in future revisions of the specification, if you want to keep the discussion going. I don't think we can really have good error behavior unless the core SYCL specification introduces a mechanism to handle run-time errors on the device.

Let's also think about names for memory helper. Can we imagine better names?

I think something like required_memory would be clearer. You could use exactly the same name if we use overloads:

template</*unspecified*/>
static std::size_t memory_required(std::size_t range); // memory required for joint_sort on range

template</*unspecified*/>
static std::size_t memory_required(Group g, sycl::range<D> local_range); // memory required for sort_over_group on group with specified local range

andreyfe1 · 2021-06-29T11:02:39Z

@Pennycook, thanks for the opinion. I see your point. Seems, it's better to speak about executor-like object, not policy-like as an alternative for adding arguments to functions. From my point of view policy is often light-weight object that doesn't contain everything that algorithms need.

I like the idea with overloading for memory_required. Here we introduce 1 new name, not 2 new names. Regarding the memory function for sort_over_group. We need the work group size. The function needs to be called on host to create an accessor (it's not possible in kernels). So, Group g is unavailable at host. What is a better way to say that we need the local size that goes to nd_range creation? I'd suggest following:

template</*unspecified*/>
static std::size_t memory_required(std::size_t range); // memory required for joint_sort on range

template</*unspecified*/>
static std::size_t memory_required(sycl::range<dim> local_range); // memory required for sort_over_group on groups with specified local range

Just to consider other names for function to have a choice: get_temp_memory_size, get_local_memory_size, get_local_mem_size

Pennycook · 2021-06-29T14:18:08Z

Seems, it's better to speak about executor-like object, not policy-like as an alternative for adding arguments to functions. From my point of view policy is often light-weight object that doesn't contain everything that algorithms need.

That's a good point. I'll say executor-like going forward.

The function needs to be called on host to create an accessor (it's not possible in kernels). So, Group g is unavailable at host. What is a better way to say that we need the local size that goes to nd_range creation? I'd suggest following:

Ah, I'd forgotten about that. I think we can't just remove the argument completely, unfortunately -- some implementations may need the user to provide some additional memory for work-groups and sub-groups (or maybe even for other group types introduced by other extensions). Saying this now, I realize that we might need the group argument for joint_sort too.

Would it make sense to add the group type as a template argument?

template<typename Group, /*unspecified*/>
static std::size_t memory_required(std::size_t range); // memory required for joint_sort on range

template<typename Group, /*unspecified*/>
static std::size_t memory_required(sycl::range<dim> local_range); // memory required for sort_over_group on groups with specified local range

andreyfe1 · 2021-06-29T15:05:27Z

It makes sense. Implementation for sycl::group and sycl::sub_group may differ and may require different amount of memory. However, if we have it's as an additional template parameter, how we can specify it for group and sub-group since partial specialization for functions doesn't work? I'd suggest following:

template<typename Group, /*unspecified*/>
static std::size_t memory_required(Group, std::size_t range); // memory required for joint_sort on range

template<typename Group, /*unspecified*/>
static std::size_t memory_required(Group, sycl::range<dim> local_range); // memory required for sort_over_group on groups with specified local range

Thus, we can specify it as following

template</*unspecified*/>
static std::size_t memory_required(sycl::group, std::size_t range); // for sycl::group

template</*unspecified*/>
static std::size_t memory_required(sycl::sub_group, std::size_t range); // for sycl::sub_group

Pennycook · 2021-06-29T15:16:45Z

However, if we have it's as an additional template parameter, how we can specify it for group and sub-group since partial specialization for functions doesn't work? I'd suggest following:

You're right. Sorry, I always forget that.

Doesn't what you've written there require sycl::group and sycl::sub_group to be constructible on the host? I'm worried about exposing a default constructor for those types: it would enable the member functions of the groups to be called on the host, and it would enable the groups to be constructed on the device without an nd_item. Both of those are problematic.

I can see two solutions, neither of which is perfect:

Make the helper into a type instead!
Use the memory_scope enum: memory_required(memory_scope::work_group, range<>)

andreyfe1 · 2021-06-29T15:23:06Z

I see, it's the same trap with Group availability on device only. Since we need group argument just a tag, we can use memory_scope. What about group types introduced by other extensions? Will it work? Or to have a class instead of function is the only way for such group types?

Pennycook · 2021-06-29T15:33:35Z

I think using memory_scope is okay even for extensions, if we write the specification clearly.

One way to interpret the memory_scope argument here is that instead of describing the precise group type, it describes the capabilities of the group and the way that work-items in that group talk to one another. The amount of memory returned by memory_scope::work_group should be suitable for any generic group type representing a subset of work-items in the same work-group, because all of the work-items in such a group could communicate via work-group memory.

The designers of future extensions would then have two options when introducing a new group type:

Require developers to use the next broadest memory_scope (e.g. a subset of work-items in a sub-group should use memory_scope::sub_group, a set of work-items spanning multiple work-groups should use memory_scope::device).
Extend the memory_scope enum with a new scope associated with the new group.

FWIW, I think 1. is cleaner, but we don't have to solve that problem here.

All that said... Are there any other arguments to memory_required that you can think of that might require us to adopt a class design for other reasons?

andreyfe1 · 2021-06-29T16:21:51Z

Are there any other arguments to memory_required that you can think of that might require us to adopt a class design for other reasons?

As I checked. no. We have difference by group or sub-group, joint_ or _over_group, sorter. I don't see more

Pennycook · 2021-06-29T16:29:03Z

As I checked. no. We have difference by group or sub-group, joint_ or _over_group, sorter. I don't see more

Ok, great. Then I think memory_scope is a good solution for now.

andreyfe1 · 2021-06-29T16:31:53Z

Great! I'll try to make a prototype for executor-like objects then we'll see how it looks like. Any suggestions for the name of a type are appreciated

Pennycook · 2021-06-29T17:18:42Z

Great! I'll try to make a prototype for executor-like objects then we'll see how it looks like. Any suggestions for the name of a type are appreciated

The only suggestion I have is to use _executor as a suffix, which would be consistent with the example executors in the executor proposal. group_executor might be a good placeholder name, and we can bikeshed later?

andreyfe1 · 2021-06-30T15:17:48Z

Added "executor-like" object to the Spec and modified functions for temporary memory calculation. Please, take a look how it looks like now. I've added group executors only for algorithms' overloads that doesn't have Sorter parameter since we can pass memory directly to Sorter objects.

akukanov · 2021-07-02T13:09:26Z

Having the exact number and types of parameters for default_sorter methods unspecified defeats the purpose of this class: it's there for portability across implementations/platforms, and you essentially suggest that a part of its definition is implementation specific. That makes no sense to me; so I believe the memory helpers for default_sorter must work with whatever minimal information is considered common.

The problem I see with standalone memory helper functions is that those are proposed specifically for sorting, which seemingly prevents using/overloading those to query memory requirements of other algorithms. The approach with executor-like objects is preferred on the ground of extensibility for other algorithms, so the memory helper functions should be designed with at least the same degree of extensibility.

andreyfe1 · 2021-07-05T12:29:45Z

@akukanov, thanks for a reminder about portability across implementations. However, it's quite difficult to realize how default sorting can be implemented. I'll think about minimal information.

The problem I see with standalone memory helper functions is that those are proposed specifically for sorting

I thought about it. I think names of functions should reflect the algorithm name and should be included into the name, e.g. sort_memory_helper, reduce_memory_helper, ... So, I see it not as overloading, but as different function names. Is it better?

andreyfe1 · 2021-07-05T13:09:59Z

Regarding minimal information we need for memory helpers of default sorter.
Let's assume we implemented default sorter in some abstract way. If it's used some non-specified information then it's implementation details and won't be in public interface. Also default sorter doesn't have anything else non-specified. I'd suggest the following:

static std::size_t memory_required(sycl::memory_scope scope, std::size_t range_size);

template<int dimensions = 1>
static std::size_t memory_required(sycl::memory_scope scope, sycl::range<dimensions> local_range);

local_range is a work group size or sub-group size depending on the scope value.
The same approach is for standalone functions.

akukanov · 2021-07-06T11:01:37Z

I thought about it. I think names of functions should reflect the algorithm name and should be included into the name, e.g. sort_memory_helper, reduce_memory_helper, ... So, I see it not as overloading, but as different function names. Is it better?

It's better, but I would then go with the names that start exactly the same as the algorithm function names, i.e. joint_sort_memory_required, etc.

Also, do we really need external memory for sort_over_group? For it we know the number of elements to sort, so I would instead push for an improvement in device compilers to allow using "max work group size" or another appropriate device-specific constant in a declaration of a local array.

andreyfe1 · 2021-07-06T13:40:57Z

Also, do we really need external memory for sort_over_group?

Unfortunately, yes. The thing is the size of local range for nd_range is a runtime parameter. So, users can pass nd_range{ {100}, {10}} to the kernel and we need to provide memory_required(10) memory size, but not memory_required(<max work group size>) to save local memory.

Regarding names. Having full algorithm name (sort_over_group_memory_required) seems too long for me. I think sort_,,, names looks shorter and range<> or size_t parameter seems good enough to distinguish version for joint_sort and sort_over_group. However, the difference in parameters also can be confusing.
Another thing is if we have such functions for other algorithms, in user's code it will look like:

size_t sort_size   = sort_over_group_memory_required(10);
size_t reduce_size = reduce_over_group_memory_required(10);

The difference is just in 1st word of 5.

@Pennycook, I'd like to hear your opinion on function names.

Pennycook · 2021-07-06T17:22:21Z

Adding the name of the algorithm to the function name seems unnecessary to me. I might be confused, but I thought that the idea was these functions were already part of a class like default_sorter, in which case the inclusion of sort in the name is redundant.

I was imagining that a user would write something like this, for sorting:

size_t scratch_size = default_sorter<int>::memory_required(memory_scope::work_group, work_group_size);
auto scratch = malloc_device(scratch_size, q);
...
auto exec = group_executor(wg, span(scratch, scratch_size));
auto s = sort_over_group(exec, x, default_sorter<int>());

...and for reducing (NB: "reducer" is already used in the spec, so we'd need a different name):

size_t scratch_size = default_reducer<int>::memory_required(memory_scope::work_group, work_group_size);
auto scratch = malloc_device(scratch_size, q);
...
auto exec = group_executor(wg, span(scratch, scratch_size));
auto s = reduce_over_group(exec, x, plus<>(), default_reducer<int>());

I like the overload solution for joint/over_group, but it's worth noting that SYCL 2020 opted to adopt longer names instead of overloading for the algorithms themselves... The fact we don't just have a single sort function with overloads suggests maybe they should be named differently somehow.

Would it make sense to attach the joint or group semantics to the sorter itself? That would break the link (there would be one memory_required function in default_joint_sorter and one in default_group_sorter), and would also enable developers to provide a sorter that only works with one of the options (e.g. a sorter that implements a sorting network may support sort_over_group but not joint_sort).

andreyfe1 · 2021-07-07T09:12:36Z

I thought that the idea was these functions were already part of a class like default_sorter

For sorters, yes. But what if users want to get memory size for sorting functions without sorters? There are standalone functions. See examples: https://github.com/intel/llvm/pull/3989/files#diff-5f58a59c39f7c062697cd495199320cc1b8f9d6823834cf61ba37fae5347feb6R484

Pennycook · 2021-07-07T13:59:54Z

For sorters, yes. But what if users want to get memory size for sorting functions without sorters? There are standalone functions. See examples: https://github.com/intel/llvm/pull/3989/files#diff-5f58a59c39f7c062697cd495199320cc1b8f9d6823834cf61ba37fae5347feb6R484

Oh, I see. Sorry, I'd missed that. I assumed that calling the sort functions without an explicit sorter was functionally equivalent to calling them with a default_sorter object -- in that case, you'd be able to use default_sorter<T>::memory_required.

andreyfe1 · 2021-07-07T14:10:50Z

Won't it be confusing if we recommend to use default_sorter's functions for sort functions without sorter parameters?

Pennycook · 2021-07-07T15:07:19Z

Won't it be confusing if we recommend to use default_sorter's functions for sort functions without sorter parameters?

I think some of this can be solved by documentation. Some of the C++ algorithm overloads are already defined to behave as-if they have default arguments (for example, std::reduce uses std::plus<> if no binary operator is provided). If you wanted to define the sort functions to implicitly use default_sorter I think that would be okay -- the fact that the default_sorter is used by default seems reasonable from the names.

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

andreyfe1 · 2021-07-14T16:20:11Z

After the discussion with @rarutyun and @Pennycook we agreed on following:

Get rid of free memory_required functions and use default_sorter::memory_required() calling algorithms without sorters.
Move predefined sorters to the experimental namespace for revisiting it later if there is a better solution for memory_required method arguments.
Making clear connection in the document between memory_required and operator() overloads.
Mark memory_required overloads as constexpr to allow users allocate static local memory.
It's ok to not split predefined sorters by 2 types: using with joint_sort and with sort_over_group.

I'll make corresponding changes soon.

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

andreyfe1 · 2021-07-16T09:12:32Z

Feedbacks are addressed.
@intel/dpcpp-specification-reviewers, @rarutyun, @akukanov do you see any more blockers that prevents us from merging and starting to implement?

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

andreyfe1 · 2021-07-19T14:58:53Z

Moved radix_order to experimental as well since it looks strange if radix_sorter is in experimental, but radix order is not

sycl/doc/extensions/GroupAlgorithms/SYCL_INTEL_group_sort.asciidoc

andreyfe1 · 2021-07-22T09:40:35Z

It was found that local memory is not enough to process a lot of elements by one work group. So, it's better to let users choose which memory they can use: local or global.

andreyfe1 · 2021-07-22T14:36:30Z

Seems, interfaces are not changing to let users provide global memory
@admitric, do we need a special tag to let IGC know whether global or local memory is passed? Maybe Compiler already adds some tags automatically, e.g. __local, __global.

andreyfe1 · 2021-07-23T10:31:53Z

Recently @akukanov during the offline discussion mentioned that it's better to not associate class, which encapsulates group and memory, with C++ executors. Proposal for C++ executors is not finalized and potentially it can be another thing that we think about it now. I'm trying to imagine a name for the concept and a class. Currently it's "Group helpers". I'm not sure it's a good name. So, feel free to suggest a better name. Some suggestions: group_memory, group_scratchpad, group_memory_helper, group_memory_wrapper,...
What do you think?

Pennycook · 2021-07-23T15:31:46Z

Recently @akukanov during the offline discussion mentioned that it's better to not associate class, which encapsulates group and memory, with C++ executors. Proposal for C++ executors is not finalized and potentially it can be another thing that we think about it now. I'm trying to imagine a name for the concept and a class. Currently it's "Group helpers". I'm not sure it's a good name. So, feel free to suggest a better name. Some suggestions: group_memory, group_scratchpad, group_memory_helper, group_memory_wrapper,...
What do you think?

I'm not opposed to calling the class something else, but I think we should still signal in the extension that we're aware of executors (perhaps by linking to https://wg21.link/p0443) and that our class is trying to achieve something similar. Once executors are finalized, it's highly likely that we'd modify the class introduced here to model the concept or at least try to get very close.

We've talked about how this class might evolve to support additional information beyond just memory, so I think we should still refer to the template argument as something generic like an ExecutionResource. I don't think the names you propose convey that the class is a group with some memory (e.g. group_memory sounds like it's the memory itself); something like group_with_memory or group_with_scratchpad would be slightly more verbose but have more obvious meaning to me.

Tagging @gmlueck, @rolandschulz, @mkinsner, @jbrodman in case they have any ideas or just fancy some bikeshedding.

keryell · 2021-07-27T09:43:24Z

I'm not opposed to calling the class something else, but I think we should still signal in the extension that we're aware of executors (perhaps by linking to wg21.link/p0443) and that our class is trying to achieve something similar. Once executors are finalized, it's highly likely that we'd modify the class introduced here to model the concept or at least try to get very close.

From the ISO C++ SG1+LEWG meeting yesterday, this is more http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2300r1.html which is fashionable nowadays...

Pennycook · 2021-07-27T14:33:47Z

I'm not opposed to calling the class something else, but I think we should still signal in the extension that we're aware of executors (perhaps by linking to wg21.link/p0443) and that our class is trying to achieve something similar. Once executors are finalized, it's highly likely that we'd modify the class introduced here to model the concept or at least try to get very close.

From the ISO C++ SG1+LEWG meeting yesterday, this is more http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2300r1.html which is fashionable nowadays...

Thanks, @keryell.

@capatober, based on this... I think we should describe the first argument as an "execution context" and point to this paper. We can say in the introduction or some other non-normative section that:

Like the paper, we want to allow algorithms to execute in a specific execution context, by coupling a set of work-items (i.e. a group) with other execution resources (e.g. memory).
Unlike the paper, we're not yet concerned with asynchrony or forward progress guarantees, and don't want to complicate our initial proposal by trying to align this class with the "scheduler" concept.
We're closely watching the paper to see how the concepts evolve, with the aim that the ISO C++ and SYCL spellings converge.

What do you think?

rarutyun · 2021-07-28T23:54:17Z

based on this... I think we should describe the first argument as an "execution context" and point to this paper. We can say in the introduction or some other non-normative section that:

I would like to add my 2 cents. Since I am regularly participating in LEWG and also working on the Executors that we probably should not very aggressively react on any change that is going on in C++ standard committee. As far as I can tell P2300 is intended to be the evolution of P0443 but in fact it is not so far. The discussion is still in progress and we don't know for sure what outcome would be. I am saying that we definitely should follow what is going on with Executors (and basically we are doing that with my and other people efforts) but linking to the paper might be undesirable in our context because things are changing quite quickly including the terminology. I think we should operate our terms for Group Sort to achieve what we need keeping in mind Executors to be able to develop as close model to them as possible in future but don't try to use the terminology and the paradigms from there too much.

Like the paper, we want to allow algorithms to execute in a specific execution context, by coupling a set of work-items (i.e. a group) with other execution resources (e.g. memory).

Execution context basically represents the some hardware unit (e.g. CPU, GPU) to my understanding. Could you please elaborate what do you mean by your first bullet?

Unlike the paper, we're not yet concerned with asynchrony or forward progress guarantees, and don't want to complicate our initial proposal by trying to align this class with the "scheduler" concept.

Completely agree from asynchrony standpoint. However, scheduler is the representation of Execution Context so, I would like to see the answer on my previous question to make sure we are on the same page. Anyway, I agree that we should not overcomplicate our initial proposal, especially keeping in mind that Executors are not stable yet.

We're closely watching the paper to see how the concepts evolve, with the aim that the ISO C++ and SYCL spellings converge.

Dead sure.

keryell · 2021-07-29T06:18:19Z

I agree we do not need to track too closely uncommitted ISO C++ discussions.
In the past we adopted early in SYCL some ideas which never came out and disappeared (exception lists...).
But at least we can cite the ISO C++ proposals...

Pennycook · 2021-07-29T13:54:24Z

I would like to add my 2 cents. Since I am regularly participating in LEWG and also working on the Executors that we probably should not very aggressively react on any change that is going on in C++ standard committee. As far as I can tell P2300 is intended to be the evolution of P0443 but in fact it is not so far.

Thanks for clarifying the status of this paper. If you think it's too early to link to it, we shouldn't link to it.

Execution context basically represents the some hardware unit (e.g. CPU, GPU) to my understanding. Could you please elaborate what do you mean by your first bullet?

I may have misunderstood the authors' intent, or be reading too much between the lines. Section 4.1 says:

An execution context is a resource that represents the place where execution will happen. This could be a concrete resource - like a specific thread pool object, or a GPU - or a more abstract one, like the current thread of execution. Execution contexts don’t need to have a representation in code; they are simply a term describing certain properties of execution of a function.

In the case of a "specific resource", something like a CPU or GPU has more properties than the proposal discusses (e.g. a fixed amount of hardware parallelism, some finite amount of memory, etc). So, my thinking is that a "more abstract one" could include other properties too (e.g. the current thread of execution with this memory pool). The proposal seems deliberately vague about what an execution context is or isn't -- since a user must create a scheduler in order to interact with the other functions it introduces -- but there seems to be room to interpret things the way I have.

andreyfe1 · 2021-08-02T08:39:05Z

Basing on that I'd like to clarify which proposal can be for Group Sort extension. I wouldn't suggest to make a strong connection between Executors proposal and class that encapsulates group and memory because (as it's mentioned above) things are changing very quickly. We can duplicate some ideas here that we like and that make sense. Then align with Executors proposal when it's finalized, later. Let's choose a name group_with_scratchpad if nobody is strong against. What about the concept if we don't have to make connections to Executors? What can be a better name? Execution context?

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

bader · 2021-08-27T09:31:15Z

@capatober, could you confirm that I can PR title and description for the commit message title/description?

andreyfe1 · 2021-08-27T09:36:22Z

I think title and description are valid despite the multiple changes here. I don't see what can prevent us from having it for the commit message. Let me know if some additional [tags] for title are needed for the commit message

initial version for additional memory

31714b5

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

andreyfe1 requested a review from a team as a code owner June 24, 2021 15:45

bader added the spec extension All issues/PRs related to extensions specifications label Jun 24, 2021

andreyfe1 added 2 commits June 25, 2021 19:23

added one more memory helper

05b2d2c

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

some refactoring

b5f9f88

added executor-like object

293f994

andreyfe1 added 2 commits July 1, 2021 10:52

added examples

483504e

move group_executor to experimental

b3a70ca

partially addressed some feedbacks

404a27f

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

addressed current feedbacks

a8bcb37

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

moved radix_order to experimental

45e6865

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

Pennycook requested changes Jul 20, 2021

View reviewed changes

andreyfe1 added 2 commits July 21, 2021 13:46

addressed some remarks

be2cc15

removed sentence regarding calling on host

a89e8df

some small things

012d4c0

renamed group executor class

6214730

replaced with group_with_scratchpad

e02465c

Signed-off-by: Fedorov, Andrey <andrey.fedorov@intel.com>

Pennycook approved these changes Aug 26, 2021

View reviewed changes

bader changed the title ~~[DOC] Group sort. Passing temporary local memory to an algorithm~~ [DOC] Extend group sort algorithms with interfaces to scratchpad memory Aug 27, 2021

bader merged commit f57091d into intel:sycl Aug 27, 2021

andreyfe1 mentioned this pull request Nov 3, 2021

[SYCL][DOC] Make some refactoring for group sorting spec #4666

Merged

[DOC] Extend group sort algorithms with interfaces to scratchpad memory #3989

[DOC] Extend group sort algorithms with interfaces to scratchpad memory #3989

Uh oh!

Conversation

andreyfe1 commented Jun 24, 2021

Uh oh!

andreyfe1 commented Jun 24, 2021

Uh oh!

andreyfe1 commented Jun 28, 2021

Uh oh!

Pennycook commented Jun 28, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andreyfe1 commented Jun 29, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Pennycook commented Jun 29, 2021

Uh oh!

andreyfe1 commented Jun 29, 2021

Uh oh!

Pennycook commented Jun 29, 2021

Uh oh!

andreyfe1 commented Jun 29, 2021

Uh oh!

Pennycook commented Jun 29, 2021

Uh oh!

andreyfe1 commented Jun 29, 2021

Uh oh!

Pennycook commented Jun 29, 2021

Uh oh!

andreyfe1 commented Jun 29, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Pennycook commented Jun 29, 2021

Uh oh!

andreyfe1 commented Jun 30, 2021

Uh oh!

akukanov commented Jul 2, 2021

Uh oh!

andreyfe1 commented Jul 5, 2021

Uh oh!

andreyfe1 commented Jul 5, 2021

Uh oh!

akukanov commented Jul 6, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andreyfe1 commented Jul 6, 2021

Uh oh!

Pennycook commented Jul 6, 2021

Uh oh!

andreyfe1 commented Jul 7, 2021

Uh oh!

Pennycook commented Jul 7, 2021

Uh oh!

andreyfe1 commented Jul 7, 2021

Uh oh!

Pennycook commented Jul 7, 2021

Uh oh!

andreyfe1 commented Jul 14, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andreyfe1 commented Jul 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andreyfe1 commented Jul 19, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

andreyfe1 commented Jul 22, 2021

Uh oh!

andreyfe1 commented Jul 22, 2021

Pennycook commented Jun 28, 2021 •

edited

Loading

andreyfe1 commented Jun 29, 2021 •

edited

Loading

andreyfe1 commented Jun 29, 2021 •

edited

Loading

akukanov commented Jul 6, 2021 •

edited

Loading

andreyfe1 commented Jul 14, 2021 •

edited

Loading

andreyfe1 commented Jul 16, 2021 •

edited

Loading

andreyfe1 commented Jul 19, 2021 •

edited

Loading

andreyfe1 commented Jul 23, 2021 •

edited

Loading

bader commented Aug 27, 2021 •

edited

Loading