
[SYCL][CUDA] add non-uniform groups and algorithms support for ext_oneapi_cuda #9182

Closed
wants to merge 23 commits into from

Conversation

JackAKirk
Contributor

@JackAKirk JackAKirk commented Apr 24, 2023

This PR adds CUDA support for fixed_size_group, ballot_group, and opportunistic_group. It also adds all of the group algorithm support specified in the extension document, except for inclusive_scan and exclusive_scan.

Status summary

  • Still in draft.
  • All implemented algorithms have been tested, but the tests have not yet been added to the PR.
  • All implementation questions for the CUDA backend with respect to the extension spec are, I think, now resolved.
  • Some implementation work remains (see below). Some small implementation questions, such as where to place masked_redux.hpp, can be resolved later.

TODO:

  • Add tests.
  • Add scan implementations (this can perhaps be added in a follow up PR).
  • Add exceptions for the tangle_group cases that are not yet implemented.
  • Resolve some small remaining portability issues in the implementation. For example, some algorithms such as GroupAll need different implementations for fixed_size_group and ballot_group in the Intel backend (see https://github.com/intel/llvm/pull/9181/files), whereas in the CUDA backend both group types can use the same implementation. I can simply absorb the CUDA implementations into a cuda::GroupAll and call it from the appropriate group-specific spirv::GroupAll specialization.
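
To illustrate why one GroupAll implementation could serve both group types on CUDA, here is a minimal host-side sketch, not the code in this PR. It assumes a 32-lane warp and that each group type can be reduced to a member mask, much as the CUDA `__all_sync` vote takes a member mask. The names `fixed_size_mask`, `ballot_mask`, and `group_all` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Host-side emulation only; every name below is hypothetical.
constexpr unsigned kWarpSize = 32;

// Member mask of the fixed-size partition containing `lane`.
// Assumes partition_size is a power of two no larger than the warp.
inline std::uint32_t fixed_size_mask(unsigned lane, unsigned partition_size) {
  const std::uint32_t block = partition_size == kWarpSize
                                  ? 0xFFFFFFFFu
                                  : ((1u << partition_size) - 1u);
  const unsigned first_lane = (lane / partition_size) * partition_size;
  return block << first_lane;
}

// Member mask of the ballot group containing `lane`: the lanes whose
// predicate bit matches this lane's predicate bit.
inline std::uint32_t ballot_mask(unsigned lane, std::uint32_t predicate_bits) {
  const bool mine = ((predicate_bits >> lane) & 1u) != 0;
  return mine ? predicate_bits : ~predicate_bits;
}

// Shared "all" vote: true when every member lane has its vote bit set.
// The group type no longer matters once membership is a mask.
inline bool group_all(std::uint32_t member_mask, std::uint32_t vote_bits) {
  return (vote_bits & member_mask) == member_mask;
}

inline void group_all_example() {
  // Lane 13 in partitions of 8 belongs to lanes 8..15.
  const std::uint32_t fixed = fixed_size_mask(13, 8);
  // Lane 20 has predicate bit 0, so it groups with lanes 16..31.
  const std::uint32_t ballot = ballot_mask(20, 0x0000FFFFu);
  assert(group_all(fixed, 0x0000FFFFu));   // lanes 8..15 all voted true
  assert(!group_all(ballot, 0x0000FFFFu)); // lanes 16..31 all voted false
}
```

In this sketch the group-specific part is only the mask construction, which is where a group-specific spirv::GroupAll specialization would differ before forwarding to a common routine.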

Pennycook and others added 18 commits March 27, 2023 10:52
To avoid duplicating logic and introducing even more overloads of the group
algorithms, it is desirable to move some of the implementation details into
the detail::spirv namespace.

This commit makes a few changes to enable that to happen:

- spirv:: functions with a Group template now take a group object, to enable
  run-time information (e.g. group membership) to pass through.

- ControlBarrier and the OpGroup* instruction used to implement reduce/scan
  now forward to spirv::, similar to other group functions and algorithms.

- The calc helper used to map functors to SPIR-V instructions is updated to
  use the new spirv:: functions, instead of calling __spirv intrinsics.

Signed-off-by: John Pennycook <john.pennycook@intel.com>
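
As a rough illustration of the pattern this commit message describes, the host-side sketch below shows a calc-style helper that selects an operation from the functor type and passes a group object through, so that run-time membership reaches the backend function. It is an assumption-laden emulation, not the detail::spirv code: `masked_group`, `reduce_members`, `group_add`, `group_bit_and`, and this `calc` are all hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

constexpr unsigned kWarpSize = 32;

// Hypothetical group object carrying run-time membership as a lane mask.
struct masked_group {
  std::uint32_t member_mask;
};

// Host emulation of a group reduction: fold the values of member lanes only.
template <typename Group, typename T, typename Op>
T reduce_members(const Group &g, const T (&lane_values)[kWarpSize], T identity,
                 Op op) {
  T result = identity;
  for (unsigned lane = 0; lane < kWarpSize; ++lane)
    if ((g.member_mask >> lane) & 1u)
      result = op(result, lane_values[lane]);
  return result;
}

// Hypothetical backend entry points that take the group object.
template <typename Group, typename T>
T group_add(const Group &g, const T (&v)[kWarpSize]) {
  return reduce_members(g, v, T{0}, std::plus<T>{});
}

template <typename Group, typename T>
T group_bit_and(const Group &g, const T (&v)[kWarpSize]) {
  return reduce_members(g, v, static_cast<T>(~T{0}), std::bit_and<T>{});
}

// calc overloads: the functor type picks the backend function, and the
// group object is forwarded unchanged.
template <typename Group, typename T>
T calc(const Group &g, const T (&v)[kWarpSize], std::plus<T>) {
  return group_add(g, v);
}

template <typename Group, typename T>
T calc(const Group &g, const T (&v)[kWarpSize], std::bit_and<T>) {
  return group_bit_and(g, v);
}

inline void calc_example() {
  int values[kWarpSize] = {5, 7, 3, 1}; // remaining lanes hold 0
  assert(calc(masked_group{0x7u}, values, std::plus<int>{}) == 15);
  assert(calc(masked_group{0x3u}, values, std::bit_and<int>{}) == 5);
}
```

The sketch only covers typed functors such as `std::plus<int>`; transparent functors like `std::plus<>` would need extra overloads.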
Nested detail namespaces cause problems for name lookup.

Signed-off-by: John Pennycook <john.pennycook@intel.com>
Enables the following functions to be used with ballot_group arguments:
- group_barrier
- group_broadcast
- any_of_group
- all_of_group
- none_of_group
- reduce_over_group
- exclusive_scan_over_group
- inclusive_scan_over_group

Signed-off-by: John Pennycook <john.pennycook@intel.com>
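
For readers unfamiliar with what a scan over a ballot_group means when members are not contiguous, the following host-side sketch may help. It assumes a 32-lane warp and a member mask, and it emulates the expected result sequentially rather than showing any device implementation. `member_rank` and `masked_exclusive_scan` are hypothetical names.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstdint>

constexpr unsigned kWarpSize = 32;

// Position of `lane` among the members of the group: the number of member
// lanes with a lower lane index.
inline unsigned member_rank(std::uint32_t member_mask, unsigned lane) {
  const std::uint32_t lower_lanes = (1u << lane) - 1u;
  return static_cast<unsigned>(
      std::bitset<kWarpSize>(member_mask & lower_lanes).count());
}

// Exclusive sum scan over member lanes in lane order.
// Non-member lanes keep their input value.
inline std::array<int, kWarpSize>
masked_exclusive_scan(std::uint32_t member_mask,
                      const std::array<int, kWarpSize> &in) {
  std::array<int, kWarpSize> out = in;
  int running = 0;
  for (unsigned lane = 0; lane < kWarpSize; ++lane) {
    if ((member_mask >> lane) & 1u) {
      out[lane] = running; // sum of lower-ranked members only
      running += in[lane];
    }
  }
  return out;
}

inline void masked_scan_example() {
  std::array<int, kWarpSize> in{};
  in[0] = 10;
  in[1] = 20;
  in[2] = 99; // lane 2 is not a member below
  in[3] = 40;
  const auto out = masked_exclusive_scan(0xBu, in); // members: lanes 0, 1, 3
  assert(out[3] == 30);              // 10 + 20, skipping lane 2
  assert(member_rank(0xBu, 3) == 2); // lane 3 is the third member
}
```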
Tests the ability to create an instance of each new group type,
and the correctness of the core member functions.

Signed-off-by: John Pennycook <john.pennycook@intel.com>
This commit adds tests for using ballot_group and the following algorithms:
- group_barrier
- group_broadcast
- any_of_group
- all_of_group
- none_of_group
- reduce_over_group
- exclusive_scan_over_group
- inclusive_scan_over_group

Signed-off-by: John Pennycook <john.pennycook@intel.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>

cluster/ballot/opportunistic_group cuda support.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Works for all non uniform groups for int type.
Fixed cluster_group full mask bug.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
opportunistic_group/ballot_group still missing shfl based impl.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Some formatting.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk JackAKirk temporarily deployed to aws April 24, 2023 19:02 — with GitHub Actions Inactive
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk JackAKirk temporarily deployed to aws April 27, 2023 19:04 — with GitHub Actions Inactive
JackAKirk added 4 commits May 2, 2023 10:27
draft scan impl

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk
Contributor Author

I'm closing this draft impl. I will open a new PR from the finalised branch.

@JackAKirk JackAKirk closed this May 19, 2023
2 participants