-
Notifications
You must be signed in to change notification settings - Fork 745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Host task implementation #1471
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Ruyk
previously approved these changes
May 15, 2020
romanovvlad
requested changes
May 18, 2020
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
romanovvlad
approved these changes
May 18, 2020
@keryell Somehow, I can't re-request review with github's UI, so, please, review and approve. |
3 tasks
pvchupin
approved these changes
May 19, 2020
@s-kanaev Please, provide a final commit message for this PR. |
Done. |
alexbatashev
pushed a commit
to alexbatashev/llvm
that referenced
this pull request
May 20, 2020
* sycl: (65 commits) [SYCL] Host task implementation (intel#1471) [SYCL] Update getting dependencies documentation (intel#1699) [SYCL] Fix types and transparent functors recognition in reduction (intel#1709) [SYCL][Doc] Get started guide clean-up (intel#1697) Add --spirv-fp-contract={on|off|fast} option (intel#509) [SYCL][Doc] Fix tbb target path in Get Started Guide. (intel#1695) [SYCL] Add support for kernel name types templated using enums. (intel#1675) [Driver][SYCL] Make -std=c++17 the default for DPC++ (intel#1662) AllocaInst should store Align instead of MaybeAlign. [X86] Replace selectScalarSSELoad ComplexPattern with PatFrags to handle the 3 types of loads we currently match. Harden IR and bitcode parsers against infinite size types. Revert "[nfc] test commit" [nfc] test commit Expose IRGen API to add the default IR attributes to a function definition. The release notes for ObjCBreakBeforeNestedBlockParam was placed between the release note for IndentCaseBlocks and its example code [VectorCombine] forward walk through instructions to improve chaining of transforms [PhaseOrdering] add vector reduction tests; NFC [InstCombine] Clean up alignment handling (NFC) [ARM] Patterns for VQSHRN [VectorCombine] add reduction-like patterns; NFC ...
Ruyk
referenced
this pull request
in codeplaysoftware/sycl-for-cuda
May 21, 2020
Since introduction of host tasks in #1471, `piEventCallback` and related functionality is not required by the SYCL-RT. Removing the implementation of this behaviour from the CUDA backend simplifies the submission of operations to streams and overall increases performance. Signed-off-by: Ruyman Reyes <ruyman@codeplay.com>
bader
pushed a commit
that referenced
this pull request
May 29, 2020
Since introduction of host tasks in #1471, `piEventCallback` and related functionality is not required by the SYCL-RT. Removing the implementation of this behaviour from the CUDA backend simplifies the submission of operations to streams and overall increases performance. Signed-off-by: Ruyman Reyes <ruyman@codeplay.com>
s-kanaev
commented
Jun 25, 2020
Comment on lines
+45
to
+47
auto AccA = BufA.get_access<sycl::access::mode::read>(CGH); | ||
auto AccB = BufB.get_access<sycl::access::mode::read>(CGH); | ||
auto AccC = BufC.get_access<sycl::access::mode::read_write>(CGH); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to have host accessor, not device-one.
bader
pushed a commit
that referenced
this pull request
Aug 3, 2020
The tests provided here are for the fix introduced with #1471. Use case fixed by the fix. Imagine us having two device queues (`Q1` and `Q2`) and a single buffer `B`. We initialize the buffer with host accessor. Also lets have two kernels: kernel `K1` executed at `Q1`; and kernel `K2` which executes at `Q2`. `K1` writes to `B`, `K2` reads from `B`. After submitting `K1` to `Q1` there will also be an `AllocaCommand` `A1` which allocates buffer on device. `K1` depends on `A1` via memory object `B`. After submitting `K2` to `Q2` there will be another `AllocaCommand` `A2` (for queue `Q2` and its device). `K2` will depend on `A2`. `A2`, however, should depend on both `A1` and `K1`. `A2->K1` dependency eliminates data race. Signed-off-by: Sergey Kanaev <sergey.kanaev@intel.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This patch is part 1 in a series of patches for host-interop-task proposal by
Codeplay. See the proposal at [1].
This patch implements:
This patch reimplements glue/connection of events within different contexts to
eliminate use of event callback in favor of host-task.
Host-task execution mechanism involves:
synchronous events;
Thread pool's size is set via
SYCL_QUEUE_THREAD_POOL_SIZE
environment variableand defaults to 1.
Even though host-task is enqueued to device queue it'll be executed on the
default host queue.
Host-task is represented via distinct ExecCGCommand paired with EmptyCommand.
Any other command, which depends on host-task will really depend on it's
EmptyCommand. The EmptyCommand is in blocked state initially.
Class DispatchHostTask awaits for host-task's dependency events, then calls to
host-task's user lambda and unblocks any dependent commands via unblocking it's
EmptyCommand and enqueueing of leaves for requirements (i.e. host accessors
required for execution of this host-task).
[1] https://github.com/codeplaysoftware/standards-proposals/blob/master/host_task/host_task.md