Commit 982996d

Update after graph proposal has been merged.
Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
1 parent eb38e6c commit 982996d

1 file changed: +17 -11 lines changed

sycl/doc/extensions/proposed/sycl_ext_oneapi_graph_fusion.asciidoc

Lines changed: 17 additions & 11 deletions
@@ -38,7 +38,7 @@ https://github.com/intel/llvm/issues
 
 == Dependencies
 
-This extension is written against the SYCL 2020 revision 6 specification. All
+This extension is written against the SYCL 2020 revision 7 specification. All
 references below to the "core SYCL specification" or to section numbers in the
 SYCL specification refer to that revision.
 
@@ -79,9 +79,9 @@ recording mechanism, similar to the initial kernel fusion proposal; and another
 one using explicit graph building. Thus, future users will be able to choose
 from two different mechanisms to construct the sequence of kernels to fuse. As
 there is an explicit step for finalization of graphs before being submitted for
-execution, the fusion step can happen asynchronously and also eliminates many of
-the synchronization concerns that needed to be covered in the experimental
-kernel fusion proposal.
+execution, fusion can happen in this step, which also eliminates many of the
+synchronization concerns that needed to be covered in the experimental kernel
+fusion proposal.
 
 The aim of this document is to propose a mechanism for users to request the
 fusion of two or more kernels in a SYCL graph into a single kernel **at
@@ -460,29 +460,32 @@ struct AddKernel {
 
 int main() {
   constexpr size_t dataSize = 512;
-  int in1[dataSize], in2[dataSize], in3[dataSize], tmp1[dataSize],
-      tmp2[dataSize], tmp3[dataSize], out[dataSize];
+  int in1[dataSize], in2[dataSize], in3[dataSize], out[dataSize];
 
   queue q{default_selector_v};
 
-  ext::oneapi::experimental::command_graph graph{q.get_context(),
-                                                 q.get_device()};
   {
     buffer<int> bIn1{in1, range{dataSize}};
+    bIn1.set_write_back(false);
     buffer<int> bIn2{in2, range{dataSize}};
+    bIn2.set_write_back(false);
     buffer<int> bIn3{in3, range{dataSize}};
-    buffer<int> bTmp1{tmp1, range{dataSize}};
+    bIn3.set_write_back(false);
+    buffer<int> bTmp1{range{dataSize}};
     // Internalization specified on the buffer
     buffer<int> bTmp2{
-        tmp2,
         range{dataSize},
         {sycl::ext::oneapi::experimental::property::promote_private{}}};
     // Internalization specified on the buffer
     buffer<int> bTmp3{
-        tmp3,
         range{dataSize},
         {sycl::ext::oneapi::experimental::property::promote_private{}}};
     buffer<int> bOut{out, range{dataSize}};
+    bOut.set_write_back(false);
+
+    ext::oneapi::experimental::command_graph graph{
+        q.get_context(), q.get_device(),
+        sycl::ext::oneapi::experimental::property::graph::no_host_copy{}};
 
     graph.begin_recording(q);
 
@@ -530,6 +533,8 @@ int main() {
         command_graph::perform_fusion});
 
     q.ext_oneapi_graph(exec_graph);
+
+    q.wait();
   }
   return 0;
 }
@@ -635,4 +640,5 @@ Ewan Crawford, Codeplay +
 |1|2023-02-16|Lukas Sommer|*Initial draft*
 |2|2023-03-16|Lukas Sommer|*Remove reference to outdated `add_malloc_device` API*
 |3|2023-04-11|Lukas Sommer|*Update usage examples for graph API changes*
+|4|2023-08-17|Lukas Sommer|*Update after graph extension has been merged*
 |========================================
