Explicit buffer allocation #24
Comments
You mean having an |
Exactly!
We have |
This is covered, in a different way, by a set of proposals from Codeplay to complete the asynchronous data-flow model in SYCL. The one specific to allocation is in but we didn't have enough implementation experience for 1.2.1. The proposal above is synchronous with respect to the user thread, but allowing it on the command group makes sense as well. Rather than an explicit allocate functionality, I would go for a slightly higher-level approach:

cl::sycl::buffer<cl::sycl::cl_int, 1> buffer(global_range);
myQueue.submit([&](cl::sycl::handler& cgh) {
cgh.require(buffer, /* Optionally, also the access mode */);
});

or, for completeness:

cl::sycl::buffer<cl::sycl::cl_int, 1> buffer(global_range);
myQueue.submit([&](cl::sycl::handler& cgh) {
auto accA = buffer.get_access<access::mode::read>(cgh);
cgh.require(accA);
});

An explicit allocate function would force the SYCL runtime to actually allocate something on the context. However, the SYCL runtime may already have a means of accessing the data in the context, for example if two devices share the same global address space:

context c1(d1, d2);
queue q1{c1, d1};
queue q2{c1, d2};
cl::sycl::buffer<cl::sycl::cl_int, 1> buffer(global_range);
// Allocate object on device1, but same context so visible for d2 as well
q1.submit([&](cl::sycl::handler& cgh) {
cgh.allocate(buffer);
});
// What is the meaning now of this allocate?
// Also, which allocate executes first?
q2.submit([&](cl::sycl::handler& cgh) {
cgh.allocate(buffer);
});

Using a less explicit require will fit with the requirements/actions model we defined in the architecture section:

context c1(d1, d2);
queue q1{c1, d1};
queue q2{c1, d2};
cl::sycl::buffer<cl::sycl::cl_int, 1> buffer(global_range);
// Requires buffer to be available on d1; the action to perform is, e.g., a clCreateBuffer(CL_MEM_READ_WRITE)
q1.submit([&](cl::sycl::handler& cgh) {
cgh.require(buffer);
});
// Requires buffer to be available on d2, requirement is satisfied, so no action to perform.
q2.submit([&](cl::sycl::handler& cgh) {
cgh.require(buffer);
}); |
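The requirement/action behaviour described in the comment above can be sketched with a toy model in plain C++. This is only an illustration under stated assumptions: `ToyRuntime`, its `require` member, and the integer buffer and context ids are hypothetical stand-ins, not SYCL types or the proposed `handler::require` API. The point it shows is that a requirement triggers an allocation action only when it is not already satisfied in that context.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in, NOT a SYCL type: buffers and contexts are plain integer ids.
struct ToyRuntime {
    // (buffer id, context id) pairs for which storage already exists.
    std::set<std::pair<int, int>> allocated;
    // Log of the actions the runtime actually performed.
    std::vector<std::string> actions;

    // Hypothetical "require": states that `buffer` must be available in
    // `context`. An allocation action runs only if the requirement is not
    // already satisfied. Returns true when an action was needed.
    bool require(int buffer, int context) {
        const bool inserted = allocated.emplace(buffer, context).second;
        if (inserted) {
            actions.push_back("allocate buffer " + std::to_string(buffer) +
                              " in context " + std::to_string(context));
        }
        return inserted;
    }
};

// Mirrors the two-queue example: both queues target devices that share
// the same context, so only the first requirement causes an allocation.
inline std::size_t run_shared_context_example() {
    ToyRuntime rt;
    const int buffer = 0;
    const int shared_context = 1;
    const bool first = rt.require(buffer, shared_context);   // q1: allocates
    const bool second = rt.require(buffer, shared_context);  // q2: no action
    assert(first && !second);
    return rt.actions.size();  // number of allocation actions performed
}
```

With an explicit allocate, the second call would have to either allocate again or be silently ignored; with a requirement, "already satisfied" is a well-defined outcome, which is the argument made above.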
Thanks for your detailed answers! The proposal seems interesting.
I don't fully understand this point, but I don't have a strong opinion about the naming convention. I prefer the name If a user can have more control over when the allocation occurs, I'm happy :) |
We are having some discussions internally about these changes.
The copy functions take an external pointer as an argument, so the runtime cannot know if the data has been copied or not, and must copy every time. |
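A rough way to picture the difference, as a toy model in plain C++ rather than SYCL: `ToyDeviceBuffer`, `copy_from`, and `update_from_tracked` are hypothetical names invented for this sketch. A copy from an arbitrary pointer carries no state the runtime can consult, so every call transfers data, whereas an update of data the runtime tracks itself can be skipped once it is known to be current.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy device-side storage, NOT a SYCL buffer.
struct ToyDeviceBuffer {
    std::vector<int> storage;
    bool host_is_newer = true;  // state the runtime tracks for its own host data
    std::size_t transfers = 0;  // number of transfers actually performed

    explicit ToyDeviceBuffer(std::size_t n) : storage(n) {}

    // Copy from an arbitrary user pointer. Nothing is known about `src`
    // (it may have changed since the last call), so every call must copy.
    void copy_from(const int* src) {
        std::copy_n(src, storage.size(), storage.begin());
        ++transfers;
    }

    // Hypothetical tracked update: the runtime knows whether its host data
    // changed since the last transfer, so a redundant transfer is skipped.
    void update_from_tracked(const std::vector<int>& host) {
        assert(host.size() == storage.size());
        if (!host_is_newer) {
            return;  // requirement already satisfied, no action
        }
        std::copy(host.begin(), host.end(), storage.begin());
        ++transfers;
        host_is_newer = false;
    }
};
```

Calling `copy_from` twice with the same pointer performs two transfers, while calling `update_from_tracked` twice performs one; a real runtime would also need to reset the tracked state whenever the host data is written.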
See also https://github.com/illuhad/hipSYCL/blob/develop/doc/extensions.md#hipsycl_ext_update_device for an extension in hipSYCL that provides similar functionality. |
need time to evaluate update_device proposal |
relax multiple commands so no command group with multiple copies |
close to converging |
Hi,
The handler class has copy and fill member functions. Can we have an allocate function? As the name implies, this function would allocate buffers [*].
The justification for this request is the following:
Allocations may involve synchronization between kernels. Hence users may want to allocate buffers first, then run their kernels.
This makes reasoning about concurrency easier (as far as I know, explicit preemptive allocation is considered good practice in the OpenMP world).
[*] It's possible to mimic this feature right now either by directly using an "OpenCL buffer" or by creating and launching a dummy "kernel". Both methods are kinda cumbersome.
Thanks