Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][USM] Improve USM allocator test and fix improper behavior. #1538

Merged
merged 2 commits into from
May 1, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions sycl/include/CL/sycl/usm/usm_allocator.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -87,8 +87,7 @@ class usm_allocator {
usm::alloc AllocT = AllocKind,
typename std::enable_if<AllocT == usm::alloc::device, int>::type = 0>
void destroy(pointer Ptr) {
throw feature_not_supported(
"Device pointers do not support destroy on host", PI_INVALID_OPERATION);
// This method must be a NOP for device pointers.
bader marked this conversation as resolved.
Show resolved Hide resolved
}

/// Note:: AllocKind == alloc::device is not allowed.
Expand Down
104 changes: 85 additions & 19 deletions sycl/test/usm/allocator_vector.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -30,35 +30,101 @@ int main() {
auto dev = q.get_device();
auto ctxt = q.get_context();

if (!dev.get_info<info::device::usm_host_allocations>())
return 0;
if (dev.get_info<info::device::usm_host_allocations>()) {
usm_allocator<int, usm::alloc::host> alloc(ctxt, dev);

usm_allocator<int, usm::alloc::host> alloc(ctxt, dev);
std::vector<int, decltype(alloc)> vec(alloc);
vec.resize(N);

std::vector<int, decltype(alloc)> vec(alloc);
vec.resize(N);
for (int i = 0; i < N; i++) {
vec[i] = i;
}

for (int i = 0; i < N; i++) {
vec[i] = i;
int *res = &vec[0];
int *vals = &vec[0];

auto e1 = q.submit([=](handler &h) {
h.single_task<class foo>([=]() {
for (int i = 1; i < N; i++) {
res[0] += vals[i];
}
});
});

e1.wait();

int answer = (N * (N - 1)) / 2;

if (vec[0] != answer)
return -1;
}

if (dev.get_info<info::device::usm_shared_allocations>()) {
usm_allocator<int, usm::alloc::shared> alloc(ctxt, dev);

std::vector<int, decltype(alloc)> vec(alloc);
vec.resize(N);

for (int i = 0; i < N; i++) {
vec[i] = i;
}

int *res = &vec[0];
int *vals = &vec[0];

auto e1 = q.submit([=](handler &h) {
h.single_task<class bar>([=]() {
for (int i = 1; i < N; i++) {
res[0] += vals[i];
}
});
});

e1.wait();

int answer = (N * (N - 1)) / 2;

if (vec[0] != answer)
return -1;
}

int *res = &vec[0];
int *vals = &vec[0];
if (dev.get_info<info::device::usm_device_allocations>()) {
usm_allocator<int, usm::alloc::device> alloc(ctxt, dev);

auto e1 = q.submit([=](handler &cgh) {
cgh.single_task<class foo>([=]() {
for (int i = 1; i < N; i++) {
res[0] += vals[i];
}
std::vector<int, decltype(alloc)> vec(alloc);
vec.resize(N);

int *res = &vec[0];
int *vals = &vec[0];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not int *res = vec.data()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No good reason.


auto e0 = q.submit([=](handler &h) {
h.single_task<class baz_init>([=]() {
bader marked this conversation as resolved.
Show resolved Hide resolved
res[0] = 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confused by this, res and vals point to the same device vector data, so res[0] = 0 is redundant to the first loop iteration below where vals[i=0] = 0. This also means the res[0] == vals[0] is added to itself an extra time, which works because is happens to be 0, but again confusing.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see i=0 is skipped in the second loop so no double add. Anyway I know it's just a test, but having res and vals point at the same memory is still confusing to me, takes a minute to verify that it's not a problem in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The computation is sort of irrelevant.

for (int i = 0; i < N; i++) {
vals[i] = i;
}
});
});
});

e1.wait();
auto e1 = q.submit([=](handler &h) {
h.depends_on(e0);
h.single_task<class baz>([=]() {
for (int i = 1; i < N; i++) {
res[0] += vals[i];
}
});
});

int answer = (N * (N - 1)) / 2;
e1.wait();

if (vec[0] != answer)
return -1;
int answer = (N * (N - 1)) / 2;
int result;
q.memcpy(&result, res, sizeof(int));
q.wait();

if (result != answer)
return -1;
}

return 0;
}