-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Improve parallel_for error handling #2117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 1 commit
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
eda30ad
[SYCL] Improve parallel_for error handling
gmlueck fb7308e
Address code review comments
gmlueck d83afdd
More code review comments
gmlueck a20542d
Merge branch 'sycl' into gmlueck/CMPLRLLVM-19405
gmlueck c577689
Disable new reqd_work_group_size test on CUDA
gmlueck 1570e4f
Fix clang-format errors
gmlueck File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,6 @@ | ||
// XFAIL: cuda || level0 | ||
// CUDA exposes broken hierarchical parallelism. | ||
// LEVEL0 crashes with many of the negative tests. | ||
|
||
// RUN: %clangxx -fsycl -fsycl-targets=%sycl_triple %s -o %t.out | ||
// RUN: %CPU_RUN_PLACEHOLDER %t.out | ||
|
@@ -28,6 +29,16 @@ int main() { | |
|
||
string_class DeviceVendorName = D.get_info<info::device::vendor>(); | ||
auto DeviceType = D.get_info<info::device::device_type>(); | ||
bool IsOpenCL = (D.get_platform().get_backend() == backend::opencl); | ||
|
||
string_class OCLVersionStr; | ||
const char *OCLVersion = nullptr; | ||
if (IsOpenCL) { | ||
OCLVersionStr = D.get_info<info::device::version>(); | ||
assert(OCLVersionStr.size() >= 10 && | ||
"Unexpected device version string"); // strlen("OpenCL X.Y") | ||
OCLVersion = &OCLVersionStr[7]; // strlen("OpenCL ") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do not understand this comment |
||
} | ||
|
||
// parallel_for, (16, 16, 16) global, (8, 8, 8) local, reqd_wg_size(4, 4, 4) | ||
// -> fail | ||
|
@@ -61,41 +72,70 @@ int main() { | |
return 1; | ||
} | ||
|
||
string_class OCLVersionStr = D.get_info<info::device::version>(); | ||
assert(OCLVersionStr.size() >= 10 && | ||
"Unexpected device version string"); // strlen("OpenCL X.Y") | ||
const char *OCLVersion = &OCLVersionStr[7]; // strlen("OpenCL ") | ||
if (OCLVersion[0] == '1' || (OCLVersion[0] == '2' && OCLVersion[2] == '0')) { | ||
if (IsOpenCL && (OCLVersion[0] == '1' || | ||
(OCLVersion[0] == '2' && OCLVersion[2] == '0'))) { | ||
// OpenCL 1.x or 2.0 | ||
// parallel_for, (16, 16, 16) global, null local, reqd_wg_size(4, 4, 4) // | ||
// -> fail | ||
try { | ||
Q.submit([&](handler &CGH) { | ||
CGH.parallel_for<class ReqdWGSizeNegativeB>( | ||
CGH.parallel_for<class ReqdWGSizeNoLocalNegative>( | ||
range<3>(16, 16, 16), [=](item<3>) { reqd_wg_size_helper(); }); | ||
}); | ||
Q.wait_and_throw(); | ||
std::cerr | ||
<< "Test case ReqdWGSizeNegativeB failed: no exception has been " | ||
"thrown\n"; | ||
<< "Test case ReqdWGSizeNoLocalNegative failed: no exception has " | ||
"been thrown\n"; | ||
return 1; // We shouldn't be here, exception is expected | ||
} catch (nd_range_error &E) { | ||
if (string_class(E.what()).find( | ||
"OpenCL 1.x and 2.0 requires to pass local size argument even if " | ||
"required work-group size was specified in the program source") == | ||
string_class::npos) { | ||
std::cerr | ||
<< "Test case ReqdWGSizeNegativeB failed: unexpected exception: " | ||
<< "Test case ReqdWGSizeNoLocalNegative failed: unexpected " | ||
"nd_range_error exception: " | ||
<< E.what() << std::endl; | ||
return 1; | ||
} | ||
} catch (runtime_error &E) { | ||
std::cerr | ||
<< "Test case ReqdWGSizeNegativeB failed: unexpected exception: " | ||
<< "Test case ReqdWGSizeNoLocalNegative failed: unexpected " | ||
"runtime_error exception: " | ||
<< E.what() << std::endl; | ||
return 1; | ||
} catch (...) { | ||
std::cerr << "Test case ReqdWGSizeNegativeB failed: something unexpected " | ||
"has been caught" | ||
std::cerr << "Test case ReqdWGSizeNoLocalNegative failed: something " | ||
"unexpected has been caught" | ||
<< std::endl; | ||
return 1; | ||
} | ||
} else if (IsOpenCL) { | ||
// TODO: The behavior when OpenCL > 2.0 needs to be investigated. This | ||
// seems to fail differently than the case when OpenCL is 1.x or 2.0. | ||
} else { | ||
// Backends other than OpenCL | ||
// parallel_for, (16, 16, 16) global, null local, reqd_wg_size(4, 4, 4) | ||
// -> pass | ||
try { | ||
Q.submit([&](handler &CGH) { | ||
CGH.parallel_for<class ReqdWGSizeNoLocalPositive>( | ||
range<3>(16, 16, 16), [=](item<3>) { reqd_wg_size_helper(); }); | ||
}); | ||
Q.wait_and_throw(); | ||
} catch (nd_range_error &E) { | ||
std::cerr << "Test case ReqdWGSizeNoLocalPositive failed: unexpected " | ||
"nd_range_error exception: " | ||
<< E.what() << std::endl; | ||
return 1; | ||
} catch (runtime_error &E) { | ||
std::cerr << "Test case ReqdWGSizeNoLocalPositive failed: unexpected " | ||
"runtime_error exception: " | ||
<< E.what() << std::endl; | ||
return 1; | ||
} catch (...) { | ||
std::cerr << "Test case ReqdWGSizeNoLocalPositive failed: something " | ||
"unexpected has been caught" | ||
<< std::endl; | ||
return 1; | ||
} | ||
|
@@ -127,7 +167,7 @@ int main() { | |
return 1; | ||
} | ||
|
||
if (OCLVersion[0] == '1') { | ||
if (IsOpenCL && OCLVersion[0] == '1') { | ||
// OpenCL 1.x | ||
|
||
// CL_INVALID_WORK_GROUP_SIZE if local_work_size is specified and | ||
|
@@ -283,7 +323,7 @@ int main() { | |
|
||
// CL_INVALID_WORK_GROUP_SIZE if local_work_size is specified and the | ||
// total number of work-items in the work-group computed as | ||
// local_work_size[0] * ... * local_work_size[work_dim – 1] is greater | ||
// local_work_size[0] * ... * local_work_size[work_dim - 1] is greater | ||
// than the value specified by CL_DEVICE_MAX_WORK_GROUP_SIZE in | ||
// table 4.3 | ||
size_t MaxDeviceWGSize = D.get_info<info::device::max_work_group_size>(); | ||
|
@@ -317,13 +357,13 @@ int main() { | |
<< std::endl; | ||
return 1; | ||
} | ||
} else if (OCLVersion[0] == '2') { | ||
} else if (IsOpenCL && OCLVersion[0] == '2') { | ||
// OpenCL 2.x | ||
|
||
// OpenCL 2.x: | ||
// CL_INVALID_WORK_GROUP_SIZE if local_work_size is specified and the | ||
// total number of work-items in the work-group computed as | ||
// local_work_size[0] * ... * local_work_size[work_dim – 1] is greater | ||
// local_work_size[0] * ... * local_work_size[work_dim - 1] is greater | ||
// than the value specified by CL_KERNEL_WORK_GROUP_SIZE in table 5.21. | ||
{ | ||
program P(Q.get_context()); | ||
|
@@ -800,6 +840,8 @@ int main() { | |
return 1; | ||
} | ||
} | ||
} else { | ||
// TODO: Add tests for other backends | ||
} | ||
|
||
// local size has a 0-based range -- no SIGFPEs, we hope | ||
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First this comment is unclear since we do not know that it refers to 7.
OpenClVerStr.assign(OclVerSize, '\0');
: my eyes are bleeding... :-)I wonder whether it should be
OclVerSize + 1
.And why a
std::string
at the first place instead of a local array?Perhaps this kind of information should be cached in the
queue
or at least thedevice
?Perhaps the PI interface could be user-friendly?
While writing assembly code,
OpenClVer = OpenClVerStr + 7;
is simpler.More seriously this is a motivation to move to more modern
std::string_view
in the future...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these are good comments. I changed the OpenCL version checking here and in the other file to use the
string
class in a more normal way. I don't think we want to cache this information as that would spread OpenCL knowledge to even more parts of the code. Also, this is an error handling path, so performance isn't so critical.P.S.: Sorry about your eyes. :-)