-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Add DPC++ RT support for non-native SYCL 2020 spec constants #3589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Add DPC++ RT support for non-native SYCL 2020 spec constants #3589
Conversation
This patch adds support of non-native SYCL 2020 specialization constants to DPC++ runtime. Non-native specialization constants emulate the usage of native specialization constants for AOT compilation and CUDA
@alexbatashev, can you please review this feature? |
sycl/test/on-device/basic_tests/specialization_constants/non_native/Inputs/common.cpp
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for PI change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
program_manager.cpp LGTM, with one Nit
@@ -1309,7 +1309,8 @@ void ProgramManager::bringSYCLDeviceImagesToState( | |||
break; | |||
} | |||
case bundle_state::executable: | |||
// Device image is already in the desired state. | |||
DevImage = build(DevImage, getSyclObjImpl(DevImage)->get_devices(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I assume this build call is optionally needed to do native device code linking? Why not call to link
then? Please add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build is needed here to create device image which contain spec constants; as device image is in executable state because of AOT, build instead of link (object state) is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@dm-vodopyanov, please, address pre-commit testing failures. |
Pre-commit test failures should be resolved after merging of #3609, which now have some merge conflict. |
8e944f5
… builbot it runs with L0
665ace5
to
2d85e62
Compare
@@ -1,4 +1,4 @@ | |||
// REQUIRES: ocloc, gpu | |||
// REQUIRES: opencl, ocloc, gpu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did yo filter our Level Zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should be built in AOT mode only, so Level Zero should not be here because it don't have such support.
The reason I explicitly added opencl here is that for some reason this test ran on Windows Buildbot for Level Zero only (http://ci.llvm.intel.com:8010/#/builders/18/builds/11898) - but it shouldn't because we have similar classic AOT GPU test here https://github.com/intel/llvm-test-suite/blob/intel/SYCL/AOT/gpu.cpp which doesn't run on Windows on Level Zero. Could be some failure in LIT infra for Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AOT mode only, so Level Zero should not be here because it don't have such support.
Can you elaborate? What exactly do you think is not supported in Level Zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-native spec constants are available currently only when user is compiling code in AOT mode: for OpenCL CPU/GPU/ACC and CUDA. Level Zero doesn't have ahead-of-time compilation support, only JITting, that's why this test for GPU device requires OpenCL only; for GPU device with CUDA backend this PR introduces another separate test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Level Zero doesn't have ahead-of-time compilation support
Do you mean SYCL RT running over Level Zero backend doesn't support AOT?
Is there anything missing in Level Zero itself to support AOT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I had outdated information regarding Level Zero. Now I see that it really has a support of AOT when I compiled some example on Linux with Level Zero in AOT mode. I will remove opencl
from the test and check CI again.
BTW, #3609 patch was merged, so currently checks for Linux and Windows are green in CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the test case that this is fixing? (Looks like a big change to me)
@smaslov-intel, could you please elaborate? Not quite understand. |
I am asking specifically about your latest change 022503a Why is this needed, and how is this tested? |
In sycl/source/detail/scheduler/commands.cpp we need to pass a buffer (piMem) containing data about spec constants to kernel arguments of a kernel. Before, I used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
program_manager.cpp LGTM
@@ -57,6 +57,10 @@ handler::getOrInsertHandlerKernelBundle(bool Insert) const { | |||
if (!KernelBundleImpPtr && Insert) { | |||
KernelBundleImpPtr = detail::getSyclObjImpl( | |||
get_kernel_bundle<bundle_state::input>(MQueue->get_context())); | |||
if (KernelBundleImpPtr->empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT. It would be nice to have a comment explaining this logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll submit some comments as a separate PR.
@@ -442,7 +442,11 @@ class kernel_bundle_impl { | |||
return SetInDevImg || MSpecConstValues.count(std::string{SpecName}) != 0; | |||
} | |||
|
|||
const device_image_plain *begin() const { return &MDeviceImages.front(); } | |||
const device_image_plain *begin() const { | |||
assert(!MDeviceImages.empty() && "MDeviceImages can't be empty"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify why MDeviceImages can't be empty?
I believe this should behave as std::vector
which has end() == begin()
if empty()
is true
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that this is not a valid assert. I'll submit a fix as a separate pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MDeviceImage
can't be empty because MDeviceImages.front()
is UB in case of MDeviceImages.empty() == true
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dm-vodopyanov it's UB to access front
, but it doesn't mean, that kernel_bundle must have any device image at all. The spec mentions empty()
member function for kernel_bundle
, which @romanovvlad refers to: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_the_kernel_bundle_class
This patch adds support of non-native SYCL 2020 specialization constants
to DPC++ runtime. Non-native specialization constants emulate the usage
of native specialization constants for AOT compilation and CUDA