Skip to content

[SYCL] Add DPC++ RT support for non-native SYCL 2020 spec constants #3589

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

dm-vodopyanov
Copy link
Contributor

This patch adds support of non-native SYCL 2020 specialization constants
to DPC++ runtime. Non-native specialization constants emulate the usage
of native specialization constants for AOT compilation and CUDA

This patch adds support of non-native SYCL 2020 specialization constants
to DPC++ runtime. Non-native specialization constants emulate the usage
of native specialization constants for AOT compilation and CUDA
@dm-vodopyanov dm-vodopyanov requested review from kbobrovs and a team as code owners April 21, 2021 14:03
@dm-vodopyanov dm-vodopyanov requested a review from rbegam April 21, 2021 14:03
@dm-vodopyanov
Copy link
Contributor Author

dm-vodopyanov commented Apr 21, 2021

This patch depends on #3561 and #3609. For now tested locally on Linux with workarounds and tests passed.

@dm-vodopyanov
Copy link
Contributor Author

@kbobrovs , @rbegam , please review.

@dm-vodopyanov
Copy link
Contributor Author

@alexbatashev, can you please review this feature?

smaslov-intel
smaslov-intel previously approved these changes Apr 26, 2021
Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for PI change

kbobrovs
kbobrovs previously approved these changes Apr 26, 2021
Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

program_manager.cpp LGTM, with one Nit

@@ -1309,7 +1309,8 @@ void ProgramManager::bringSYCLDeviceImagesToState(
break;
}
case bundle_state::executable:
// Device image is already in the desired state.
DevImage = build(DevImage, getSyclObjImpl(DevImage)->get_devices(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I assume this build call is optionally needed to do native device code linking? Why not call to link then? Please add a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build is needed here to create device image which contain spec constants; as device image is in executable state because of AOT, build instead of link (object state) is used.

alexbatashev
alexbatashev previously approved these changes Apr 30, 2021
Copy link
Contributor

@alexbatashev alexbatashev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bader
Copy link
Contributor

bader commented Apr 30, 2021

@dm-vodopyanov, please, address pre-commit testing failures.

@dm-vodopyanov
Copy link
Contributor Author

@dm-vodopyanov, please, address pre-commit testing failures.

Pre-commit test failures should be resolved after merging of #3609, which now have some merge conflict.

smaslov-intel
smaslov-intel previously approved these changes Apr 30, 2021
@@ -1,4 +1,4 @@
// REQUIRES: ocloc, gpu
// REQUIRES: opencl, ocloc, gpu
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did yo filter our Level Zero?

Copy link
Contributor Author

@dm-vodopyanov dm-vodopyanov Apr 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should be built in AOT mode only, so Level Zero should not be here because it don't have such support.
The reason I explicitly added opencl here is that for some reason this test ran on Windows Buildbot for Level Zero only (http://ci.llvm.intel.com:8010/#/builders/18/builds/11898) - but it shouldn't because we have similar classic AOT GPU test here https://github.com/intel/llvm-test-suite/blob/intel/SYCL/AOT/gpu.cpp which doesn't run on Windows on Level Zero. Could be some failure in LIT infra for Windows.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AOT mode only, so Level Zero should not be here because it don't have such support.

Can you elaborate? What exactly do you think is not supported in Level Zero?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-native spec constants are available currently only when user is compiling code in AOT mode: for OpenCL CPU/GPU/ACC and CUDA. Level Zero doesn't have ahead-of-time compilation support, only JITting, that's why this test for GPU device requires OpenCL only; for GPU device with CUDA backend this PR introduces another separate test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Level Zero doesn't have ahead-of-time compilation support

Do you mean SYCL RT running over Level Zero backend doesn't support AOT?
Is there anything missing in Level Zero itself to support AOT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I had outdated information regarding Level Zero. Now I see that it really has a support of AOT when I compiled some example on Linux with Level Zero in AOT mode. I will remove opencl from the test and check CI again.

BTW, #3609 patch was merged, so currently checks for Linux and Windows are green in CI.

Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the test case that this is fixing? (Looks like a big change to me)

@dm-vodopyanov
Copy link
Contributor Author

What is the test case that this is fixing? (Looks like a big change to me)

@smaslov-intel, could you please elaborate? Not quite understand.

@smaslov-intel
Copy link
Contributor

What is the test case that this is fixing? (Looks like a big change to me)

@smaslov-intel, could you please elaborate? Not quite understand.

I am asking specifically about your latest change 022503a

Why is this needed, and how is this tested?

@dm-vodopyanov
Copy link
Contributor Author

Why is this needed, and how is this tested?

In sycl/source/detail/scheduler/commands.cpp we need to pass a buffer (piMem) containing data about spec constants to kernel arguments of a kernel. Before, I used piKernelSetArg func because it suits OpenCL abd CUDA (it accepts piMem). For Level Zero, piKernelSetArg has different interface and doesn't suit our needs (like accepting piMem), so for that, there is a function piextKernelSetArgMemObj which accepts piMem for Level Zero, and just calls piKernelSetArg inside for OpenCL and CUDA.
After changing to piextKernelSetArgMemObj, non-native/gpu.cpp test passed - it runs on Level Zero too.

Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

program_manager.cpp LGTM

@@ -57,6 +57,10 @@ handler::getOrInsertHandlerKernelBundle(bool Insert) const {
if (!KernelBundleImpPtr && Insert) {
KernelBundleImpPtr = detail::getSyclObjImpl(
get_kernel_bundle<bundle_state::input>(MQueue->get_context()));
if (KernelBundleImpPtr->empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT. It would be nice to have a comment explaining this logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll submit some comments as a separate PR.

@@ -442,7 +442,11 @@ class kernel_bundle_impl {
return SetInDevImg || MSpecConstValues.count(std::string{SpecName}) != 0;
}

const device_image_plain *begin() const { return &MDeviceImages.front(); }
const device_image_plain *begin() const {
assert(!MDeviceImages.empty() && "MDeviceImages can't be empty");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please clarify why MDeviceImages can't be empty?
I believe this should behave as std::vector which has end() == begin() if empty() is true.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that this is not a valid assert. I'll submit a fix as a separate pull request.

Copy link
Contributor Author

@dm-vodopyanov dm-vodopyanov May 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MDeviceImage can't be empty because MDeviceImages.front() is UB in case of MDeviceImages.empty() == true.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dm-vodopyanov it's UB to access front, but it doesn't mean, that kernel_bundle must have any device image at all. The spec mentions empty() member function for kernel_bundle, which @romanovvlad refers to: https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_the_kernel_bundle_class

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants