-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Add DPC++ RT support for SYCL 2020 spec constants (part 1) #3382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Add DPC++ RT support for SYCL 2020 spec constants (part 1) #3382
Conversation
This patch adds partial implementation of specialization constants in DPC++ RT: 1. Implementation of `specialization_id` class 2. Implementation of `kernel_handler` class 3. Support for user's device lambdas which take `kernel_handler` argument
T getSpecializationConstantOnDevice() { | ||
const char *SymbolicID = __builtin_unique_stable_name( | ||
detail::specialization_id_name_generator<S>); | ||
return __sycl_getScalar2020SpecConstantValue<T>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
T getSpecializationConstantOnDevice() { | ||
const char *SymbolicID = __builtin_unique_stable_name( | ||
detail::specialization_id_name_generator<S>); | ||
return __sycl_getComposite2020SpecConstantValue<T>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sycl/include/CL/sycl/handler.hpp
Outdated
|
||
// TODO: replace detail::RangeRoundedLambda with | ||
// "constexpr if" when DPC++ RT switched to C++17 | ||
detail::RangeRoundedLambda<TransformedArgType, KernelType, Dims> Wrapper{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rdeodhar, please review from range rounding point-of-view
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function containing the range rounding code, "void parallel_for_lambda_impl(range NumWorkItems, KernelType KernelFunc) {...}" is called with a KernelFunc.
KernelFunc may have a kernel_handler parameter. If it does, the wrapped kernel should be declared with that additional parameter, (say, KH), and the body of the wrapper should pass it to the original kernel, as KernelFunc(Arg, KH);
I'm not sure how you'll do it.
Ran post-commit locally on Linux with #3345 and #3353 patches: no failures during build and check-sycl:
|
Ran post-commit locally on Windows with #3345 and #3353 patches: no failures during build, and one unrelated to this patch failure in
|
Ping @ everyone |
}; | ||
|
||
template <typename F, typename RetT, typename... Args> | ||
struct check_fn_signature<F, RetT(Args...)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved from handler.hpp
as is, except check_kernel_lambda_takes_args
- this is a new type trait
} | ||
MKernel(ID); | ||
}); | ||
detail::NDLoop<Dims>::iterate( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-format, the only change - runKernelWithArg... line
|
||
#ifdef __SYCL_DEVICE_ONLY__ | ||
template <auto &S, typename T = std::remove_reference_t<decltype(S)>, | ||
std::enable_if_t<std::is_scalar_v<T>>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to use is_fundamental here?
@AlexeySachkov
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made is_fundamental. @AlexeySachkov, can you please review if is_fundamental
and is_compaund
are correct type traits here?
sycl/test/on-device/basic_tests/specialization_constants/kernel_handler.cpp
Outdated
Show resolved
Hide resolved
// additional kernel_handler argument. | ||
|
||
// NOTE: to support kernel_handler argument in kernel lambdas, only | ||
// kernel_***_wrapper functions must be called in this code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition - only kernel_***_wrapper functions must be called in this code
- does not hold. kernel_handler
constructor is also called. Can it be a problem if the constructor becomes non-trivial?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition - only kernel_***_wrapper functions must be called in this code - does not hold
Ca you please elaborate on this a bit? What I meant by this is that we shouldn't call, e.g., kernel_parallel_for
in our headers anywhere because we need some logic above it to decide what version of kernel_parallel_for
to run - with or w/o kernel_handler
support. And we need some place to initialize kernel_handler
obj on our side. In these cases kernel_parallel_for_wrapper
acts as this place.
Can it be a problem if the constructor becomes non-trivial?
It can if something will be changed in spec, in this case initialization of kernel_handler
object should be re-written here. Right now we have separate init function __init_specialization_constants_buffer
which is called in FE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see. This is not quite clear from the comments - I'd suggest to reword for better clarity.
My guess was that this was related more to parallel_for_work_group, where variables local to a kernel become WG-shared (address space 3 global in LLVM IR), and there is code filtering for the WG leader
work item.
// This guard is needed because the libsycl.so can compiled with C++ <=14 | ||
// while the code requires C++17. This code is not supposed to be used by the | ||
// libsycl.so so it should not be a problem. | ||
#if __cplusplus > 201402L |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: spec const 2020 design is not settled yet, so this code may need to be changed/removed in future. + @AlexeySachkov, @gmlueck for possible comments.
...est/on-device/basic_tests/specialization_constants/kernel_lambda_with_kernel_handler_arg.cpp
Show resolved
Hide resolved
Jenkins/Precommit failure: tests compile with c++14, removed the flag and created PR: intel/llvm-test-suite#202 |
This patch adds partial implementation of specialization constants in DPC++ RT:
specialization_id
classkernel_handler
classkernel_handler
argument