Description
Hi,
Since #5496, I get segmentation faults when using sycl::get_kernel_bundle.
The following code works as expected:
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <sycl/sycl.hpp>

// Returns the device work-group size limit clamped to [1, 4096], further
// reduced by the kernel-specific limit when it can be queried.
template<typename KernelName> static inline size_t max_work_groups_for_kernel(sycl::queue q) {
    size_t max_items = std::max(1U, std::min(4096U, static_cast<uint32_t>(q.get_device().get_info<sycl::info::device::max_work_group_size>())));
#if defined(SYCL_IMPLEMENTATION_INTEL) || defined(SYCL_IMPLEMENTATION_ONEAPI)
    try {
        sycl::kernel_id id = sycl::get_kernel_id<KernelName>();
        // The segfault reported below happens inside this get_kernel_bundle call.
        auto kernel = sycl::get_kernel_bundle<sycl::bundle_state::executable>(q.get_context()).get_kernel(id);
        max_items = std::min(max_items, kernel.get_info<sycl::info::kernel_device_specific::work_group_size>(q.get_device()));
    } catch (const std::exception& e) {
        std::cout << "Couldn't read kernel properties for device: " << q.get_device().get_info<sycl::info::device::name>() << " got exception: " << e.what() << std::endl;
    }
#endif
    return max_items;
}

class my_kernel;

int main() {
    sycl::queue q{};
    q.parallel_for<my_kernel>(sycl::range(100), [=](sycl::item<1> it) { (void) it.get_id(); }).wait();
    std::cout << max_work_groups_for_kernel<my_kernel>(q) << std::endl;
}
However, when the same code is used in another project (https://github.com/Michoumichmich/MolecularDynamics), it systematically fails. That project builds several backends as shared libraries, all of which depend on SYCL. As discussed in KhronosGroup/SYCL-Docs#209, each "kernel query" is (properly?) compiled in the same translation unit as the kernels it refers to, and the shared libraries are compiled with -fvisibility=hidden. With this setup everything works fine before 4817b3f (and on the "commercial" version of oneAPI too). Since 4817b3f, using the code above anywhere in main.cpp or in a shared library systematically results in a segfault; the backtrace is given further down.
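For readers unfamiliar with that layout, here is a minimal sketch of how one backend is organised, written without SYCL so it compiles anywhere. Everything in it is an assumption made for illustration: the `BACKEND_EXPORT` macro, the `backend_max_work_groups` entry point, and the `query_kernel_limit` placeholder (with its made-up limit of 256) are hypothetical names, not code from the project. The point is only that the kernel name and its query live in one translation unit, and a single symbol is exported from the library.

```cpp
// backend.cpp: sketch of one backend built as a shared library with
// -fvisibility=hidden. All names below are hypothetical.
#include <algorithm>
#include <cstddef>

// With -fvisibility=hidden, only symbols marked like this are exported
// (GCC/Clang attribute syntax).
#define BACKEND_EXPORT __attribute__((visibility("default")))

namespace {

// Kernel name type, private to this translation unit.
class my_kernel;

// Placeholder for the per-kernel query. The real code would call
// sycl::get_kernel_id<KernelName>() and sycl::get_kernel_bundle here.
template <typename KernelName>
std::size_t query_kernel_limit(std::size_t device_limit) {
    const std::size_t kernel_limit = 256;  // made-up value for illustration
    return std::min(device_limit, kernel_limit);
}

}  // namespace

// The only exported entry point. The query template is instantiated in the
// same translation unit that declares the kernel name.
BACKEND_EXPORT std::size_t backend_max_work_groups(std::size_t device_limit) {
    // Same clamping as in the reproducer: keep the device limit in [1, 4096].
    const std::size_t clamped =
        std::max<std::size_t>(1, std::min<std::size_t>(4096, device_limit));
    return query_kernel_limit<my_kernel>(clamped);
}
```

The main executable would link against this library and call only `backend_max_work_groups`; the kernel name and the query template never cross the library boundary.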
To Reproduce
To build and run the reproducer:
git clone https://github.com/Michoumichmich/MolecularDynamics -b sycl_kernel_bundle_segfault_reproducer
cd MolecularDynamics && mkdir -p build && cd build
CXX=clang++ cmake .. -DSYCL=CPU && cmake --build . --target main
SYCL_DEVICE_FILTER=cpu ./main
It seems to affect at least the CPU and CUDA backends.
And the backtrace:
(gdb) bt
#0 0x00007ffff6dfbe3a in cl::sycl::detail::ProgramManager::getSYCLDeviceImagesWithCompatibleState(cl::sycl::context const&, std::vector<cl::sycl::device, std::allocator<cl::sycl::device> > const&, cl::sycl::bundle_state, std::vector<cl::sycl::kernel_id, std::allocator<cl::sycl::kernel_id> > const&) () from /home/michel/sycl_workspace/deploy/lib/libsycl.so.5
#1 0x00007ffff6e00419 in cl::sycl::detail::ProgramManager::getSYCLDeviceImages(cl::sycl::context const&, std::vector<cl::sycl::device, std::allocator<cl::sycl::device> > const&, cl::sycl::bundle_state) ()
from /home/michel/sycl_workspace/deploy/lib/libsycl.so.5
#2 0x00007ffff6e8e8df in cl::sycl::detail::kernel_bundle_impl::kernel_bundle_impl(cl::sycl::context, std::vector<cl::sycl::device, std::allocator<cl::sycl::device> >, cl::sycl::bundle_state) ()
from /home/michel/sycl_workspace/deploy/lib/libsycl.so.5
#3 0x00007ffff6e8e6b7 in void __gnu_cxx::new_allocator<cl::sycl::detail::kernel_bundle_impl>::construct<cl::sycl::detail::kernel_bundle_impl, cl::sycl::context const&, std::vector<cl::sycl::device, std::allocator<cl::sycl::device> > const&, cl::sycl::bundle_state&>(cl::sycl::detail::kernel_bundle_impl*, cl::sycl::context const&, std::vector<cl::sycl::device, std::allocator<cl::sycl::device> > const&, cl::sycl::bundle_state&) ()
from /home/michel/sycl_workspace/deploy/lib/libsycl.so.5
#4 0x00007ffff6e87475 in cl::sycl::detail::get_kernel_bundle_impl(cl::sycl::context const&, std::vector<cl::sycl::device, std::allocator<cl::sycl::device> > const&, cl::sycl::bundle_state) ()
from /home/michel/sycl_workspace/deploy/lib/libsycl.so.5
#5 0x0000000000407628 in cl::sycl::kernel_bundle<(cl::sycl::bundle_state)2> cl::sycl::get_kernel_bundle<(cl::sycl::bundle_state)2>(cl::sycl::context const&, std::vector<cl::sycl::device, std::allocator<cl::sycl::device> > const&) ()
#6 0x000000000040734e in cl::sycl::kernel_bundle<(cl::sycl::bundle_state)2> cl::sycl::get_kernel_bundle<(cl::sycl::bundle_state)2>(cl::sycl::context const&) ()
Environment:
- OS: Red Hat Enterprise Linux 8
- Target device and vendor: all?
- DPC++ version: beb7277