Description
Describe the bug
The Level Zero adapter segfaults if an older intel-level-zero-gpu is installed but no device is available.
To reproduce
Command output omitted for readability where not relevant:
$ docker run -it debian:trixie
# cd
# apt update
# apt install -y build-essential cmake git libhwloc-dev ninja-build python3 spirv-tools
# git clone --depth 1 --single-branch https://github.com/intel/llvm
# cd llvm
# python3 buildbot/configure.py
# cd build
# ninja
# LD_LIBRARY_PATH=lib bin/sycl-ls
No platforms found - run with '--verbose' to get more details.
# apt install -y wget
# wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.15770.11/intel-igc-core_1.0.15770.11_amd64.deb https://github.com/intel/compute-runtime/releases/download/23.52.28202.14/intel-level-zero-gpu_1.3.28202.14_amd64.deb
# apt install -y ./intel-level-zero-gpu_1.3.28202.14_amd64.deb ./intel-igc-core_1.0.15770.11_amd64.deb
# LD_LIBRARY_PATH=lib bin/sycl-ls
Segmentation fault (core dumped)
Environment
- OS: Linux
- Target device and vendor: N/A
- DPC++ version: 732a9ce
- Dependencies version: output of sycl-ls --verbose:
<LOADER>[INFO]: The adapter 'libur_adapter_level_zero_v2.so.0' is skipped because UR_LOADER_USE_LEVEL_ZERO_V2 or SYCL_UR_USE_LEVEL_ZERO_V2 is not set.
<LOADER>[INFO]: failed to load adapter 'libur_adapter_cuda.so.0' with error: libur_adapter_cuda.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter '/root/llvm/build/lib/libur_adapter_cuda.so.0' with error: /root/llvm/build/lib/libur_adapter_cuda.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter 'libur_adapter_hip.so.0' with error: libur_adapter_hip.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter '/root/llvm/build/lib/libur_adapter_hip.so.0' with error: /root/llvm/build/lib/libur_adapter_hip.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: loaded adapter 0x0x557496a1a560 (libur_adapter_level_zero.so.0) from lib/libur_adapter_level_zero.so.0
<LOADER>[INFO]: failed to load adapter 'libur_adapter_native_cpu.so.0' with error: libur_adapter_native_cpu.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter '/root/llvm/build/lib/libur_adapter_native_cpu.so.0' with error: /root/llvm/build/lib/libur_adapter_native_cpu.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter 'libur_adapter_offload.so.0' with error: libur_adapter_offload.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter '/root/llvm/build/lib/libur_adapter_offload.so.0' with error: /root/llvm/build/lib/libur_adapter_offload.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: loaded adapter 0x0x557496a1f130 (libur_adapter_opencl.so.0) from lib/libur_adapter_opencl.so.0
Segmentation fault (core dumped)
Additional context
In principle, libur_adapter_level_zero.so and sycl-ls are fine with there being no devices. However, as soon as libze_intel_gpu.so.1 is installed, the adapter attempts to load it and then get rid of the driver when no devices are seen, and that is where it goes wrong.
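To check whether another machine or container is in the same state (driver library installed, no device visible), something like the following sketch may help. The helper names are hypothetical and not part of intel/llvm. It assumes that the GPU driver finds devices through DRM render nodes under /dev/dri, and that the installed library is registered in the dynamic linker cache; neither assumption has been verified against this exact package version.

```shell
#!/bin/sh
# Hypothetical helpers, not part of intel/llvm or compute-runtime.

# Succeeds if at least one DRM render node exists in the given directory
# (default: /dev/dri). Assumption: devices are discovered via these nodes.
has_render_node() {
    dir="${1:-/dev/dri}"
    for node in "$dir"/renderD*; do
        # If the glob matched nothing, $node is the literal pattern and -e fails.
        [ -e "$node" ] && return 0
    done
    return 1
}

# Succeeds if the linker cache lists the Level Zero GPU driver library.
# Assumption: the .deb installs into a directory that ldconfig indexes.
has_ze_gpu_driver() {
    ldconfig -p 2>/dev/null | grep -q 'libze_intel_gpu\.so\.1'
}

if has_ze_gpu_driver && ! has_render_node; then
    echo "driver installed, no render node: matches the configuration in this report"
else
    echo "does not match the configuration in this report"
fi
```

In the container from the reproduction steps, no devices are passed through (no --device option is given to docker run), so the first branch would be expected once the intel-level-zero-gpu package is installed.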