Level Zero adapter segfaults if older intel-level-zero-gpu is installed but no device is available #18746

Open
@hvdijk

Describe the bug

Level Zero adapter segfaults if older intel-level-zero-gpu is installed but no device is available.

To reproduce

Command output omitted for readability where not relevant:

$ docker run -it debian:trixie
# cd
# apt update
# apt install -y build-essential cmake git libhwloc-dev ninja-build python3 spirv-tools
# git clone --depth 1 --single-branch https://github.com/intel/llvm
# cd llvm
# python3 buildbot/configure.py
# cd build
# ninja
# LD_LIBRARY_PATH=lib bin/sycl-ls
No platforms found - run with '--verbose' to get more details.
# apt install -y wget
# wget https://github.com/intel/intel-graphics-compiler/releases/download/igc-1.0.15770.11/intel-igc-core_1.0.15770.11_amd64.deb https://github.com/intel/compute-runtime/releases/download/23.52.28202.14/intel-level-zero-gpu_1.3.28202.14_amd64.deb
# apt install -y ./intel-level-zero-gpu_1.3.28202.14_amd64.deb ./intel-igc-core_1.0.15770.11_amd64.deb
# LD_LIBRARY_PATH=lib bin/sycl-ls
Segmentation fault (core dumped)
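
When scripting the steps above, it can help to detect the crash from the exit status rather than from the printed message. The following is a minimal sketch, not part of the original report: `run_and_classify` is a hypothetical helper name, and it assumes a Linux shell where a process killed by SIGSEGV (signal 11) is reported as exit status 128 + 11 = 139.

```shell
# Hypothetical helper (an assumption, not from the report): run a command,
# discard its output, and classify its exit status.
run_and_classify() {
    "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 139 ]; then
        # 128 + 11: the shell's encoding of "killed by SIGSEGV" on Linux
        echo "segfault"
    elif [ "$status" -eq 0 ]; then
        echo "ok"
    else
        echo "exit $status"
    fi
}

# Possible usage from the build directory used in the steps above:
# run_and_classify env LD_LIBRARY_PATH=lib bin/sycl-ls
```

With the behaviour described in this report, the helper would be expected to report a normal exit before the intel-level-zero-gpu package is installed and a segfault afterwards.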

Environment

  • OS: Linux
  • Target device and vendor: N/A
  • DPC++ version: 732a9ce
  • Dependencies version: output of sycl-ls --verbose:
<LOADER>[INFO]: The adapter 'libur_adapter_level_zero_v2.so.0' is skipped because UR_LOADER_USE_LEVEL_ZERO_V2 or SYCL_UR_USE_LEVEL_ZERO_V2 is not set.
<LOADER>[INFO]: failed to load adapter 'libur_adapter_cuda.so.0' with error: libur_adapter_cuda.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter '/root/llvm/build/lib/libur_adapter_cuda.so.0' with error: /root/llvm/build/lib/libur_adapter_cuda.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter 'libur_adapter_hip.so.0' with error: libur_adapter_hip.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter '/root/llvm/build/lib/libur_adapter_hip.so.0' with error: /root/llvm/build/lib/libur_adapter_hip.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: loaded adapter 0x0x557496a1a560 (libur_adapter_level_zero.so.0) from lib/libur_adapter_level_zero.so.0
<LOADER>[INFO]: failed to load adapter 'libur_adapter_native_cpu.so.0' with error: libur_adapter_native_cpu.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter '/root/llvm/build/lib/libur_adapter_native_cpu.so.0' with error: /root/llvm/build/lib/libur_adapter_native_cpu.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter 'libur_adapter_offload.so.0' with error: libur_adapter_offload.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: failed to load adapter '/root/llvm/build/lib/libur_adapter_offload.so.0' with error: /root/llvm/build/lib/libur_adapter_offload.so.0: cannot open shared object file: No such file or directory
<LOADER>[INFO]: loaded adapter 0x0x557496a1f130 (libur_adapter_opencl.so.0) from lib/libur_adapter_opencl.so.0
Segmentation fault (core dumped)

Additional context

In principle, libur_adapter_level_zero.so and sycl-ls are fine with there being no devices. However, as soon as libze_intel_gpu.so.1 is installed, an attempt is made to load it and then to get rid of the driver again when no devices are seen, and that is where things go wrong.

Labels: bug, level-zero, unified-runtime