Describe the bug
Linking against a static library containing SYCL kernels fails with a cryptic error when -fsycl-targets contains several architectures, e.g. -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64.
To Reproduce
The reproducer builds a static library libfoo.a containing some SYCL code, and a simple main.cpp file that calls a function from this library. It assumes that clang++ and llvm-ar point to IntelLLVM built with CUDA support.
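For readers without the Makefile at hand, the build steps can be sketched roughly as follows. This is a guess at the shape of the build, not the actual Makefile: the source file name foo.cpp, the output name main, and the compile-step flags are assumptions. Only the final link command mirrors the ones shown in the transcript. The script prints the commands by default, so it can be read or run without the toolchain installed.

```shell
#!/bin/sh
# Hypothetical reconstruction of the build steps (NOT the reporter's Makefile).
# Assumed names: foo.cpp, foo.o, main. Only the link line follows the report.
TARGETS="${TARGETS:-nvptx64-nvidia-cuda,spir64}"
DRY_RUN="${DRY_RUN:-1}"   # 1: only print the commands, 0: also execute them

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "$*"
    if [ "$DRY_RUN" = 0 ]; then
        "$@"
    fi
}

build_all() {
    # Compile the library source with device code for every requested target.
    run clang++ -fsycl -fsycl-targets="$TARGETS" -c foo.cpp -o foo.o
    # Pack the object into a static library.
    run llvm-ar rcs libfoo.a foo.o
    # Compile the host program that calls into the library.
    run clang++ -fsycl -fsycl-targets="$TARGETS" -c main.cpp -o main.o
    # Link step: this is the command that fails with more than one target.
    run clang++ -fsycl -fsycl-targets="$TARGETS" main.o libfoo.a -o main
}

build_all
```

Setting TARGETS=nvptx64-nvidia-cuda should correspond to the single-target case that works, and swapping the order of the two targets to the second failure.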
# If we specify one target when linking, everything works great:
# clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda main.o libfoo.a -o main-cuda
$ make main-cuda && ./main-cuda && echo ok
ok
# Now, let's try to include two targets:
# clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64 main.o libfoo.a -o main-cuda-spir
$ make main-cuda-spir
spirv-to-ir-wrapper: Input file '!<arch>' not found
llvm-foreach:
spirv-to-ir-wrapper: Input file '/ 0 0 0 0 220 `' not found
llvm-foreach:
make: *** [Makefile:11: main-cuda-spir] Error 1
# Now let's use the same two targets in different order:
# clang++ -fsycl -fsycl-targets=spir64,nvptx64-nvidia-cuda main.o libfoo.a -o main-spir-cuda
$ make main-spir-cuda
/home/aland/intel-sycl/llvm/build/install/bin/llvm-link: /tmp/libfoo-a6eb86.a:1:1: error: expected top-level entity
/tmp/libfoo-ff1c0c.o
^
/home/aland/intel-sycl/llvm/build/install/bin/llvm-link: error: loading file '/tmp/libfoo-a6eb86.a'
clang-14: error: sycl-link command failed with exit code 1 (use -v to see invocation)
make: *** [Makefile:14: main-spir-cuda] Error 1
# Not specifying any target leads to no kernels being bundled (despite the static library having them):
# clang++ -fsycl main.o libfoo.a -o main-none
$ make main-none && ./main-none && echo ok
terminate called after throwing an instance of 'cl::sycl::runtime_error'
what(): Native API failed. Native API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)
Aborted (core dumped)
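A note on the first error: the two "input file" names that spirv-to-ir-wrapper complains about look like the first two lines of a Unix ar archive, namely the global magic !<arch> followed by the header of the first member (the symbol table, named /, with size 220). That would be consistent with libfoo.a itself being read as a newline-separated list of file names, although this is an inference from the output and not something confirmed here. A minimal sketch that rebuilds those two lines by hand; the helper names are made up for illustration, and the mtime/uid/gid/mode fields are assumed to be 0, as the message suggests:

```shell
#!/bin/sh
# Sketch: the first two lines of a Unix ar archive, built by hand.
# Field widths follow the common ar header layout:
#   name(16) mtime(12) uid(6) gid(6) mode(8) size(10) end marker (backtick + newline)

# Global archive magic, always the first line of the file.
ar_magic() {
    printf '%s\n' '!<arch>'
}

# ar_member_header NAME SIZE: prints one member header line.
# mtime, uid, gid and mode are assumed to be 0 here.
ar_member_header() {
    printf '%-16s%-12s%-6s%-6s%-8s%-10s`\n' "$1" 0 0 0 0 "$2"
}

# Pretend to be a tool that treats every input line as a file name.
{ ar_magic; ar_member_header / 220; } | while IFS= read -r name; do
    echo "Input file '$name' not found"
done
```

Piping the header through tr -s ' ' squeezes the padding and gives / 0 0 0 0 220 `, which is the string in the second message above (the runs of spaces were presumably collapsed when the log was pasted).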
Environment (please complete the following information):
- OS: Ubuntu Linux 20.04
- Target device and vendor: NVIDIA GTX1060SUPER (no GPU is actually needed for the problem to be observed, but that's what I have).
- DPC++ version: clang version 14.0.0 (https://github.com/intel/llvm c878063)
- Dependencies version: CUDA 11.5
Additional context