[Offload] Offload to NVIDIA GPUs fails with CUDA 13.0 or newer (LLVM 20.1.8)

I'm still trying to figure out what exactly goes wrong, but maybe someone has an idea.

With LLVM 20.1.8 (a bootstrapped build done with EasyBuild), any program built with OpenMP offload, e.g. via `-fopenmp --offload-arch=sm_75`, fails at run time with:

```
omptarget error: Consult https://openmp.llvm.org/design/Runtimes.html for debugging options.
omptarget error: No images found compatible with the installed hardware.
[1]    14988 segmentation fault (core dumped)  OMP_TARGET_OFFLOAD=mandatory ./zaxpy
```
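For illustration, here is a minimal sketch of the kind of kernel involved. It is not the exact `zaxpy` source: the function name, the interleaved real/imaginary layout, and the test values are placeholders chosen for this example. It uses a `target data` region because the backtrace enters the runtime through `__tgt_target_data_begin_mapper`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real test program: computes y = a * x + y
// on complex data stored as interleaved (re, im) pairs and returns the
// largest absolute deviation from the expected result.
double zaxpy_max_error(std::size_t n) {
  assert(n > 0 && "need at least one element");
  const double a_re = 2.0, a_im = 1.0;  // a = 2 + i
  std::vector<double> x(2 * n), y(2 * n);
  for (std::size_t i = 0; i < n; ++i) {
    x[2 * i] = 1.0; x[2 * i + 1] = 0.0;  // x = 1
    y[2 * i] = 0.0; y[2 * i + 1] = 1.0;  // y = i
  }
  double *xp = x.data();
  double *yp = y.data();

  // Map the buffers once, then launch the kernel inside the data region.
  // Without -fopenmp the pragmas are ignored and the loop runs on the host.
  #pragma omp target data map(to: xp[0:2 * n]) map(tofrom: yp[0:2 * n])
  {
    #pragma omp target teams distribute parallel for
    for (std::size_t i = 0; i < n; ++i) {
      const double xr = xp[2 * i], xi = xp[2 * i + 1];
      yp[2 * i]     += a_re * xr - a_im * xi;  // real part of a * x
      yp[2 * i + 1] += a_re * xi + a_im * xr;  // imaginary part of a * x
    }
  }

  // Expected value of every element: (2 + i) * 1 + i = 2 + 2i.
  double max_err = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    max_err = std::max(max_err, std::fabs(y[2 * i] - 2.0));
    max_err = std::max(max_err, std::fabs(y[2 * i + 1] - 2.0));
  }
  return max_err;
}
```

Calling `zaxpy_max_error` from `main`, building with `-fopenmp --offload-arch=sm_75`, and running with `OMP_TARGET_OFFLOAD=mandatory` should go through the same runtime entry point as in the report.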

Looking closer with GDB, the stack trace looks like this:

```gdb
Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".
omptarget error: Consult https://openmp.llvm.org/design/Runtimes.html for debugging options.
omptarget error: No images found compatible with the installed hardware. 
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff5286ccf in llvm::object::ELFObjectFileBase::getNVPTXCPUName() const () from /opt/EasyBuild/apps/software/LLVM/20.1.8-GCCcore-14.3.0/lib/libLLVM.so.20.1
(gdb) bt
#0  0x00007ffff5286ccf in llvm::object::ELFObjectFileBase::getNVPTXCPUName() const () from /opt/EasyBuild/apps/software/LLVM/20.1.8-GCCcore-14.3.0/lib/libLLVM.so.20.1
#1  0x00007ffff5286c53 in llvm::object::ELFObjectFileBase::tryGetCPUName() const () from /opt/EasyBuild/apps/software/LLVM/20.1.8-GCCcore-14.3.0/lib/libLLVM.so.20.1
#2  0x00007ffff7a9cca1 in handleTargetOutcome(bool, ident_t*) () from /opt/EasyBuild/apps/software/LLVM/20.1.8-GCCcore-14.3.0/lib/x86_64-unknown-linux-gnu/libomptarget.so.20.1
#3  0x00007ffff7a97f43 in checkDevice(long&, ident_t*) () from /opt/EasyBuild/apps/software/LLVM/20.1.8-GCCcore-14.3.0/lib/x86_64-unknown-linux-gnu/libomptarget.so.20.1
#4  0x00007ffff7a984e0 in void targetData<AsyncInfoTy>(ident_t*, long, int, void**, void**, long*, long*, void**, void**, int (*)(ident_t*, DeviceTy&, int, void**, void**, long*, long*, void**, void**, AsyncInfoTy&, bool), char const*, char const*) ()
   from /opt/EasyBuild/apps/software/LLVM/20.1.8-GCCcore-14.3.0/lib/x86_64-unknown-linux-gnu/libomptarget.so.20.1
#5  0x00007ffff7a980c4 in __tgt_target_data_begin_mapper () from /opt/EasyBuild/apps/software/LLVM/20.1.8-GCCcore-14.3.0/lib/x86_64-unknown-linux-gnu/libomptarget.so.20.1
#6  0x000055555555ae7f in main ()
```
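The crash happens inside `getNVPTXCPUName()`, which, as far as I can tell, derives the architecture name from the `e_flags` field of the device image's ELF header. Comparing the header fields of images built with CUDA 12.9.1 and CUDA 13.0 might therefore narrow things down. Below is a rough sketch of such a check. The struct and function names are made up for this example, extracting the device image from the host binary is left out, and treating the low 8 bits of `e_flags` as the SM number is an assumption based on LLVM's `EF_CUDA_SM` mask; whether CUDA 13 images still follow that layout is exactly what is unknown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Fields of interest from an ELF64 little-endian header (hypothetical helper type).
struct ElfHeaderInfo {
  std::uint8_t abi_version;  // e_ident[EI_ABIVERSION], byte 8
  std::uint16_t machine;     // e_machine, offset 18 (EM_CUDA is 190)
  std::uint32_t flags;       // e_flags, offset 48 in an ELF64 header
};

// Reads `bytes` little-endian bytes starting at `offset`.
static std::uint32_t read_le(const std::vector<std::uint8_t> &buf,
                             std::size_t offset, std::size_t bytes) {
  assert(offset + bytes <= buf.size());
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < bytes; ++i)
    value |= static_cast<std::uint32_t>(buf[offset + i]) << (8 * i);
  return value;
}

// Returns the header fields, or nothing if the buffer is not a
// little-endian ELF64 image.
std::optional<ElfHeaderInfo> read_elf64_header(const std::vector<std::uint8_t> &buf) {
  if (buf.size() < 52) return std::nullopt;
  if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
    return std::nullopt;
  if (buf[4] != 2 /* ELFCLASS64 */ || buf[5] != 1 /* ELFDATA2LSB */)
    return std::nullopt;
  ElfHeaderInfo info;
  info.abi_version = buf[8];
  info.machine = static_cast<std::uint16_t>(read_le(buf, 18, 2));
  info.flags = read_le(buf, 48, 4);
  return info;
}

// ASSUMPTION: SM number in the low 8 bits, matching LLVM's EF_CUDA_SM = 0xff.
// This may not hold for images produced by CUDA 13.
unsigned legacy_sm_from_flags(std::uint32_t flags) { return flags & 0xffu; }
```

If the value returned by `legacy_sm_from_flags` differs between the two toolkits for the same `--offload-arch`, that would point at the flag decoding rather than at the image lookup itself.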

With CUDA 12.9.1 or earlier, everything works. So far, only CUDA 13.0.0 and 13.0.1 seem to be affected.
I haven't tried LLVM 21.1.1 yet, mostly because the only machine I can test this on takes quite long to build LLVM.

It's also worth noting that one can build the application with an older CUDA and then run with the newer one. The other way around also fails. Maybe some change between these major versions causes issues. There were the announced [ELF visibility and linkage changes](https://developer.nvidia.com/blog/cuda-c-compiler-updates-impacting-elf-visibility-and-linkage/), but as far as I understand, these only affect `nvcc`. The driver itself should be recent enough (580.65.06).

LLVM 20.1.8 is particularly interesting because, for example, Numba will support that particular version soon, while CUDA 13 is interesting for its better support of recent GPUs. Mixing LLVM versions would be a noticeable inconvenience.

I'll now try to get a version of LLVM 21 built for cross-checking. Maybe the issue is already resolved and I just haven't found the correct PR for that yet.


[Offload] Offload to NVIDIA GPUs fails with CUDA 13.0 or newer (LLVM 20.1.8) #159088
