Description
I'm trying to compile the stencil3d-omp benchmark of HeCBench: https://github.com/zjin-lcf/HeCBench/blob/master/src/stencil3d-omp/main.cpp
I'm using LLVM version 19.1.3 and offloading to an AMD MI100 GPU.
If I compile the code with -O3, everything works and the results match those from the SYCL and HIP versions:
make CC=clang++ CFLAGS="-fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx908"
If I compile the code without any optimization flags, compilation succeeds, but at runtime I get the following error:
AMDGPU fatal error 1: Memory access fault by GPU 8 (agent 0x55f222c19fb0) at virtual address 0x7f8c9c06a000. Reasons: Page not present or supervisor privilege Aborted (core dumped)
Lastly, if I compile the code with -O0, -O1 or -O2, the compiler itself segfaults: O0_compilation_output.txt
The only difference I was able to find between the O0 and O3 versions is that the O0 version launches the OpenMP kernels in generic mode, while the O3 version launches them in generic-SPMD mode. Could this be the reason for the crash?
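For anyone unfamiliar with the two execution modes, here is a minimal sketch of the kind of source pattern that, as far as I understand, leads to each one. It is not taken from stencil3d-omp; the function names `scale_combined`, `scale_split` and `kernels_agree` are made up for illustration. My assumption is that a fully combined construct is launched in SPMD mode directly, while a `target teams distribute` region with sequential per-team code and a nested `parallel for` starts out as a generic-mode kernel that the optimizer may or may not convert to SPMD.

```cpp
#include <cassert>
#include <vector>

// Hypothetical kernel A: one combined construct. There is no sequential
// code between the team level and the thread level, so (assumption) the
// kernel can be launched in SPMD mode directly.
void scale_combined(float *a, int n, float s) {
  #pragma omp target teams distribute parallel for map(tofrom: a[0:n])
  for (int i = 0; i < n; ++i)
    a[i] *= s;
}

// Hypothetical kernel B: split constructs. Each team runs a bit of
// sequential code (computing `row`) before opening a nested parallel
// region, which (assumption) makes this a generic-mode kernel unless
// the optimizer manages to SPMD-ize it.
void scale_split(float *a, int rows, int cols, float s) {
  #pragma omp target teams distribute map(tofrom: a[0:rows * cols])
  for (int r = 0; r < rows; ++r) {
    float *row = a + r * cols; // sequential, per-team work
    #pragma omp parallel for
    for (int c = 0; c < cols; ++c)
      row[c] *= s;
  }
}

// Runs both kernels on identical input and compares the results.
// Without -fopenmp the pragmas are ignored and both loops run on the host.
bool kernels_agree() {
  const int rows = 4, cols = 8, n = rows * cols;
  std::vector<float> x(n), y(n);
  for (int i = 0; i < n; ++i)
    x[i] = y[i] = static_cast<float>(i);
  scale_combined(x.data(), n, 2.0f);
  scale_split(y.data(), rows, cols, 2.0f);
  return x == y;
}
```

Calling `kernels_agree()` from `main` and building it with the same offload flags as above, with and without -O3, might help narrow down whether the execution mode alone is enough to trigger the fault. Adding `-Rpass=openmp-opt -Rpass-missed=openmp-opt` to the compile line should also make clang report which kernels it did or did not convert to SPMD mode.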