Description
Describe the bug:
I get the following errors when compiling code with an interop call to cublasSgemm using DPC++ llvm version 20230125-1321 (8c244de). The same code works with 20221115-1002 (a3e93e0). I think this bug is related to __CUDA_ARCH__ being defined by DPC++, which causes some of the intrinsic conversion functions (__float_as_uint and __int_as_float in the errors below) not to be declared by the CUDA include files.
Compilation errors:
In file included from sycl_sgemm.cpp:31:
In file included from /usr/local/cuda/include/cublas_v2.h:65:
In file included from /usr/local/cuda/include/cublas_api.h:76:
In file included from /usr/local/cuda/include/cuda_bf16.h:3745:
/usr/local/cuda/include/cuda_bf16.hpp:373:9: error: use of undeclared identifier '__float_as_uint'; did you mean '__imf_float_as_uint'?
x = __float_as_uint(f);
In file included from sycl_sgemm.cpp:31:
In file included from /usr/local/cuda/include/cublas_v2.h:65:
In file included from /usr/local/cuda/include/cublas_api.h:76:
In file included from /usr/local/cuda/include/cuda_bf16.h:3745:
/usr/local/cuda/include/cuda_bf16.hpp:397:9: error: use of undeclared identifier '__float_as_uint'; did you mean '__imf_float_as_uint'?
u = __float_as_uint(f);
In file included from sycl_sgemm.cpp:31:
In file included from /usr/local/cuda/include/cublas_v2.h:65:
In file included from /usr/local/cuda/include/cublas_api.h:76:
In file included from /usr/local/cuda/include/cuda_bf16.h:3745:
/usr/local/cuda/include/cuda_bf16.hpp:417:9: error: use of undeclared identifier '__int_as_float'; did you mean '__imf_int_as_float'?
f = __int_as_float(static_cast<int>(u));
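One way to check the __CUDA_ARCH__ hypothesis is a small probe placed above the cublas_v2.h include in sycl_sgemm.cpp. The sketch below is only an illustration: the names kCudaArchDefined, kCudaArchValue, kSyclDevicePass and print_macro_state are hypothetical, and it assumes the compiler honours #pragma message (clang and gcc do). If the hypothesis holds, the message should appear during the pass that includes the CUDA headers with __CUDA_ARCH__ set.

```cpp
#include <cstdio>

// Probe: record whether __CUDA_ARCH__ is visible in the current compilation
// pass. All names below are hypothetical helpers, not part of DPC++ or CUDA.
#if defined(__CUDA_ARCH__)
#pragma message("__CUDA_ARCH__ is defined in this compilation pass")
constexpr bool kCudaArchDefined = true;
constexpr int kCudaArchValue = __CUDA_ARCH__;
#else
constexpr bool kCudaArchDefined = false;
constexpr int kCudaArchValue = 0;
#endif

// __SYCL_DEVICE_ONLY__ is the macro DPC++ sets during the device pass.
#if defined(__SYCL_DEVICE_ONLY__)
#pragma message("__SYCL_DEVICE_ONLY__ is defined in this compilation pass")
constexpr bool kSyclDevicePass = true;
#else
constexpr bool kSyclDevicePass = false;
#endif

// Prints the state seen by the pass that compiled this function. When called
// from host code, this reflects the host pass only; the #pragma message lines
// above are what reveal the device pass.
inline void print_macro_state() {
  std::printf("__CUDA_ARCH__ defined: %s (value %d)\n",
              kCudaArchDefined ? "yes" : "no", kCudaArchValue);
  std::printf("__SYCL_DEVICE_ONLY__ defined: %s\n",
              kSyclDevicePass ? "yes" : "no");
}
```

Building the file with each of the two DPC++ versions and comparing which passes emit the messages would show whether the macro's visibility changed between them. Calling print_macro_state() from main additionally reports what the host pass saw.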
Code to reproduce
Using the sycl_sgemm interop example code from https://github.com/codeplaysoftware/SYCL-For-CUDA-Examples/blob/master/examples/sgemm_interop/sycl_sgemm.cpp
Commands used to compile:
Command used with the 20221115-1002 (a3e93e0) version, which works:
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -Xsycl-target-backend --cuda-gpu-arch=sm_61 -O2 sycl_sgemm.cpp -std=c++17 -o sgemm_sycl -std=c++17 -L/usr/local/cuda/lib64 -lcublas -lcudart -lcuda
Command used with the 20230125-1321 (8c244de) version, which fails:
clang++ -fsycl -fsycl-targets=nvidia_gpu_sm_61 -Xsycl-target-backend --cuda-gpu-arch=sm_61 -O2 sycl_sgemm.cpp -std=c++17 -L/usr/local/cuda/lib64 -lcublas -lcudart -lcuda -o sgemm_sycl
Environment (please complete the following information):
- OS: Ubuntu 20.04
- Target device and vendor: NVIDIA GTX 1060, NVIDIA A100, NVIDIA H100
- DPC++ version: 8c244de
- Dependencies version: CUDA 11.7