[CUDA] CTS usm_atomic_access_atomic64 fails in cuda_piQueueFinish and cuda_piextUSMFree with unspecified launch failure #5210

Closed
@yuxianch

Describe the bug
The CTS test usm_atomic_access_atomic64 fails with the CUDA backend on Linux. The CUDA plugin reports CUDA_ERROR_LAUNCH_FAILED (unspecified launch failure) in cuda_piQueueFinish and again in cuda_piextUSMFree.

To Reproduce

  1. Get SYCL with CUDA support (see the Getting Started Guide) and SYCL-CTS
  2. Build CTS tests
cd SYCL-CTS
mkdir build && cd build
cmake -G "Ninja" -DSYCL_IMPLEMENTATION=Intel_SYCL -DINTEL_SYCL_ROOT=<path to built sycl> -DOpenCL_LIBRARY=<path to built sycl>/lib/libOpenCL.so -DOpenCL_INCLUDE_DIR=<path to built sycl>/include/sycl -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS_RELEASE="-fsycl -fsycl-unnamed-lambda -Wno-deprecated-declarations" -DINTEL_SYCL_FLAGS="-fsycl-device-code-split=per_kernel" -DCMAKE_EXE_LINKER_FLAGS=" -Wl,-no-relax " -DINTEL_SYCL_TRIPLE=nvptx64-nvidia-cuda -DSYCL_CTS_ENABLE_OPENCL_INTEROP_TESTS=Off -DSYCL_CTS_ENABLE_DOUBLE_TESTS=On -DSYCL_CTS_ENABLE_HALF_TESTS=On ..
ninja test_usm
  3. Run test
export SYCL_DEVICE_FILTER="CUDA:GPU"
export SYCL_ENABLE_HOST_DEVICE="1"
./bin/test_usm --device "NVIDIA.*CUDA.*NVIDIA"  --test usm_atomic_access_atomic64

Error message:

--- usm_atomic_access_atomic64

PI CUDA ERROR:
        Value:           719
        Name:            CUDA_ERROR_LAUNCH_FAILED
        Description:     unspecified launch failure
        Function:        cuda_piQueueFinish
        Source Location: /<path to built sycl>/sycl/plugins/cuda/pi_cuda.cpp:2244


PI CUDA ERROR:
        Value:           719
        Name:            CUDA_ERROR_LAUNCH_FAILED
        Description:     unspecified launch failure
        Function:        cuda_piextUSMFree
        Source Location: /<path to built sycl>/sycl/plugins/cuda/pi_cuda.cpp:4578

terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -999 (Unknown OpenCL error code) -999 (Unknown OpenCL error code)
Aborted (core dumped)

Environment

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+

Additional context
Backtrace:

terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -999 (Unknown OpenCL error code) -999 (Unknown OpenCL error code)

Thread 1 "test_usm" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7635859 in __GI_abort () at abort.c:79
#2  0x00007ffff7e5b911 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff7e6738c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff7e673f7 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x000000000040b8ab in __clang_call_terminate ()
#6  0x0000000000000004 in ?? ()
#7  0x0000000000426b04 in std::unique_ptr<long, usm_helper::allocate_usm_memory<(cl::sycl::usm::alloc)0, long>(cl::sycl::queue const&, unsigned long)::{lambda(long*)#1}>::~unique_ptr() ()
#8  0x0000000000425c98 in void usm_atomic_access::check_atomic_access<(cl::sycl::usm::alloc)0, long>(cl::sycl::queue&, sycl_cts::util::logger&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#9  0x0000000000425445 in void usm_atomic_access::run_test_with_chosen_mem_type<long>(cl::sycl::queue&, sycl_cts::util::logger&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#10 0x0000000000416938 in usm_atomic_access::run_all_tests<long>::operator()(cl::sycl::queue&, sycl_cts::util::logger&, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#11 0x000000000040c4db in void for_all_types<usm_atomic_access::run_all_tests, , int, unsigned int, long, unsigned long, float, double, long long, unsigned long long, cl::sycl::queue&, sycl_cts::util::logger&, bool const&>(named_type_pack<int, unsigned int, long, unsigned long, float, double, long long, unsigned long long> const&, cl::sycl::queue&, sycl_cts::util::logger&, bool const&) ()
#12 0x000000000040bdc5 in usm_atomic_access_atomic64__::usm_atomic_access_atomic64::run(sycl_cts::util::logger&) ()
#13 0x0000000000471b24 in sycl_cts::util::test_base::run_test(sycl_cts::util::logger&) ()
#14 0x0000000000471716 in sycl_cts::util::executor::run_all() ()
#15 0x000000000044cb39 in sycl_cts::util::test_manager::run() ()
#16 0x0000000000442830 in main ()
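
A note on why the process aborts instead of reporting a test failure: frames #4 to #7 show std::terminate being reached from the destructor of a std::unique_ptr whose deleter is a lambda created in usm_helper::allocate_usm_memory. Destructors are implicitly noexcept, so an exception that escapes the deleter cannot propagate to the test. The sketch below reproduces that pattern in standard C++ only. It assumes the deleter calls a free routine that throws once the device is in an error state; the CTS lambda body is not shown in this report, and failing_free, throwing_deleter, usm_ptr and free_and_report are hypothetical names used for illustration.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a USM free that reports a pending device error by
// throwing, similar to the runtime_error seen in the log above.
inline void failing_free(long *ptr) {
  delete ptr;  // release the host allocation used in this sketch
  throw std::runtime_error("Native API failed");
}

// Deleter in the style suggested by frame #7: a callable that forwards to the
// free routine. This is an assumption about the helper, not its actual code.
struct throwing_deleter {
  void operator()(long *ptr) const { failing_free(ptr); }
};

using usm_ptr = std::unique_ptr<long, throwing_deleter>;

// If a non-empty usm_ptr simply went out of scope, the deleter would run
// inside the noexcept destructor and the exception would end in
// std::terminate. Releasing ownership first and freeing inside try/catch
// keeps the error catchable. Returns the error text, or an empty string.
inline std::string free_and_report(usm_ptr &owner) {
  long *raw = owner.release();  // owner is now empty; its destructor is a no-op
  try {
    failing_free(raw);
  } catch (const std::runtime_error &e) {
    return e.what();
  }
  return std::string();
}
```

Calling free_and_report on a usm_ptr that owns an allocation returns the message text and leaves the pointer empty, whereas destroying the same usm_ptr without releasing it first would abort the process. This only explains the abort; the underlying launch failure is a separate problem.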

When building with "-O0 -g", the build fails while generating the test_usm binary:

: && clang++ -DSYCL2020_DISABLE_DEPRECATION_WARNINGS -ffp-model=precise  -fsycl -fsycl-unnamed-lambda -O0 -g -Wno-deprecated-declarations      -Wl,-no-relax    -fsycl -sycl-std=2020 -fsycl-device-code-split=per_kernel -fsycl-targets=nvptx64-nvidia-cuda tests/usm/CMakeFiles/test_usm_objects.dir/usm_atomic_access_atomic64.cpp.o tests/common/CMakeFiles/main_function_object.dir/main.cpp.o -o bin/test_usm  -Wl,-rpath,<path to built sycl>/lib util/libutil.a oclmath/liboclmath.a <path to built sycl>/lib/libOpenCL.so && :
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.cas.global.i.cta
llvm-foreach:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.global.i.cta
llvm-foreach:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.global.i.cta
llvm-foreach:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.global.i.cta
llvm-foreach:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.global.i.cta
llvm-foreach:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.cas.global.i.cta
llvm-foreach:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.global.i.cta
llvm-foreach:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.global.i.cta
llvm-foreach:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.global.i.cta
llvm-foreach:
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.global.i.cta
llvm-foreach:
clang++: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 14.0.0 (https://github.com/intel/llvm.git a068b1542912618085e548a0d95f2ef2bf13a174)

Labels: CTS (Impacts Khronos SYCL CTS), bug (Something isn't working), compiler (Compiler related issue), cuda (CUDA back-end), runtime (Runtime library related issue)
