Skip to content

[SYCL][CUDA] sycl::atomic_ref usage on CUDA backend produces linking error #5647

Closed
@krasznaa

Description

@krasznaa

Describe the bug

In the past I've made use of sycl::atomic for atomic operations in our code, which used to behave well on all backends that I tried. (Including the CUDA and HIP ones.) Now that the latest nightlies warn about sycl::atomic being deprecated, I tried to teach our code to use sycl::atomic_ref instead when it is available.

But when I do that, building the code for an NVIDIA backend fails like:

ptxas fatal   : Unresolved extern function '_Z18__spirv_AtomicIAddPiN5__spv5Scope4FlagENS0_19MemorySemanticsMask4FlagEi'
llvm-foreach: 
clang-15: error: ptxas command failed with exit code 255 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm.git 64e92cbc41d1bcde9c728798e4e1fed9e3fab253)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).

To Reproduce

Take the following example:

#include <CL/sycl.hpp>
#include <cassert>
#include <iostream>

//#define USE_ATOMIC_REF

int main() {

  // Set up the queue.
  sycl::queue queue;
  std::cout << "Running on device: " << queue.get_device().get_info<sycl::info::device::name>() << std::endl;

  // Set up the array.
  const std::size_t ARRAY_SIZE = 100;
  int* mem = static_cast<int*>(sycl::malloc_shared(ARRAY_SIZE * sizeof(int), queue));
  for (std::size_t i = 0; i < ARRAY_SIZE; ++i) {
    mem[i] = 0;
  }

  // Modify the array atomically on a device.
  queue.submit([mem, ARRAY_SIZE](sycl::handler& h) {
		 h.parallel_for<class atomic_test>(sycl::range<1>(ARRAY_SIZE),
						   [mem](sycl::id<1> id) {
#ifdef USE_ATOMIC_REF
						     sycl::atomic_ref<int, sycl::memory_order::relaxed,
								      sycl::memory_scope::device> aref(mem[id]);
						     aref.fetch_add(1);
#else
						     sycl::atomic_fetch_add<int>(sycl::atomic<int>(sycl::global_ptr<int>(mem + id)), 1);
#endif
						   });
	       }).wait_and_throw();

  // Check the array's payload.
  for (std::size_t i = 0; i < ARRAY_SIZE; ++i) {
    assert(mem[i] == 1);
  }

  // Finish up.
  std::cout << "All OK!" << std::endl;
  sycl::free(mem, queue);
  return 0;
}

In this exact form (without USE_ATOMIC_REF being defined) it builds and runs correctly, albeit with a good number of warnings.

[bash][Legolas]:sycl > clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -g atomic_ref_error.cpp 
atomic_ref_error.cpp:30:46: warning: 'atomic<int, sycl::access::address_space::global_space>' is deprecated: sycl::atomic is deprecated since SYCL 2020 [-Wdeprecated-declarations]
                                                     sycl::atomic_fetch_add<int>(sycl::atomic<int>(sycl::global_ptr<int>(mem + id)), 1);
                                                                                       ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/atomic.hpp:171:7: note: 'atomic<int, sycl::access::address_space::global_space>' has been explicitly marked deprecated here
class __SYCL2020_DEPRECATED(
      ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:54:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
                                       ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:45:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
                                     ^
warning: linking module '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas 'atomic_ref_error.cpp' is 'nvptx64-nvidia-cuda'
 [-Wlinker-warnings]
2 warnings generated.
atomic_ref_error.cpp:30:46: warning: 'atomic<int, sycl::access::address_space::global_space>' is deprecated: sycl::atomic is deprecated since SYCL 2020 [-Wdeprecated-declarations]
                                                     sycl::atomic_fetch_add<int>(sycl::atomic<int>(sycl::global_ptr<int>(mem + id)), 1);
                                                                                       ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/atomic.hpp:171:7: note: 'atomic<int, sycl::access::address_space::global_space>' has been explicitly marked deprecated here
class __SYCL2020_DEPRECATED(
      ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:54:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
                                       ^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:45:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
                                     ^
1 warning generated.
[bash][Legolas]:sycl > SYCL_DEVICE_FILTER=CUDA ./a.out 
Running on device: NVIDIA GeForce RTX 3080
All OK!
[bash][Legolas]:sycl >

But if I un-comment the line defining USE_ATOMIC_REF, the build fails with:

[bash][Legolas]:sycl > clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -g atomic_ref_error.cpp 
warning: linking module '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas 'atomic_ref_error.cpp' is 'nvptx64-nvidia-cuda'
 [-Wlinker-warnings]
1 warning generated.
ptxas fatal   : Unresolved extern function '_Z18__spirv_AtomicIAddPiN5__spv5Scope4FlagENS0_19MemorySemanticsMask4FlagEi'
llvm-foreach: 
clang-15: error: ptxas command failed with exit code 255 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm.git 64e92cbc41d1bcde9c728798e4e1fed9e3fab253)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).
[bash][Legolas]:sycl >

I tried a few different arguments for the memory order and memory scope template arguments, but that didn't seem to make much of a difference.

Environment (please complete the following information)

  • OS: Ubuntu 20.04
  • Target device and vendor: NVIDIA GPU
  • DPC++ version:
[bash][Legolas]:sycl > clang++ -v
clang version 15.0.0 (https://github.com/intel/llvm.git 64e92cbc41d1bcde9c728798e4e1fed9e3fab253)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
Found CUDA installation: /home/krasznaa/software/cuda/11.5.2/x86_64-ubuntu2004, version 11.5
Found HIP installation: /opt/rocm, version 4.2.21155-37cb3a34
[bash][Legolas]:sycl >
  • Dependencies version: N/A

Pinging @fwyzard and @ivorobts.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingcompilerCompiler related issuecudaCUDA back-end

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions