[SYCL][CUDA] Fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.shared.i.cta 

**Describe the bug**
After https://github.com/intel/llvm/pull/4820 was merged, compilation for the CUDA backend fails with the error "Cannot select: intrinsic %llvm.nvvm.atomic.add.shared.i.cta":

```
fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.shared.i.cta
llvm-foreach:
clang++: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 14.0.0 (https://github.com/intel/llvm.git c855fd177a88d9fc6db87bc2a6c8001f0398ec9a)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: .../build/bin
clang++: note: diagnostic msg: Error generating preprocessed source(s).
```

**To Reproduce**

Compile the following example for the CUDA backend:

```c++
#include <sycl/sycl.hpp>

int main() {
  constexpr static size_t buffer_size = 4;
  sycl::queue stream;
  sycl::buffer<int, 1> buffer(buffer_size);
  stream.submit(
    [&](sycl::handler &cgh) {
        sycl::accessor<int, 1, sycl::access_mode::read_write, sycl::access::target::local> acc(sycl::range<1>(buffer_size), cgh);
        auto acc_write = buffer.get_access<sycl::access::mode::write>(cgh);

        cgh.parallel_for<class demo_kernel>(buffer_size, [=](sycl::id<1> item) {
          sycl::atomic<int, sycl::access::address_space::local_space>(sycl::local_ptr<int>(acc.get_pointer())).fetch_add(1);         
        });
    });

  return 0;
}
```

using the following command:

```
clang++ -fsycl -fsycl-unnamed-lambda -fsycl-targets=nvptx64-nvidia-cuda -O2 example.cpp -o example  -lOpenCL
```

Alternatively, pass `-nocudalib`; this is enough to reproduce the error:

```
clang++ -fsycl -fsycl-unnamed-lambda -fsycl-targets=nvptx64-nvidia-cuda -O2 example.cpp -o example  -lOpenCL -nocudalib
```

The error described above (`fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.shared.i.cta`) occurs, whereas compilation is expected to succeed.

**Environment (please complete the following information):**

- OS: Ubuntu 20.04
- Target device and vendor: GPU: TITAN RTX
- DPC++ version: https://github.com/intel/llvm/commit/c855fd177a88d9fc6db87bc2a6c8001f0398ec9a
- Dependencies version: NVIDIA-SMI 470.57.02, CUDA 11.4

**Additional context**

**Potential guilty commit**
commit 2ebde5f6011ea48f118219ae6fa8811f94e7acc7
Author: Tadej Ciglarič <tadej.ciglaric@codeplay.com>
Date:   Thu Nov 18 05:45:46 2021 +0100

[SYCL][CUDA][libclc] Added atomics with scopes and memory orders (#4820)

Added libclc implementations for CUDA atomics, including for various scopes and memory orders. They are implemented using LLVM intrinsics and exposed as clang builtins, which are then used to implement functions in libclc.

