[Clang][AMDGPU] Add builtins for instrinsic llvm.amdgcn.raw.ptr.buffer.store #94576

Merged
merged 1 commit into from
Jun 25, 2024

Conversation

shiltian
Contributor

@shiltian shiltian commented Jun 6, 2024

Depends on #96313.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jun 6, 2024
@llvmbot
Member

llvmbot commented Jun 6, 2024

@llvm/pr-subscribers-clang-modules
@llvm/pr-subscribers-debuginfo
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Shilei Tian (shiltian)

Changes

Patch is 25.58 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/94576.diff

4 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+14)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+23)
  • (added) clang/test/CodeGenOpenCL/builtins-amdgcn-raw-buffer-store.cl (+264)
  • (added) clang/test/SemaOpenCL/builtins-amdgcn-raw-buffer-store-error.cl (+94)
diff --git a/clang/include/clang/Basic/BuiltinsAMDGPU.def b/clang/include/clang/Basic/BuiltinsAMDGPU.def
index 433c7795325f0..d6866304f8b1b 100644
--- a/clang/include/clang/Basic/BuiltinsAMDGPU.def
+++ b/clang/include/clang/Basic/BuiltinsAMDGPU.def
@@ -148,6 +148,20 @@ BUILTIN(__builtin_amdgcn_qsad_pk_u16_u8, "WUiWUiUiWUi", "nc")
 BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi", "nc")
 BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc")
 
+BUILTIN(__builtin_amdgcn_raw_buffer_store_i8, "vcV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_i16, "vsV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_i32, "viV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_f16, "vhV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_f32, "vfV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_v2i16, "vV2sV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_v2i32, "vV2iV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_v2f16, "vV2hV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_v2f32, "vV2fV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_v4i16, "vV4sV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_v4i32, "vV4iV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_v4f16, "vV4hV4iIiIiIi", "n")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_v4f32, "vV4fV4iIiIiIi", "n")
+
 //===----------------------------------------------------------------------===//
 // Ballot builtins.
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 37d0c478e0330..706141cbc85ce 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -19063,6 +19063,29 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
         CGM.getIntrinsic(Intrinsic::amdgcn_s_sendmsg_rtn, {ResultType});
     return Builder.CreateCall(F, {Arg});
   }
+
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_i8:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_i16:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_i32:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_f32:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_f16:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_v2i16:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_v2i32:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_v2f16:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_v2f32:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_v4i16:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_v4i32:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_v4f16:
+  case AMDGPU::BI__builtin_amdgcn_raw_buffer_store_v4f32: {
+    llvm::Value *VData = EmitScalarExpr(E->getArg(0));
+    llvm::Value *Rsrc = EmitScalarExpr(E->getArg(1));
+    llvm::Value *Offset = EmitScalarExpr(E->getArg(2));
+    llvm::Value *SOffset = EmitScalarExpr(E->getArg(3));
+    llvm::Value *Aux = EmitScalarExpr(E->getArg(4));
+    Function *F =
+        CGM.getIntrinsic(Intrinsic::amdgcn_raw_buffer_store, VData->getType());
+    return Builder.CreateCall(F, {VData, Rsrc, Offset, SOffset, Aux});
+  }
   default:
     return nullptr;
   }
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-raw-buffer-store.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn-raw-buffer-store.cl
new file mode 100644
index 0000000000000..0553c53ab41ad
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-raw-buffer-store.cl
@@ -0,0 +1,264 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -emit-llvm -o - %s | FileCheck %s --check-prefixes=VERDE
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu tonga -emit-llvm -o - %s | FileCheck %s --check-prefixes=GFX8
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu gfx1100 -emit-llvm -o - %s | FileCheck %s --check-prefixes=GFX11
+
+#pragma OPENCL EXTENSION cl_khr_fp16 : enable
+
+typedef short v2i16 __attribute__((ext_vector_type(2)));
+typedef int v2i32 __attribute__((ext_vector_type(2)));
+typedef half v2f16 __attribute__((ext_vector_type(2)));
+typedef float v2f32 __attribute__((ext_vector_type(2)));
+typedef short v4i16 __attribute__((ext_vector_type(4)));
+typedef int v4i32 __attribute__((ext_vector_type(4)));
+typedef half v4f16 __attribute__((ext_vector_type(4)));
+typedef float v4f32 __attribute__((ext_vector_type(4)));
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_i8(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.i8(i8 [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_i8(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.i8(i8 [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_i8(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.i8(i8 [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_i8(char vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_i8(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_i16(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.i16(i16 [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_i16(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.i16(i16 [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_i16(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.i16(i16 [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_i16(short vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_i16(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_i32(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.i32(i32 [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_i32(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.i32(i32 [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_i32(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.i32(i32 [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_i32(int vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_i32(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_f16(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.f16(half [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_f16(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.f16(half [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_f16(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.f16(half [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_f16(half vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_f16(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_f32(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.f32(float [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_f32(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.f32(float [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_f32(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.f32(float [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_f32(float vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_f32(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_v2i16(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2i16(<2 x i16> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_v2i16(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2i16(<2 x i16> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_v2i16(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2i16(<2 x i16> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_v2i16(v2i16 vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_v2i16(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_v2i32(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2i32(<2 x i32> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_v2i32(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2i32(<2 x i32> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_v2i32(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2i32(<2 x i32> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_v2i32(v2i32 vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_v2i32(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_v2f16(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2f16(<2 x half> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_v2f16(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2f16(<2 x half> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_v2f16(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2f16(<2 x half> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_v2f16(v2f16 vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_v2f16(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_v2f32(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2f32(<2 x float> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_v2f32(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2f32(<2 x float> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_v2f32(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v2f32(<2 x float> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_v2f32(v2f32 vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_v2f32(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_v4i16(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4i16(<4 x i16> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_v4i16(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4i16(<4 x i16> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_v4i16(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4i16(<4 x i16> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_v4i16(v4i16 vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_v4i16(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_v4i32(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4i32(<4 x i32> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_v4i32(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4i32(<4 x i32> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_v4i32(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4i32(<4 x i32> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_v4i32(v4i32 vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_v4i32(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_v4f16(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4f16(<4 x half> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_v4f16(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4f16(<4 x half> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_v4f16(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4f16(<4 x half> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_v4f16(v4f16 vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_v4f16(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
+// VERDE-LABEL: @test_amdgcn_raw_buffer_store_v4f32(
+// VERDE-NEXT:  entry:
+// VERDE-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4f32(<4 x float> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// VERDE-NEXT:    ret void
+//
+// GFX8-LABEL: @test_amdgcn_raw_buffer_store_v4f32(
+// GFX8-NEXT:  entry:
+// GFX8-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4f32(<4 x float> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX8-NEXT:    ret void
+//
+// GFX11-LABEL: @test_amdgcn_raw_buffer_store_v4f32(
+// GFX11-NEXT:  entry:
+// GFX11-NEXT:    tail call void @llvm.amdgcn.raw.buffer.store.v4f32(<4 x float> [[VDATA:%.*]], <4 x i32> [[RSRC:%.*]], i32 0, i32 0, i32 0)
+// GFX11-NEXT:    ret void
+//
+void test_amdgcn_raw_buffer_store_v4f32(v4f32 vdata, v4i32 rsrc) {
+  __builtin_amdgcn_raw_buffer_store_v4f32(vdata, rsrc, /*offset=*/0, /*soffset=*/0, /*aux=*/0);
+}
+
diff --git a/clang/test/SemaOpenCL/builtins-amdgcn-raw-buffer-store-error.cl b/clang/test/SemaOpenCL/builtins-amdgcn-raw-buffer-store-error.cl
new file mode 100644
index 0000000000000..33e20d6dc771a
--- /dev/null
+++ b/clang/test/SemaOpenCL/builtins-amdgcn-raw-buffer-store-error.cl
@@ -0,0 +1,94 @@
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -S -verify -o - %s
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu tonga -S -verify -o - %s
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu gfx1100 -S -verify -o - %s
+// REQUIRES: amdgpu-registered-target
+
+#pragma OPENCL EXTENSION cl_khr_fp16 : enable
+
+typedef short v2i16 __attribute__((ext_vector_type(2)));
+typedef int v2i32 __attribute__((ext_vector_type(2)));
+typedef half v2f16 __attribute__((ext_vector_type(2)));
+typedef float v2f32 __attribute__((ext_vector_type(2)));
+typedef short v4i16 __attribute__((ext_vector_type(4)));
+typedef int v4i32 __attribute__((ext_vector_type(4)));
+typedef half v4f16 __attribute__((ext_vector_type(4)));
+typedef float v4f32 __attribute__((ext_vector_type(4)));
+
+void test_amdgcn_raw_buffer_store_i8(char vdata, v4i32 rsrc, int offset, int soffset, int aux) {
+  __builtin_amdgcn_raw_buffer_store_i8(vdata, rsrc, offset, /*soffset=*/0, /*aux=*/0); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_i8' must be a constant integer}}
+  __builtin_amdgcn_raw_buffer_store_i8(vdata, rsrc, /*offset=*/0, soffset, /*aux=*/0); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_i8' must be a constant integer}}
+  __builtin_amdgcn_raw_buffer_store_i8(vdata, rsrc, /*offset=*/0, /*soffset=*/0, aux); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_i8' must be a constant integer}}
+}
+
+void test_amdgcn_raw_buffer_store_i16(short vdata, v4i32 rsrc, int offset, int soffset, int aux) {
+  __builtin_amdgcn_raw_buffer_store_i16(vdata, rsrc, offset, /*soffset=*/0, /*aux=*/0); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_i16' must be a constant integer}}
+  __builtin_amdgcn_raw_buffer_store_i16(vdata, rsrc, /*offset=*/0, soffset, /*aux=*/0); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_i16' must be a constant integer}}
+  __builtin_amdgcn_raw_buffer_store_i16(vdata, rsrc, /*offset=*/0, /*soffset=*/0, aux); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_i16' must be a constant integer}}
+}
+
+void test_amdgcn_raw_buffer_store_i32(int vdata, v4i32 rsrc, int offset, int soffset, int aux) {
+  __builtin_amdgcn_raw_buffer_store_i32(vdata, rsrc, offset, /*soffset=*/0, /*aux=*/0); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_i32' must be a constant integer}}
+  __builtin_amdgcn_raw_buffer_store_i32(vdata, rsrc, /*offset=*/0, soffset, /*aux=*/0); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_i32' must be a constant integer}}
+  __builtin_amdgcn_raw_buffer_store_i32(vdata, rsrc, /*offset=*/0, /*soffset=*/0, aux); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_i32' must be a constant integer}}
+}
+
+void test_amdgcn_raw_buffer_store_f16(half vdata, v4i32 rsrc, int offset, int soffset, int aux) {
+  __builtin_amdgcn_raw_buffer_store_f16(vdata, rsrc, offset, /*soffset=*/0, /*aux=*/0); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_f16' must be a constant integer}}
+  __builtin_amdgcn_raw_buffer_store_f16(vdata, rsrc, /*offset=*/0, soffset, /*aux=*/0); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_f16' must be a constant integer}}
+  __builtin_amdgcn_raw_buffer_store_f16(vdata, rsrc, /*offset=*/0, /*soffset=*/0, aux); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_f16' must be a constant integer}}
+}
+
+void test_amdgcn_raw_buffer_store_f32(float vdata, v4i32 rsrc, int offset, int soffset, int aux) {
+  __builtin_amdgcn_raw_buffer_store_f32(vdata, rsrc, offset, /*soffset=*/0, /*aux=*/0); //expected-error{{argument to '__builtin_amdgcn_raw_buffer_store_f32' must be a constant integer}}
+  __builtin_amdgcn_raw_buffer_store_f32(vdata, rsrc, /*offset=*/0, soffset, /*aux=*/0); //expected-error{{argument to '__builtin_amdg...
[truncated]

@shiltian shiltian requested review from arsenm and yxsamliu June 6, 2024 05:47
@jayfoad
Contributor

jayfoad commented Jun 6, 2024

Is there really a good use case for this? Can you use regular stores to addrspace(7) instead? @krzysz00

Also, do you really need a separate builtin for every legal type, or is there some way they can be type-overloaded?
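As a rough illustration of what a type-overloaded entry point could look like from the caller's side, here is a minimal host-side sketch. Everything in it is an assumption made for illustration: `Rsrc`, `last_variant`, and the stub bodies are hypothetical stand-ins, not the Clang builtins themselves, and only three of the thirteen element types are covered.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for the <4 x i32> buffer resource operand (hypothetical host type).
using Rsrc = std::array<std::int32_t, 4>;

// Records which variant ran, so the dispatch is observable on the host.
static std::string last_variant;

// Hypothetical stubs named after three of the per-type builtins in the patch.
static void raw_buffer_store_i8(std::int8_t, const Rsrc &) { last_variant = "i8"; }
static void raw_buffer_store_i32(std::int32_t, const Rsrc &) { last_variant = "i32"; }
static void raw_buffer_store_f32(float, const Rsrc &) { last_variant = "f32"; }

// One overloaded name; the static type of the data operand selects the variant.
static void raw_buffer_store(std::int8_t v, const Rsrc &r) { raw_buffer_store_i8(v, r); }
static void raw_buffer_store(std::int32_t v, const Rsrc &r) { raw_buffer_store_i32(v, r); }
static void raw_buffer_store(float v, const Rsrc &r) { raw_buffer_store_f32(v, r); }
```

Calling `raw_buffer_store(1.5f, rsrc)` would pick the `f32` stub without the caller spelling the type in the name. Overloading on an argument works for stores; a load would have to overload on the return type, which plain C++ overloading cannot express.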

@arsenm
Contributor

arsenm commented Jun 6, 2024

Is there really a good use case for this? Can you use regular stores to addrspace(7) instead? @krzysz00

I see these regularly used via inline asm in various ML code. We need to expose these in some way to stop people from doing that

Also, do you really need a separate builtin for every legal type, or is there some way they can be type-overloaded?

Yes, I imagined we would handle images similar to the elementwise intrinsics. However, I don't think that approach works for loads. If we have to have overloads for loads, we probably should mirror it for stores.

I think it makes more sense to solve the issue for the load case before the stores. They're a bit more complicated because you have the sign vs. zero extended cases to consider, and the overload would be on the return type
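To make the sign-extended versus zero-extended distinction concrete, the following host-side sketch widens the same stored byte in both ways. The function names are hypothetical and the code models only the arithmetic, not any buffer instruction or builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read one byte and widen it to 32 bits with zero extension.
std::uint32_t load_u8_zext(const unsigned char *buf, std::size_t off) {
  return static_cast<std::uint32_t>(buf[off]);
}

// Read the same byte and widen it to 32 bits with sign extension.
std::int32_t load_i8_sext(const unsigned char *buf, std::size_t off) {
  std::int8_t v;
  std::memcpy(&v, buf + off, sizeof(v)); // reinterpret the byte as signed
  return static_cast<std::int32_t>(v);
}
```

For a stored byte of `0xFF`, the zero-extended result is 255 and the sign-extended result is -1, so a load interface needs some way to say which one is wanted. A store has no such choice to make.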

@shiltian shiltian force-pushed the buffer-load branch 4 times, most recently from 6f7374e to d520ea7 Compare June 6, 2024 13:49
@krzysz00
Contributor

krzysz00 commented Jun 6, 2024

Re addrspace 7, there's one major piece of work missing: arbitrary-typed inputs.
That is, we can't currently handle, for example, load <16 x i8>, ptr addrspace(7) %p (or, worse, load i256, ptr addrspace(7) %p).
That's been a followup ticket I never have time to do.

If we do want addrspace(7), we'll need to expose make.buffer.rsrc and give it a p7 variant probably.

And even so, we'll want addrspace(8) to work because there are non-raw buffers (ex. swizzling).

@arsenm
Contributor

arsenm commented Jun 6, 2024

If we do want addrspace(7), we'll need to expose make.buffer.rsrc and give it a p7 variant probably.

Yes.

We probably should expose some kind of custom type instead of directly using a C address_space(7) attribute

@krzysz00
Contributor

krzysz00 commented Jun 7, 2024

Actually, even ignoring address space 7, it feels like these builtins would be simpler if you could raw.ptr.buffer.store any type you liked, and then they could be type-varying in Clang?

@arsenm
Contributor

arsenm commented Jun 7, 2024

Actually, even ignoring address space 7, it feels like these builtins would be simpler if you could raw.ptr.buffer.store any type you liked, and then they could be type-varying in Clang?

We could either have a builtin for all the types that would work, or if we want to treat them more like a normal pointer, clang could verify you only use them with types that will work.

Although I just realized, we have to have the builtins because of the separate swizzled and non-swizzled operands in the intrinsic

@krzysz00
Contributor

krzysz00 commented Jun 7, 2024

  1. For the swizzled case, that's struct.ptr.buffer.*, and yeah, those will always need builtins because LLVM can't deal in 2D addressing schemes
  2. What I mean is that "types that work" isn't the right framing: any type can be legalized to one or more types that work. That is, down in the isel legalizer, if I call for, for example
    %0 = call {i64, i64, i8} @llvm.amdgcn.raw.ptr.buffer.load(ptr addrspace(8) %rsrc, i32 %off, ...)

that should become two BUFFER_LOAD instructions, one that loads a <4 x i32> and one that loads an i8 from %off + 16
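The splitting described above can be sketched as a greedy decomposition of a byte count into hardware-sized pieces. This is only an illustration under stated assumptions: the widths 16, 8, 4, 2, and 1 bytes are assumed to be the available access sizes, the payload is treated as 17 packed bytes as in the comment, alignment and struct padding are ignored, and `Chunk` and `split_access` are hypothetical names rather than anything in LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One piece of a wider access: where it starts and how many bytes it moves.
struct Chunk {
  std::uint32_t offset;
  std::uint32_t bytes;
};

// Greedily cover total_bytes with the largest assumed-legal width that fits.
std::vector<Chunk> split_access(std::uint32_t total_bytes) {
  static const std::uint32_t widths[] = {16, 8, 4, 2, 1};
  std::vector<Chunk> out;
  std::uint32_t off = 0;
  while (total_bytes > 0) {
    for (std::uint32_t w : widths) {
      if (w <= total_bytes) {
        out.push_back({off, w}); // emit one access at the current offset
        off += w;
        total_bytes -= w;
        break;
      }
    }
  }
  return out;
}
```

With a 17-byte payload, `split_access(17)` yields a 16-byte piece at offset 0 and a 1-byte piece at offset 16, matching the two loads in the comment.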

@arsenm
Contributor

arsenm commented Jun 7, 2024

  1. For the swizzled case, that's struct.ptr.buffer.*, and yeah, those will always need builtins because LLVM can't deal in 2D addressing schemes

But the raw buffer intrinsics have both the soffset and voffset parameters though? Not just the struct

@arsenm
Contributor

arsenm commented Jun 7, 2024

  1. What I mean is that "types that work" isn't the right framing: any type can be legalized to one or more types that work. That is, down in the isel legalizer, if I call for, for example
    %0 = call {i64, i64, i8} @llvm.amdgcn.raw.ptr.buffer.load(ptr addrspace(8) %rsrc, i32 %off, ...)

Handling arbitrary aggregates here isn't really reasonable or necessary. We can restrict this to a reasonable set of legal-ish types

@krzysz00
Contributor

krzysz00 commented Jun 7, 2024

voffset and soffset are "offset that goes in VGPRs" and "offset that goes in SGPRs", with the latter having some different bounds-checking semantics on ... at least some of the gfx9's, IIRC.

The address space 7 lowering just uses voffset.

Re arbitrary aggregates: LLPC has code for handling this (since I'm pretty sure stuff like Vulkan wants it, and "aggregates" here might even be unusual cases like <4 x i8> instead of i32) so it's been on the roadmap for some time.

@krzysz00
Contributor

krzysz00 commented Jun 7, 2024

raw.ptr.buffer.load (and .store) are loads and stores and should be able to deal with any type you could send through a normal pointer (especially since a partially-OOB read is already hardware-level UB, so extending that through the intrinsics is reasonable)

struct.ptr.* do have stronger legality constraints because of swizzling, but I'm not proposing we generalize those

@arsenm
Contributor

arsenm commented Jun 7, 2024

voffset and soffset are "offset that goes in VGPRs" and "offset that goes in SGPRs", with the latter having some different bounds-checking semantics on ... at least some of the gfx9's, IIRC.

Right, that's the problem. We need to know the parameters of the SRD in order to make use of the scalar offset. Ideally we would have one pointer operand and be able to addressing mode match into soffset/voffset/imm

@arsenm
Contributor

arsenm commented Jun 7, 2024

"aggregates" here might even be unusual cases like <4 x i8>

Vectors aren't aggregates and are more reasonable

@krzysz00
Contributor

krzysz00 commented Jun 7, 2024

The thing is, in all the use cases I've seen, soffset == 0, and so you can legalize on voffset (voffset is also what the constant offsets on an instruction get added to)

@krzysz00
Contributor

krzysz00 commented Jun 7, 2024

(My guess for how I might use soffset is if I've got multiple identical buffers concatenated and I need to pick between them without messing with the extent field)
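A purely arithmetic sketch of that guess follows: the scalar offset selects one of several equally sized sub-buffers and the vector offset addresses within it. The names `Offsets`, `pick_sub_buffer`, and `combined_offset` are hypothetical, and the hardware's bounds-checking rules, which the thread notes differ between targets, are deliberately not modeled.

```cpp
#include <cassert>
#include <cstdint>

// The two offset operands, kept separate as in the intrinsic's signature.
struct Offsets {
  std::uint32_t voffset; // per-lane offset within the selected sub-buffer
  std::uint32_t soffset; // uniform offset that selects the sub-buffer
};

// Hypothetical helper: address byte_in_sub inside sub-buffer number index,
// where every sub-buffer is sub_bytes long and they are laid out back to back.
Offsets pick_sub_buffer(std::uint32_t index, std::uint32_t sub_bytes,
                        std::uint32_t byte_in_sub) {
  return {byte_in_sub, index * sub_bytes};
}

// Total byte offset from the buffer base, ignoring any immediate offset.
std::uint32_t combined_offset(Offsets o) { return o.voffset + o.soffset; }
```

For example, byte 12 of the third 256-byte sub-buffer gives `soffset` 512 and `voffset` 12, for a combined offset of 524, while the resource descriptor itself stays unchanged.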

@krzysz00
Contributor

krzysz00 commented Jun 7, 2024

(The ugly version of the arbitrary types code lives around https://github.com/GPUOpen-Drivers/llpc/blob/6c770c7d276d2c2504aed2a0278aab1610993ecf/lgc/patch/PatchBufferOp.cpp#L1559 and really should be an isel legalization instead)

@shiltian
Contributor Author

shiltian commented Jun 8, 2024

#94830 for buffer rsrc data type. Will update this patch afterwards.

@llvmbot llvmbot added clang:modules C++20 modules and Clang Header Modules debuginfo labels Jun 18, 2024
@shiltian shiltian force-pushed the buffer-load branch 2 times, most recently from bc3165d to 450572d Compare June 19, 2024 03:35
@shiltian
Contributor Author

ping

@yxsamliu
Collaborator

maybe add a test for non-constant offset?

@shiltian
Contributor Author

maybe add a test for non-constant offset?

Hmm, I thought I added all tests... My bad. Will add it right away.

Contributor

@arsenm arsenm left a comment

I'm wondering if we should really have all the different typed variants, and if this should be the name. I guess

@shiltian shiltian force-pushed the buffer-load branch 2 times, most recently from 48d095d to e4a7703 Compare June 21, 2024 15:43
@shiltian shiltian changed the title [Clang][AMDGPU] Add builtins for instrinsic llvm.amdgcn.raw.buffer.store [Clang][AMDGPU] Add builtins for instrinsic llvm.amdgcn.raw.ptr.buffer.store Jun 21, 2024
@shiltian
Copy link
Contributor Author

ping

@shiltian shiltian merged commit c9f083a into llvm:main Jun 25, 2024
7 checks passed
@shiltian shiltian deleted the buffer-load branch June 25, 2024 13:55
@llvm-ci
Collaborator

llvm-ci commented Jun 25, 2024

LLVM Buildbot has detected a new failure on builder clang-hip-vega20 running on hip-vega20-0 while building clang at step 3 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/123/builds/594

Here is the relevant piece of the build log for the reference:

Step 3 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/hip-build.sh --jobs=' (failure)
...
[37/38] /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -DNDEBUG  -O3 -DNDEBUG   -w -Werror=date-time --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 --offload-arch=gfx1100 -xhip -mfma -MD -MT External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -MF External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o.d -o External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -c /buildbot/llvm-test-suite/External/HIP/workload/ray-tracing/TheNextWeek/main.cc
[38/38] : && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/clang++ -O3 -DNDEBUG  External/HIP/CMakeFiles/TheNextWeek-hip-6.0.2.dir/workload/ray-tracing/TheNextWeek/main.cc.o -o External/HIP/TheNextWeek-hip-6.0.2  --rocm-path=/buildbot/Externals/hip/rocm-6.0.2 --hip-link -rtlib=compiler-rt -unwindlib=libgcc -frtlib-add-rpath && cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /usr/local/bin/cmake -E create_symlink /buildbot/llvm-test-suite/External/HIP/TheNextWeek.reference_output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/TheNextWeek.reference_output-hip-6.0.2
+ build_step 'Testing HIP test-suite'
+ echo '@@@BUILD_STEP Testing HIP test-suite@@@'
+ ninja -v check-hip-simple
@@@BUILD_STEP Testing HIP test-suite@@@
[0/1] cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test blender.test
-- Testing: 6 tests, 6 workers --
/usr/bin/strip: /bin/bash.stripped: Bad file descriptor
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.
FAIL: test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test (5 of 6)
******************** TEST 'test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test' FAILED ********************

/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out --redirect-input /dev/null --summary /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.time /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/InOneWeekend-hip-6.0.2
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP ; /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2

+ cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP
+ /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP/Output/InOneWeekend-hip-6.0.2.test.out InOneWeekend.reference_output-hip-6.0.2
/buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/tools/fpcmp-target: Comparison failed, textual difference between 'M' and 'i'

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
********************
Failed Tests (1):
  test-suite :: External/HIP/InOneWeekend-hip-6.0.2.test


Testing Time: 50.85s

Total Discovered Tests: 6
  Passed: 5 (83.33%)
  Failed: 1 (16.67%)
FAILED: External/HIP/CMakeFiles/check-hip-simple-hip-6.0.2 
cd /buildbot/hip-vega20-0/clang-hip-vega20/test-suite-build/External/HIP && /buildbot/hip-vega20-0/clang-hip-vega20/llvm/bin/llvm-lit -sv empty-hip-6.0.2.test with-fopenmp-hip-6.0.2.test saxpy-hip-6.0.2.test InOneWeekend-hip-6.0.2.test TheNextWeek-hip-6.0.2.test blender.test
ninja: build stopped: subcommand failed.
Step 12 (Testing HIP test-suite) failure: Testing HIP test-suite (failure)
program finished with exit code 1
elapsedTime=215.989391

AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
Labels: backend:AMDGPU, clang:codegen, clang:frontend, clang:modules, clang, debuginfo