[DirectX] Support the CBufferLoadLegacy operation #128699

Merged (3 commits) on Feb 26, 2025

bogner (Contributor) commented Feb 25, 2025

Fixes #112992

llvmbot (Member) commented Feb 25, 2025

@llvm/pr-subscribers-tablegen

@llvm/pr-subscribers-backend-directx

Author: Justin Bogner (bogner)

Changes

Fixes #112992


Full diff: https://github.com/llvm/llvm-project/pull/128699.diff

9 Files Affected:

  • (modified) llvm/docs/DirectX/DXILResources.rst (+120-6)
  • (modified) llvm/include/llvm/IR/IntrinsicsDirectX.td (+15)
  • (modified) llvm/lib/Target/DirectX/DXIL.td (+19)
  • (modified) llvm/lib/Target/DirectX/DXILOpBuilder.cpp (+40)
  • (modified) llvm/lib/Target/DirectX/DXILOpBuilder.h (+3)
  • (modified) llvm/lib/Target/DirectX/DXILOpLowering.cpp (+31)
  • (added) llvm/test/CodeGen/DirectX/CBufferLoadLegacy-errors.ll (+45)
  • (added) llvm/test/CodeGen/DirectX/CBufferLoadLegacy.ll (+63)
  • (modified) llvm/utils/TableGen/DXILEmitter.cpp (+7-1)
diff --git a/llvm/docs/DirectX/DXILResources.rst b/llvm/docs/DirectX/DXILResources.rst
index 80e3c2c11153d..91dcd5c8d5214 100644
--- a/llvm/docs/DirectX/DXILResources.rst
+++ b/llvm/docs/DirectX/DXILResources.rst
@@ -277,7 +277,7 @@ Examples:
 Accessing Resources as Memory
 -----------------------------
 
-*relevant types: Buffers, CBuffer, and Textures*
+*relevant types: Buffers and Textures*
 
 Loading and storing from resources is generally represented in LLVM using
 operations on memory that is only accessible via a handle object. Given a
@@ -321,12 +321,11 @@ Examples:
 Loads, Samples, and Gathers
 ---------------------------
 
-*relevant types: Buffers, CBuffers, and Textures*
+*relevant types: Buffers and Textures*
 
-All load, sample, and gather operations in DXIL return a `ResRet`_ type, and
-CBuffer loads return a similar `CBufRet`_ type. These types are structs
-containing 4 elements of some basic type, and in the case of `ResRet` a 5th
-element that is used by the `CheckAccessFullyMapped`_ operation. Some of these
+All load, sample, and gather operations in DXIL return a `ResRet`_ type. These
+types are structs containing 4 elements of some basic type, and a 5th element
+that is used by the `CheckAccessFullyMapped`_ operation. Some of these
 operations, like `RawBufferLoad`_ include a mask and/or alignment that tell us
 some information about how to interpret those four values.
 
@@ -632,3 +631,118 @@ Examples:
        target("dx.RawBuffer", i8, 1, 0, 0) %buffer,
        i32 %index, i32 0, <4 x double> %data)
 
+Constant Buffer Loads
+---------------------
+
+*relevant types: CBuffers*
+
+The `CBufferLoadLegacy`_ operation, which despite the name is the only
+supported way to load from a cbuffer in any DXIL version, loads a single "row"
+of a cbuffer, which is exactly 16 bytes. The return value of the operation is
+represented by a `CBufRet`_ type, which has variants for 2 64-bit values, 4
+32-bit values, and 8 16-bit values.
+
+We represent these in LLVM IR with 3 separate operations, which return a
+2-element, 4-element, or 8-element struct respectively.
+
+.. _CBufferLoadLegacy: https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst#cbufferloadlegacy
+.. _CBufRet: https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst#cbufferloadlegacy
+
+.. list-table:: ``@llvm.dx.resource.load.cbufferrow.4``
+   :header-rows: 1
+
+   * - Argument
+     -
+     - Type
+     - Description
+   * - Return value
+     -
+     - A struct of 4 32-bit values
+     - A single row of a cbuffer, interpreted as 4 32-bit values
+   * - ``%buffer``
+     - 0
+     - ``target(dx.CBuffer, ...)``
+     - The buffer to load from
+   * - ``%index``
+     - 1
+     - ``i32``
+     - Index into the buffer
+
+Examples:
+
+.. code-block:: llvm
+
+   %ret = call {float, float, float, float}
+       @llvm.dx.resource.load.cbufferrow.4(
+           target("dx.CBuffer", target("dx.Layout", {float}, 4, 0)) %buffer,
+           i32 %index)
+   %ret = call {i32, i32, i32, i32}
+       @llvm.dx.resource.load.cbufferrow.4(
+           target("dx.CBuffer", target("dx.Layout", {i32}, 4, 0)) %buffer,
+           i32 %index)
+
+.. list-table:: ``@llvm.dx.resource.load.cbufferrow.2``
+   :header-rows: 1
+
+   * - Argument
+     -
+     - Type
+     - Description
+   * - Return value
+     -
+     - A struct of 2 64-bit values
+     - A single row of a cbuffer, interpreted as 2 64-bit values
+   * - ``%buffer``
+     - 0
+     - ``target(dx.CBuffer, ...)``
+     - The buffer to load from
+   * - ``%index``
+     - 1
+     - ``i32``
+     - Index into the buffer
+
+Examples:
+
+.. code-block:: llvm
+
+   %ret = call {double, double}
+       @llvm.dx.resource.load.cbufferrow.2(
+           target("dx.CBuffer", target("dx.Layout", {double}, 8, 0)) %buffer,
+           i32 %index)
+   %ret = call {i64, i64}
+       @llvm.dx.resource.load.cbufferrow.2(
+           target("dx.CBuffer", target("dx.Layout", {i64}, 8, 0)) %buffer,
+           i32 %index)
+
+.. list-table:: ``@llvm.dx.resource.load.cbufferrow.8``
+   :header-rows: 1
+
+   * - Argument
+     -
+     - Type
+     - Description
+   * - Return value
+     -
+     - A struct of 8 16-bit values
+     - A single row of a cbuffer, interpreted as 8 16-bit values
+   * - ``%buffer``
+     - 0
+     - ``target(dx.CBuffer, ...)``
+     - The buffer to load from
+   * - ``%index``
+     - 1
+     - ``i32``
+     - Index into the buffer
+
+Examples:
+
+.. code-block:: llvm
+
+   %ret = call {half, half, half, half, half, half, half, half}
+       @llvm.dx.resource.load.cbufferrow.8(
+           target("dx.CBuffer", target("dx.Layout", {half}, 2, 0)) %buffer,
+           i32 %index)
+   %ret = call {i16, i16, i16, i16, i16, i16, i16, i16}
+       @llvm.dx.resource.load.cbufferrow.8(
+           target("dx.CBuffer", target("dx.Layout", {i16}, 2, 0)) %buffer,
+           i32 %index)
diff --git a/llvm/include/llvm/IR/IntrinsicsDirectX.td b/llvm/include/llvm/IR/IntrinsicsDirectX.td
index beed84b144cec..87de68cb3ad4f 100644
--- a/llvm/include/llvm/IR/IntrinsicsDirectX.td
+++ b/llvm/include/llvm/IR/IntrinsicsDirectX.td
@@ -45,6 +45,21 @@ def int_dx_resource_store_rawbuffer
           [], [llvm_any_ty, llvm_i32_ty, llvm_i32_ty, llvm_any_ty],
           [IntrWriteMem]>;
 
+// dx.resource.load.cbufferrow encodes the number of elements returned in the
+// function name. The total size of the return should always be 128 bits.
+def int_dx_resource_load_cbufferrow_8
+    : DefaultAttrsIntrinsic<
+          [llvm_any_ty, llvm_any_ty, llvm_any_ty, llvm_any_ty,
+           llvm_any_ty, llvm_any_ty, llvm_any_ty, llvm_any_ty],
+          [llvm_any_ty, llvm_i32_ty], [IntrReadMem]>;
+def int_dx_resource_load_cbufferrow_4
+    : DefaultAttrsIntrinsic<
+          [llvm_any_ty, llvm_any_ty, llvm_any_ty, llvm_any_ty],
+          [llvm_any_ty, llvm_i32_ty], [IntrReadMem]>;
+def int_dx_resource_load_cbufferrow_2
+    : DefaultAttrsIntrinsic<[llvm_any_ty, llvm_any_ty],
+                            [llvm_any_ty, llvm_i32_ty], [IntrReadMem]>;
+
 def int_dx_resource_updatecounter
     : DefaultAttrsIntrinsic<[llvm_i32_ty], [llvm_any_ty, llvm_i8_ty],
                             [IntrInaccessibleMemOrArgMemOnly]>;
diff --git a/llvm/lib/Target/DirectX/DXIL.td b/llvm/lib/Target/DirectX/DXIL.td
index d59e28c37b91d..74fd62a028b09 100644
--- a/llvm/lib/Target/DirectX/DXIL.td
+++ b/llvm/lib/Target/DirectX/DXIL.td
@@ -46,6 +46,12 @@ def ResRetDoubleTy : DXILOpParamType;
 def ResRetInt16Ty : DXILOpParamType;
 def ResRetInt32Ty : DXILOpParamType;
 def ResRetInt64Ty : DXILOpParamType;
+def CBufRetHalfTy : DXILOpParamType;
+def CBufRetFloatTy : DXILOpParamType;
+def CBufRetDoubleTy : DXILOpParamType;
+def CBufRetInt16Ty : DXILOpParamType;
+def CBufRetInt32Ty : DXILOpParamType;
+def CBufRetInt64Ty : DXILOpParamType;
 def HandleTy : DXILOpParamType;
 def ResBindTy : DXILOpParamType;
 def ResPropsTy : DXILOpParamType;
@@ -816,6 +822,19 @@ def CreateHandle : DXILOp<57, createHandle> {
   let attributes = [Attributes<DXIL1_0, [ReadOnly]>];
 }
 
+def CBufferLoadLegacy : DXILOp<59, cbufferLoadLegacy> {
+  let Doc = "loads a value from a constant buffer resource";
+  // Handle, Index
+  let arguments = [HandleTy, Int32Ty];
+  let result = OverloadTy;
+  let overloads = [Overloads<DXIL1_0, [
+    CBufRetHalfTy, CBufRetFloatTy, CBufRetDoubleTy, CBufRetInt16Ty,
+    CBufRetInt32Ty, CBufRetInt64Ty
+  ]>];
+  let stages = [Stages<DXIL1_0, [all_stages]>];
+  let attributes = [Attributes<DXIL1_0, [ReadOnly]>];
+}
+
 def BufferLoad : DXILOp<68, bufferLoad> {
   let Doc = "reads from a TypedBuffer";
   // Handle, Coord0, Coord1
diff --git a/llvm/lib/Target/DirectX/DXILOpBuilder.cpp b/llvm/lib/Target/DirectX/DXILOpBuilder.cpp
index 6bbe8d5d12280..f45f86f60100d 100644
--- a/llvm/lib/Target/DirectX/DXILOpBuilder.cpp
+++ b/llvm/lib/Target/DirectX/DXILOpBuilder.cpp
@@ -201,6 +201,30 @@ static StructType *getResRetType(Type *ElementTy) {
   return getOrCreateStructType(TypeName, FieldTypes, Ctx);
 }
 
+static StructType *getCBufRetType(Type *ElementTy) {
+  LLVMContext &Ctx = ElementTy->getContext();
+  OverloadKind Kind = getOverloadKind(ElementTy);
+  std::string TypeName = constructOverloadTypeName(Kind, "dx.types.CBufRet.");
+
+  // 64-bit types only have two elements
+  if (ElementTy->isDoubleTy() || ElementTy->isIntegerTy(64))
+    return getOrCreateStructType(
+        TypeName, {ElementTy, ElementTy}, Ctx);
+
+  // 16-bit types pack 8 elements and have .8 in their name to differentiate
+  // from min-precision types.
+  if (ElementTy->isHalfTy() || ElementTy->isIntegerTy(16)) {
+    TypeName += ".8";
+    return getOrCreateStructType(TypeName,
+                                 {ElementTy, ElementTy, ElementTy, ElementTy,
+                                  ElementTy, ElementTy, ElementTy, ElementTy},
+                                 Ctx);
+  }
+
+  return getOrCreateStructType(
+      TypeName, {ElementTy, ElementTy, ElementTy, ElementTy}, Ctx);
+}
+
 static StructType *getHandleType(LLVMContext &Ctx) {
   return getOrCreateStructType("dx.types.Handle", PointerType::getUnqual(Ctx),
                                Ctx);
@@ -265,6 +289,18 @@ static Type *getTypeFromOpParamType(OpParamType Kind, LLVMContext &Ctx,
     return getResRetType(Type::getInt32Ty(Ctx));
   case OpParamType::ResRetInt64Ty:
     return getResRetType(Type::getInt64Ty(Ctx));
+  case OpParamType::CBufRetHalfTy:
+    return getCBufRetType(Type::getHalfTy(Ctx));
+  case OpParamType::CBufRetFloatTy:
+    return getCBufRetType(Type::getFloatTy(Ctx));
+  case OpParamType::CBufRetDoubleTy:
+    return getCBufRetType(Type::getDoubleTy(Ctx));
+  case OpParamType::CBufRetInt16Ty:
+    return getCBufRetType(Type::getInt16Ty(Ctx));
+  case OpParamType::CBufRetInt32Ty:
+    return getCBufRetType(Type::getInt32Ty(Ctx));
+  case OpParamType::CBufRetInt64Ty:
+    return getCBufRetType(Type::getInt64Ty(Ctx));
   case OpParamType::HandleTy:
     return getHandleType(Ctx);
   case OpParamType::ResBindTy:
@@ -535,6 +571,10 @@ StructType *DXILOpBuilder::getResRetType(Type *ElementTy) {
   return ::getResRetType(ElementTy);
 }
 
+StructType *DXILOpBuilder::getCBufRetType(Type *ElementTy) {
+  return ::getCBufRetType(ElementTy);
+}
+
 StructType *DXILOpBuilder::getHandleType() {
   return ::getHandleType(IRB.getContext());
 }
diff --git a/llvm/lib/Target/DirectX/DXILOpBuilder.h b/llvm/lib/Target/DirectX/DXILOpBuilder.h
index 5fe9f4429a494..0985f2ee7cf1f 100644
--- a/llvm/lib/Target/DirectX/DXILOpBuilder.h
+++ b/llvm/lib/Target/DirectX/DXILOpBuilder.h
@@ -50,6 +50,9 @@ class DXILOpBuilder {
   /// Get a `%dx.types.ResRet` type with the given element type.
   StructType *getResRetType(Type *ElementTy);
 
+  /// Get a `%dx.types.CBufRet` type with the given element type.
+  StructType *getCBufRetType(Type *ElementTy);
+
   /// Get the `%dx.types.Handle` type.
   StructType *getHandleType();
 
diff --git a/llvm/lib/Target/DirectX/DXILOpLowering.cpp b/llvm/lib/Target/DirectX/DXILOpLowering.cpp
index 83cc4b18824c7..e4239b2bc6628 100644
--- a/llvm/lib/Target/DirectX/DXILOpLowering.cpp
+++ b/llvm/lib/Target/DirectX/DXILOpLowering.cpp
@@ -569,6 +569,32 @@ class OpLowerer {
     });
   }
 
+  [[nodiscard]] bool lowerCBufferLoad(Function &F) {
+    IRBuilder<> &IRB = OpBuilder.getIRB();
+
+    return replaceFunction(F, [&](CallInst *CI) -> Error {
+      IRB.SetInsertPoint(CI);
+
+      Type *OldTy = cast<StructType>(CI->getType())->getElementType(0);
+      Type *ScalarTy = OldTy->getScalarType();
+      Type *NewRetTy = OpBuilder.getCBufRetType(ScalarTy);
+
+      Value *Handle =
+          createTmpHandleCast(CI->getArgOperand(0), OpBuilder.getHandleType());
+      Value *Index = CI->getArgOperand(1);
+
+      Expected<CallInst *> OpCall = OpBuilder.tryCreateOp(
+          OpCode::CBufferLoadLegacy, {Handle, Index}, CI->getName(), NewRetTy);
+      if (Error E = OpCall.takeError())
+        return E;
+      if (Error E = replaceNamedStructUses(CI, *OpCall))
+        return E;
+
+      CI->eraseFromParent();
+      return Error::success();
+    });
+  }
+
   [[nodiscard]] bool lowerUpdateCounter(Function &F) {
     IRBuilder<> &IRB = OpBuilder.getIRB();
     Type *Int32Ty = IRB.getInt32Ty();
@@ -796,6 +822,11 @@ class OpLowerer {
       case Intrinsic::dx_resource_store_rawbuffer:
         HasErrors |= lowerBufferStore(F, /*IsRaw=*/true);
         break;
+      case Intrinsic::dx_resource_load_cbufferrow_2:
+      case Intrinsic::dx_resource_load_cbufferrow_4:
+      case Intrinsic::dx_resource_load_cbufferrow_8:
+        HasErrors |= lowerCBufferLoad(F);
+        break;
       case Intrinsic::dx_resource_updatecounter:
         HasErrors |= lowerUpdateCounter(F);
         break;
diff --git a/llvm/test/CodeGen/DirectX/CBufferLoadLegacy-errors.ll b/llvm/test/CodeGen/DirectX/CBufferLoadLegacy-errors.ll
new file mode 100644
index 0000000000000..66dc1d2f36636
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/CBufferLoadLegacy-errors.ll
@@ -0,0 +1,45 @@
+; We use llc for this test so that we don't abort after the first error.
+; RUN: not llc %s -o /dev/null 2>&1 | FileCheck %s
+
+target triple = "dxil-pc-shadermodel6.6-compute"
+
+declare void @f32_user(float)
+declare void @f64_user(double)
+declare void @f16_user(half)
+
+; CHECK: error:
+; CHECK-SAME: in function four64
+; CHECK-SAME: Type mismatch between intrinsic and DXIL op
+define void @four64() "hlsl.export" {
+  %buffer = call target("dx.CBuffer", target("dx.Layout", {double}, 8, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  %load = call {double, double, double, double} @llvm.dx.resource.load.cbufferrow.4(
+      target("dx.CBuffer", target("dx.Layout", {double}, 8, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {double, double, double, double} %load, 0
+
+  call void @f64_user(double %data)
+
+  ret void
+}
+
+; CHECK: error:
+; CHECK-SAME: in function two32
+; CHECK-SAME: Type mismatch between intrinsic and DXIL op
+define void @two32() "hlsl.export" {
+  %buffer = call target("dx.CBuffer", target("dx.Layout", {float}, 4, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  %load = call {float, float} @llvm.dx.resource.load.cbufferrow.2(
+      target("dx.CBuffer", target("dx.Layout", {float}, 4, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {float, float} %load, 0
+
+  call void @f32_user(float %data)
+
+  ret void
+}
+
+declare { double, double, double, double } @llvm.dx.resource.load.cbufferrow.4.f64.f64.f64.f64.tdx.CBuffer_tdx.Layout_sl_f64s_8_0tt(target("dx.CBuffer", target("dx.Layout", { double }, 8, 0)), i32)
+declare { float, float } @llvm.dx.resource.load.cbufferrow.2.f32.f32.tdx.CBuffer_tdx.Layout_sl_f32s_4_0tt(target("dx.CBuffer", target("dx.Layout", { float }, 4, 0)), i32)
diff --git a/llvm/test/CodeGen/DirectX/CBufferLoadLegacy.ll b/llvm/test/CodeGen/DirectX/CBufferLoadLegacy.ll
new file mode 100644
index 0000000000000..12b02cfd27823
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/CBufferLoadLegacy.ll
@@ -0,0 +1,63 @@
+; RUN: opt -S -dxil-op-lower %s | FileCheck %s
+
+target triple = "dxil-pc-shadermodel6.6-compute"
+
+declare void @f32_user(float)
+declare void @f64_user(double)
+declare void @f16_user(half)
+
+; CHECK-LABEL: define void @loadf32
+define void @loadf32() {
+  %buffer = call target("dx.CBuffer", target("dx.Layout", {float}, 4, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  ; CHECK: [[DATA:%.*]] = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 59, %dx.types.Handle %{{.*}}, i32 0)
+  %load = call {float, float, float, float} @llvm.dx.resource.load.cbufferrow.4(
+      target("dx.CBuffer", target("dx.Layout", {float}, 4, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {float, float, float, float} %load, 0
+
+  ; CHECK: [[VAL:%.*]] = extractvalue %dx.types.CBufRet.f32 [[DATA]], 0
+  ; CHECK: call void @f32_user(float [[VAL]])
+  call void @f32_user(float %data)
+
+  ret void
+}
+
+; CHECK-LABEL: define void @loadf64
+define void @loadf64() {
+  %buffer = call
+      target("dx.CBuffer", target("dx.Layout", {double, double, double, double}, 64, 0, 8, 16, 24))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  ; CHECK: [[DATA:%.*]] = call %dx.types.CBufRet.f64 @dx.op.cbufferLoadLegacy.f64(i32 59, %dx.types.Handle %{{.*}}, i32 1)
+  %load = call {double, double} @llvm.dx.resource.load.cbufferrow.2(
+      target("dx.CBuffer", target("dx.Layout", {double, double, double, double}, 64, 0, 8, 16, 24)) %buffer,
+      i32 1)
+  %data = extractvalue {double, double} %load, 1
+
+  ; CHECK: [[VAL:%.*]] = extractvalue %dx.types.CBufRet.f64 [[DATA]], 1
+  ; CHECK: call void @f64_user(double [[VAL]])
+  call void @f64_user(double %data)
+
+  ret void
+}
+
+; CHECK-LABEL: define void @loadf16
+define void @loadf16() {
+  %buffer = call
+      target("dx.CBuffer", target("dx.Layout", {half}, 2, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  ; CHECK: [[DATA:%.*]] = call %dx.types.CBufRet.f16.8 @dx.op.cbufferLoadLegacy.f16(i32 59, %dx.types.Handle %{{.*}}, i32 0)
+  %load = call {half, half, half, half, half, half, half, half} @llvm.dx.resource.load.cbufferrow.8(
+      target("dx.CBuffer", target("dx.Layout", {half}, 2, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {half, half, half, half, half, half, half, half} %load, 0
+
+  ; CHECK: [[VAL:%.*]] = extractvalue %dx.types.CBufRet.f16.8 [[DATA]], 0
+  ; CHECK: call void @f16_user(half [[VAL]])
+  call void @f16_user(half %data)
+
+  ret void
+}
diff --git a/llvm/utils/TableGen/DXILEmitter.cpp b/llvm/utils/TableGen/DXILEmitter.cpp
index 70f2aa6522640..525ad4c4c8529 100644
--- a/llvm/utils/TableGen/DXILEmitter.cpp
+++ b/llvm/utils/TableGen/DXILEmitter.cpp
@@ -228,7 +228,13 @@ static StringRef getOverloadKindStr(const Record *R) {
       .Case("ResRetDoubleTy", "OverloadKind::DOUBLE")
       .Case("ResRetInt16Ty", "OverloadKind::I16")
       .Case("ResRetInt32Ty", "OverloadKind::I32")
-      .Case("ResRetInt64Ty", "OverloadKind::I64");
+      .Case("ResRetInt64Ty", "OverloadKind::I64")
+      .Case("CBufRetHalfTy", "OverloadKind::HALF")
+      .Case("CBufRetFloatTy", "OverloadKind::FLOAT")
+      .Case("CBufRetDoubleTy", "OverloadKind::DOUBLE")
+      .Case("CBufRetInt16Ty", "OverloadKind::I16")
+      .Case("CBufRetInt32Ty", "OverloadKind::I32")
+      .Case("CBufRetInt64Ty", "OverloadKind::I64");
 }
 
 /// Return a string representation of valid overload information denoted
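
The row arithmetic behind the three `cbufferrow` intrinsics can be sketched in a few lines of standalone C++. This is only an illustration, not code from the patch: `elementsPerRow`, `locate`, `RowPosition`, and `intrinsicBaseName` are hypothetical names, and the sketch assumes each scalar is aligned to its own size so that it never straddles a 16-byte row.

```cpp
#include <cstdint>
#include <string>

// A cbuffer row is always 16 bytes (128 bits), as stated in the docs above.
constexpr unsigned RowSizeInBytes = 16;

// Number of struct elements returned for a scalar of the given byte size:
// 8-byte scalars -> 2, 4-byte scalars -> 4, 2-byte scalars -> 8.
constexpr unsigned elementsPerRow(unsigned ScalarSizeInBytes) {
  return RowSizeInBytes / ScalarSizeInBytes;
}

// Hypothetical helper type: where a scalar lives inside the cbuffer.
struct RowPosition {
  uint32_t Row;     // value to pass as the i32 %index argument
  uint32_t Element; // index to use with extractvalue on the returned struct
};

// Map a byte offset to (row, element), assuming the scalar is aligned to its
// own size and therefore fits entirely within one row.
constexpr RowPosition locate(uint32_t ByteOffset, unsigned ScalarSizeInBytes) {
  return {ByteOffset / RowSizeInBytes,
          (ByteOffset % RowSizeInBytes) / ScalarSizeInBytes};
}

// Unmangled base name of the intrinsic that returns a row of such scalars.
std::string intrinsicBaseName(unsigned ScalarSizeInBytes) {
  return "llvm.dx.resource.load.cbufferrow." +
         std::to_string(elementsPerRow(ScalarSizeInBytes));
}
```

Under these assumptions, a `double` at byte offset 24 maps to row 1, element 1, which matches the `loadf64` test above: it loads with `i32 1` and then extracts element 1.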

llvmbot (Member) commented Feb 25, 2025

@llvm/pr-subscribers-llvm-ir

Author: Justin Bogner (bogner)

+
+; CHECK: error:
+; CHECK-SAME: in function four64
+; CHECK-SAME: Type mismatch between intrinsic and DXIL op
+define void @four64() "hlsl.export" {
+  %buffer = call target("dx.CBuffer", target("dx.Layout", {double}, 8, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  %load = call {double, double, double, double} @llvm.dx.resource.load.cbufferrow.4(
+      target("dx.CBuffer", target("dx.Layout", {double}, 8, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {double, double, double, double} %load, 0
+
+  call void @f64_user(double %data)
+
+  ret void
+}
+
+; CHECK: error:
+; CHECK-SAME: in function two32
+; CHECK-SAME: Type mismatch between intrinsic and DXIL op
+define void @two32() "hlsl.export" {
+  %buffer = call target("dx.CBuffer", target("dx.Layout", {float}, 4, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  %load = call {float, float} @llvm.dx.resource.load.cbufferrow.2(
+      target("dx.CBuffer", target("dx.Layout", {float}, 4, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {float, float} %load, 0
+
+  call void @f32_user(float %data)
+
+  ret void
+}
+
+declare { double, double, double, double } @llvm.dx.resource.load.cbufferrow.4.f64.f64.f64.f64.tdx.CBuffer_tdx.Layout_sl_f64s_8_0tt(target("dx.CBuffer", target("dx.Layout", { double }, 8, 0)), i32)
+declare { float, float } @llvm.dx.resource.load.cbufferrow.2.f32.f32.tdx.CBuffer_tdx.Layout_sl_f32s_4_0tt(target("dx.CBuffer", target("dx.Layout", { float }, 4, 0)), i32)
diff --git a/llvm/test/CodeGen/DirectX/CBufferLoadLegacy.ll b/llvm/test/CodeGen/DirectX/CBufferLoadLegacy.ll
new file mode 100644
index 0000000000000..12b02cfd27823
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/CBufferLoadLegacy.ll
@@ -0,0 +1,63 @@
+; RUN: opt -S -dxil-op-lower %s | FileCheck %s
+
+target triple = "dxil-pc-shadermodel6.6-compute"
+
+declare void @f32_user(float)
+declare void @f64_user(double)
+declare void @f16_user(half)
+
+; CHECK-LABEL: define void @loadf32
+define void @loadf32() {
+  %buffer = call target("dx.CBuffer", target("dx.Layout", {float}, 4, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  ; CHECK: [[DATA:%.*]] = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 59, %dx.types.Handle %{{.*}}, i32 0)
+  %load = call {float, float, float, float} @llvm.dx.resource.load.cbufferrow.4(
+      target("dx.CBuffer", target("dx.Layout", {float}, 4, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {float, float, float, float} %load, 0
+
+  ; CHECK: [[VAL:%.*]] = extractvalue %dx.types.CBufRet.f32 [[DATA]], 0
+  ; CHECK: call void @f32_user(float [[VAL]])
+  call void @f32_user(float %data)
+
+  ret void
+}
+
+; CHECK-LABEL: define void @loadf64
+define void @loadf64() {
+  %buffer = call
+      target("dx.CBuffer", target("dx.Layout", {double, double, double, double}, 64, 0, 8, 16, 24))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  ; CHECK: [[DATA:%.*]] = call %dx.types.CBufRet.f64 @dx.op.cbufferLoadLegacy.f64(i32 59, %dx.types.Handle %{{.*}}, i32 1)
+  %load = call {double, double} @llvm.dx.resource.load.cbufferrow.2(
+      target("dx.CBuffer", target("dx.Layout", {double, double, double, double}, 64, 0, 8, 16, 24)) %buffer,
+      i32 1)
+  %data = extractvalue {double, double} %load, 1
+
+  ; CHECK: [[VAL:%.*]] = extractvalue %dx.types.CBufRet.f64 [[DATA]], 1
+  ; CHECK: call void @f64_user(double [[VAL]])
+  call void @f64_user(double %data)
+
+  ret void
+}
+
+; CHECK-LABEL: define void @loadf16
+define void @loadf16() {
+  %buffer = call
+      target("dx.CBuffer", target("dx.Layout", {half}, 2, 0))
+      @llvm.dx.resource.handlefrombinding(i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  ; CHECK: [[DATA:%.*]] = call %dx.types.CBufRet.f16.8 @dx.op.cbufferLoadLegacy.f16(i32 59, %dx.types.Handle %{{.*}}, i32 0)
+  %load = call {half, half, half, half, half, half, half, half} @llvm.dx.resource.load.cbufferrow.8(
+      target("dx.CBuffer", target("dx.Layout", {half}, 2, 0)) %buffer,
+      i32 0)
+  %data = extractvalue {half, half, half, half, half, half, half, half} %load, 0
+
+  ; CHECK: [[VAL:%.*]] = extractvalue %dx.types.CBufRet.f16.8 [[DATA]], 0
+  ; CHECK: call void @f16_user(half [[VAL]])
+  call void @f16_user(half %data)
+
+  ret void
+}
diff --git a/llvm/utils/TableGen/DXILEmitter.cpp b/llvm/utils/TableGen/DXILEmitter.cpp
index 70f2aa6522640..525ad4c4c8529 100644
--- a/llvm/utils/TableGen/DXILEmitter.cpp
+++ b/llvm/utils/TableGen/DXILEmitter.cpp
@@ -228,7 +228,13 @@ static StringRef getOverloadKindStr(const Record *R) {
       .Case("ResRetDoubleTy", "OverloadKind::DOUBLE")
       .Case("ResRetInt16Ty", "OverloadKind::I16")
       .Case("ResRetInt32Ty", "OverloadKind::I32")
-      .Case("ResRetInt64Ty", "OverloadKind::I64");
+      .Case("ResRetInt64Ty", "OverloadKind::I64")
+      .Case("CBufRetHalfTy", "OverloadKind::HALF")
+      .Case("CBufRetFloatTy", "OverloadKind::FLOAT")
+      .Case("CBufRetDoubleTy", "OverloadKind::DOUBLE")
+      .Case("CBufRetInt16Ty", "OverloadKind::I16")
+      .Case("CBufRetInt32Ty", "OverloadKind::I32")
+      .Case("CBufRetInt64Ty", "OverloadKind::I64");
 }
 
 /// Return a string representation of valid overload information denoted
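
The two functions in `CBufferLoadLegacy-errors.ll` request four doubles and two floats. Neither matches the struct shapes used in the passing test (four floats, two doubles, eight halves). One rough way to picture the rule is sketched below, assuming each load covers a single 128-bit row; that width is inferred from those shapes and is not stated in the diff. `fillsCBufferRow` and `expectedElements` are hypothetical helpers, not LLVM APIs, and how the pass itself detects the mismatch is not shown here.

```cpp
#include <cassert>

// Assumption: one legacy cbuffer row is 128 bits wide.
constexpr unsigned RowBits = 128;

// True when NumElements scalars of ElemBits each exactly fill one row.
constexpr bool fillsCBufferRow(unsigned NumElements, unsigned ElemBits) {
  return ElemBits != 0 && NumElements * ElemBits == RowBits;
}

// Number of scalars a row-load struct would need for a given element width.
inline unsigned expectedElements(unsigned ElemBits) {
  assert(ElemBits != 0 && RowBits % ElemBits == 0 && "width must divide a row");
  return RowBits / ElemBits;
}

// Shapes used by the passing test.
static_assert(fillsCBufferRow(4, 32), "four floats fill a row");
static_assert(fillsCBufferRow(2, 64), "two doubles fill a row");
static_assert(fillsCBufferRow(8, 16), "eight halves fill a row");
```

Under this model, `fillsCBufferRow(4, 64)` and `fillsCBufferRow(2, 32)` are both false, which lines up with the `four64` and `two32` cases being rejected.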


github-actions bot commented Feb 25, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -46,6 +46,12 @@ def ResRetDoubleTy : DXILOpParamType;
def ResRetInt16Ty : DXILOpParamType;
def ResRetInt32Ty : DXILOpParamType;
def ResRetInt64Ty : DXILOpParamType;
def CBufRetHalfTy : DXILOpParamType;
Member

@farzonl farzonl Feb 25, 2025

Why `def CBufRetHalfTy : DXILOpParamType;` and not `def CBufRetHalfTy : HalfTy;`?

To phrase it another way, it looks like you have to redo these associations later in DXILEmitter via `.Case("CBufRetHalfTy", "OverloadKind::HALF")` because `CBufRetHalfTy` is a special type.

Why do the overloads have to be cbuffer types like below?

let overloads = [Overloads<DXIL1_0, [
    CBufRetHalfTy, CBufRetFloatTy, CBufRetDoubleTy, CBufRetInt16Ty,
    CBufRetInt32Ty, CBufRetInt64Ty
  ]>];

Contributor Author

dx.op.cbufferLoadLegacy returns a named structure containing elements that are half, float, double, int16, int32, or int64, but it names its overload based on the element type. So we have these types:

; 4 32-bit types
%dx.types.CBufRet.f32 = type { float, float, float, float }
%dx.types.CBufRet.i32 = type { i32, i32, i32, i32 }
; 2 64-bit types
%dx.types.CBufRet.f64 = type { double, double }
%dx.types.CBufRet.i64 = type { i64, i64 }
; 8 16-bit types (If -enable-16bit-types is present)
%dx.types.CBufRet.f16.8 = type { half, half, half, half, half, half, half, half }
%dx.types.CBufRet.i16.8 = type { i16, i16, i16, i16, i16, i16, i16, i16 }

and we have these overloads:

declare %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(...)
declare %dx.types.CBufRet.i32 @dx.op.cbufferLoadLegacy.i32(...)
declare %dx.types.CBufRet.f64 @dx.op.cbufferLoadLegacy.f64(...)
declare %dx.types.CBufRet.i64 @dx.op.cbufferLoadLegacy.i64(...)
declare %dx.types.CBufRet.f16.8 @dx.op.cbufferLoadLegacy.f16(...)
declare %dx.types.CBufRet.i16.8 @dx.op.cbufferLoadLegacy.i16(...)

The DXILEmitter association is telling us what to name the overloads, and the overloads are cbuffer types in DXIL.td so that we return these structs and not, say, a single half.
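
The naming scheme described above can be sketched in standalone C++. This is only an illustration: `ElemKind`, `CBufRetInfo`, and `describeCBufRet` are hypothetical names that do not exist in LLVM, and the 128-bit row size is an assumption inferred from the struct shapes listed above (4 x 32, 2 x 64, 8 x 16 bits).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the scalar element kinds named above;
// this is not an LLVM type.
enum class ElemKind { Half, Float, Double, I16, I32, I64 };

// What the sketch reports for one element kind.
struct CBufRetInfo {
  std::string TypeName;  // named struct returned by the op
  std::string OpName;    // overload name of the DXIL op
  unsigned NumElements;  // number of scalars in the returned struct
};

// Overload suffix such as "f32" or "i16".
inline std::string overloadSuffix(ElemKind K) {
  switch (K) {
  case ElemKind::Half:   return "f16";
  case ElemKind::Float:  return "f32";
  case ElemKind::Double: return "f64";
  case ElemKind::I16:    return "i16";
  case ElemKind::I32:    return "i32";
  case ElemKind::I64:    return "i64";
  }
  return "";
}

// Width of one scalar in bits.
inline unsigned bitWidth(ElemKind K) {
  switch (K) {
  case ElemKind::Half:
  case ElemKind::I16:    return 16;
  case ElemKind::Float:
  case ElemKind::I32:    return 32;
  case ElemKind::Double:
  case ElemKind::I64:    return 64;
  }
  return 0;
}

inline CBufRetInfo describeCBufRet(ElemKind K) {
  const unsigned RowBits = 128;  // assumption: one row is 128 bits
  unsigned Bits = bitWidth(K);
  assert(Bits != 0 && RowBits % Bits == 0 && "unexpected element width");
  std::string Suffix = overloadSuffix(K);
  std::string TypeName = "dx.types.CBufRet." + Suffix;
  if (Bits == 16)
    TypeName += ".8";  // the 16-bit struct names carry a ".8" suffix
  return {TypeName, "dx.op.cbufferLoadLegacy." + Suffix, RowBits / Bits};
}
```

For instance, `describeCBufRet(ElemKind::Half)` gives the type name `dx.types.CBufRet.f16.8` with eight elements, while the op name stays `dx.op.cbufferLoadLegacy.f16`, so the overload is named after the element type rather than the struct.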

Member

@farzonl farzonl left a comment

LGTM. My comments are questions only; no changes needed.

Member

@hekota hekota left a comment

LGTM! Just one copy-paste nit.

@@ -816,6 +822,19 @@ def CreateHandle : DXILOp<57, createHandle> {
let attributes = [Attributes<DXIL1_0, [ReadOnly]>];
}

def CBufferLoadLegacy : DXILOp<59, cbufferLoadLegacy> {
let Doc = "reads from a TypedBuffer";
Member

Suggested change
-  let Doc = "reads from a TypedBuffer";
+  let Doc = "reads from a constant buffer";

Contributor Author

Nice catch. I updated this to use the doc wording from DXIL.rst

@bogner bogner changed the base branch from users/bogner/128696+128697+128698 to main February 25, 2025 23:24
@bogner bogner force-pushed the 2025-02-25-cbuffer-load-legacy branch from 657eb14 to fdedca0 Compare February 26, 2025 00:05
@bogner bogner merged commit 870b376 into llvm:main Feb 26, 2025
10 of 13 checks passed
@llvm-ci
Collaborator

llvm-ci commented Feb 26, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-sve2-vla running on linaro-g4-01 while building llvm at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/198/builds/2343

Here is the relevant piece of the build log for the reference
Step 7 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'HWAddressSanitizer-aarch64 :: TestCases/hwasan_symbolize_stack_overflow.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: rm -rf /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp; mkdir /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp
+ rm -rf /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp
+ mkdir /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp
RUN: at line 2: /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions -Wl,--build-id -g /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp -o /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow
+ /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions -Wl,--build-id -g /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp -o /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow
RUN: at line 3: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not  /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 16 2>&1 | /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER0
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 16
+ /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER0
Could not find symbols for lib/aarch64-linux-gnu/libc.so.6
RUN: at line 4: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not  /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 17 2>&1 | /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER1
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 17
+ FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER1
+ /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
Could not find symbols for lib/aarch64-linux-gnu/libc.so.6
RUN: at line 5: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not  /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -1 2>&1 | /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE1
+ /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -1
+ FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE1
Could not find symbols for lib/aarch64-linux-gnu/libc.so.6
RUN: at line 6: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not  /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -17 2>&1 | /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE17
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -17
+ /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE17
Could not find symbols for lib/aarch64-linux-gnu/libc.so.6
RUN: at line 7: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not  /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 1016 2>&1 | /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER1000
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow 1016
+ FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,AFTER1000
+ /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
Could not find symbols for lib/aarch64-linux-gnu/libc.so.6
RUN: at line 8: env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not  /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -1000 2>&1 | /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index | FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE1000
+ env HWASAN_OPTIONS=disable_allocator_tagging=1:random_tags=0:fail_without_syscall_abi=0:symbolize=0 not /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp/hwasan_overflow -1000
+ /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/hwasan_symbolize --symbols /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/runtimes/runtimes-bins/compiler-rt/test/hwasan/AARCH64/TestCases/Output/hwasan_symbolize_stack_overflow.cpp.tmp --index
+ FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp --check-prefixes=CHECK,BEFORE1000
Could not find symbols for lib/aarch64-linux-gnu/libc.so.6
/home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/compiler-rt/test/hwasan/TestCases/hwasan_symbolize_stack_overflow.cpp:21:12: error: CHECK: expected string not found in input
 // CHECK: Potentially referenced stack object:
           ^
<stdin>:1:1: note: scanning from here
==4187497==ERROR: HWAddressSanitizer: tag-mismatch on address 0xffffd33ffd48 at pc 0xb32e9f3ba368
^
<stdin>:10:15: note: possible intended match here
Address 0xffffd33ffd48 is located in stack of thread T0
              ^

...

Successfully merging this pull request may close these issues.

[DirectX] Introduce llvm.dx.resource.load.cbuffer intrinsic and lower it to cbufferLoadLegacy dxil op
6 participants