[mlir][gpu] Add metadata attributes for storing kernel metadata in GPU objects #95292

Merged
fabianmcg merged 15 commits into llvm:main from pr-rocdl-metadata on Aug 27, 2024

fabianmcg
Contributor

@fabianmcg fabianmcg commented Jun 12, 2024

This patch adds the #gpu.kernel_metadata and #gpu.kernel_table attributes. The #gpu.kernel_metadata attribute allows storing metadata related to a compiled kernel, for example, the number of scalar registers used by the kernel. The attribute has only 2 required parameters: the name and the function type. It also has 2 optional parameters: the argument attributes and a generic dictionary for storing all other metadata.

The #gpu.kernel_table stores a table of #gpu.kernel_metadata, mapping the name of the kernel to the metadata.

Finally, the function ROCDL::getAMDHSAKernelsELFMetadata was added to collect ELF metadata from a binary, and to test the class methods in both attributes.

Example:

gpu.binary @binary [#gpu.object<#rocdl.target<chip = "gfx900">, kernels = #gpu.kernel_table<[
    #gpu.kernel_metadata<"kernel0", (i32) -> (), metadata = {sgpr_count = 255}>,
    #gpu.kernel_metadata<"kernel1", (i32, f32) -> (), arg_attrs = [{llvm.read_only}, {}]>
  ]> , bin = "BLOB">]

The motivation behind these attributes is to provide useful information for things like tuning.
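To make the data model in the description concrete, here is a minimal standalone sketch of a name-to-metadata table. It does not use the MLIR API: `KernelMetadata`, `KernelTable`, and `findMetadata` are hypothetical stand-ins for the attributes described above, the function type is kept as plain text, and storing the entries sorted by name is an assumption made only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for #gpu.kernel_metadata: two required fields plus
// an optional generic dictionary (integer values only in this sketch).
struct KernelMetadata {
  std::string name;                             // required
  std::string functionType;                     // required, kept as text
  std::map<std::string, std::int64_t> metadata; // optional, may be empty
};

// Hypothetical stand-in for #gpu.kernel_table: maps a kernel name to its
// metadata. Sorting by name is an assumption of this sketch.
class KernelTable {
public:
  explicit KernelTable(std::vector<KernelMetadata> kernels)
      : kernels_(std::move(kernels)) {
    std::sort(kernels_.begin(), kernels_.end(),
              [](const KernelMetadata &a, const KernelMetadata &b) {
                return a.name < b.name;
              });
    // A table keyed by name only makes sense if the names are unique.
    auto dup = std::adjacent_find(
        kernels_.begin(), kernels_.end(),
        [](const KernelMetadata &a, const KernelMetadata &b) {
          return a.name == b.name;
        });
    assert(dup == kernels_.end() && "duplicate kernel name");
    (void)dup;
  }

  // Returns the entry for `name`, or nullptr if the kernel is not present.
  const KernelMetadata *lookup(const std::string &name) const {
    auto it = std::lower_bound(
        kernels_.begin(), kernels_.end(), name,
        [](const KernelMetadata &k, const std::string &n) {
          return k.name < n;
        });
    if (it == kernels_.end() || it->name != name)
      return nullptr;
    return &*it;
  }

  std::size_t size() const { return kernels_.size(); }

private:
  std::vector<KernelMetadata> kernels_;
};

// Returns a pointer to the metadata value, or nullptr if either the kernel
// or the key is missing.
const std::int64_t *findMetadata(const KernelTable &table,
                                 const std::string &kernel,
                                 const std::string &key) {
  const KernelMetadata *entry = table.lookup(kernel);
  if (!entry)
    return nullptr;
  auto it = entry->metadata.find(key);
  return it == entry->metadata.end() ? nullptr : &it->second;
}
```

With the two kernels from the example above, looking up `sgpr_count` for `kernel0` would yield 255, while the same query for `kernel1` would yield a null pointer because that entry carries no metadata dictionary.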

@fabianmcg fabianmcg marked this pull request as ready for review June 12, 2024 19:31
@llvmbot
Member

llvmbot commented Jun 12, 2024

@llvm/pr-subscribers-mlir-spirv
@llvm/pr-subscribers-mlir-gpu
@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-mlir

Author: Fabian Mora (fabianmcg)

Changes

This patch adds the #rocdl.kernel and #rocdl.object_metadata attributes. The #rocdl.kernel attribute allows storing metadata related to a compiled kernel, for example, the number of scalar registers used by the kernel. It also stores the attribute dictionary of the LLVM function used to generate the kernel.

The #rocdl.object_metadata stores a table of #rocdl.kernel, mapping the name of the kernel to the metadata.

Finally, the function ROCDL::getAMDHSAKernelsMetadata was added to collect ELF metadata from a binary. The binary is expected to be compliant with:
https://llvm.org/docs/AMDGPUUsage.html#code-object-v5-metadata

Example:

gpu.binary @binary [#gpu.object<#rocdl.target<chip = "gfx900">, properties = {
    "rocdl.object_metadata" = #rocdl.object_metadata<{
      kernel0 = #rocdl.kernel<{sym_name = "kernel0", ...}, metadata = {sgpr_count = 255, ...}>,
    }>
  }, bin = "...">]

The motivation behind these attributes is that they provide useful information for things like tuning.


Patch is 20.87 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/95292.diff

8 Files Affected:

  • (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLDialect.h (+8)
  • (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+160)
  • (modified) mlir/include/mlir/Target/LLVM/ROCDL/Utils.h (+6)
  • (modified) mlir/lib/Dialect/LLVMIR/IR/ROCDLDialect.cpp (+32)
  • (modified) mlir/lib/Target/LLVM/CMakeLists.txt (+2)
  • (added) mlir/lib/Target/LLVM/ROCDL/Utils.cpp (+225)
  • (modified) mlir/test/Dialect/LLVMIR/rocdl.mlir (+7)
  • (modified) mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp (+43)
diff --git a/mlir/include/mlir/Dialect/LLVMIR/ROCDLDialect.h b/mlir/include/mlir/Dialect/LLVMIR/ROCDLDialect.h
index c2a82ffc1c43c..cf043b2be65bf 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/ROCDLDialect.h
+++ b/mlir/include/mlir/Dialect/LLVMIR/ROCDLDialect.h
@@ -37,4 +37,12 @@
 
 #include "mlir/Dialect/LLVMIR/ROCDLOpsDialect.h.inc"
 
+namespace mlir {
+namespace ROCDL {
+/// Returns the key used for storing the ROCDL metadata dictionary in the
+/// property field dictionary in `#gpu.object`.
+StringRef getROCDLObjectMetadataName();
+} // namespace ROCDL
+} // namespace mlir
+
 #endif /* MLIR_DIALECT_LLVMIR_ROCDLDIALECT_H_ */
diff --git a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
index 868208ff74a52..656df3389f4cb 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
@@ -722,4 +722,164 @@ def ROCDL_TargettAttr :
     }
   }];
 }
+
+//===----------------------------------------------------------------------===//
+// ROCDL kernel attribute
+//===----------------------------------------------------------------------===//
+
+def ROCDL_KernelAttr :
+    ROCDL_Attr<"ROCDLKernel", "kernel"> {
+  let description = [{
+    ROCDL attribute for storing metadata related to a compiled kernel. It
+    contains the attribute dictionary of the LLVM function used to generate the
+    kernel, as well as an optional dictionary for additional metadata, like ELF
+    related metadata.
+    For details on the ELF metadata see:
+    https://llvm.org/docs/AMDGPUUsage.html#code-object-v5-metadata
+
+    Examples:
+    ```mlir
+      #rocdl.kernel<{sym_name = "test_fusion__part_0", ...},
+                    metadata = {sgpr_count = 255, ...}>
+    ```
+  }];
+  let parameters = (ins
+    "DictionaryAttr":$func_attrs,
+    OptionalParameter<"DictionaryAttr", "metadata dictionary">:$metadata
+  );
+  let assemblyFormat = [{
+    `<` $func_attrs (`,` `metadata` `=` $metadata^ )? `>`
+  }];
+  let builders = [
+    AttrBuilderWithInferredContext<(ins "DictionaryAttr":$funcAttrs,
+                                         CArg<"DictionaryAttr",
+                                              "nullptr">:$metadata), [{
+      assert(funcAttrs && "invalid function attributes dictionary");
+      return $_get(funcAttrs.getContext(), funcAttrs, metadata);
+    }]>
+  ];
+  let extraClassDeclaration = [{
+    /// Returns the function attribute corresponding to key or nullptr if missing.
+    Attribute getAttr(StringRef key) const {
+      return getFuncAttrs().get(key);
+    }
+    template <typename ConcreteAttr>
+    ConcreteAttr getAttr(StringRef key) const {
+      return llvm::dyn_cast_or_null<ConcreteAttr>(getAttr(key));
+    }
+    Attribute getAttr(StringAttr key) const;
+    template <typename ConcreteAttr>
+    ConcreteAttr getAttr(StringAttr key) const {
+      return llvm::dyn_cast_or_null<ConcreteAttr>(getAttr(key));
+    }
+
+    /// Returns the name of the kernel.
+    StringAttr getName() const {
+      return getAttr<StringAttr>("sym_name");
+    }
+
+    /// Returns the metadata attribute corresponding to key or nullptr if missing.
+    Attribute getMDAttr(StringRef key) const {
+      if (DictionaryAttr attrs = getMetadata())
+        return attrs.get(key);
+      return nullptr;
+    }
+    template <typename ConcreteAttr>
+    ConcreteAttr getMDAttr(StringRef key) const {
+      return llvm::dyn_cast_or_null<ConcreteAttr>(getMDAttr(key));
+    }
+    Attribute getMDAttr(StringAttr key) const;
+    template <typename ConcreteAttr>
+    ConcreteAttr getMDAttr(StringAttr key) const {
+      return llvm::dyn_cast_or_null<ConcreteAttr>(getMDAttr(key));
+    }
+
+    /// Returns the number of required scalar registers, or nullptr if the field
+    /// is missing.
+    IntegerAttr getSGPR() const {
+      return getMDAttr<IntegerAttr>("sgpr_count");
+    }
+
+    /// Returns the number of required vector registers, or nullptr if the field
+    /// is missing.
+    IntegerAttr getVGPR() const {
+      return getMDAttr<IntegerAttr>("vgpr_count");
+    }
+
+    /// Returns the number of required accumulator registers, or nullptr if the
+    /// field is missing.
+    IntegerAttr getAGPR() const {
+      return getMDAttr<IntegerAttr>("agpr_count");
+    }
+
+    /// Returns the number of spilled SGPR, or nullptr if the field is missing.
+    IntegerAttr getSGPRSpill() const {
+      return getMDAttr<IntegerAttr>("sgpr_spill_count");
+    }
+
+    /// Returns the number of spilled VGPR, or nullptr if the field is missing.
+    IntegerAttr getVGPRSpill() const {
+      return getMDAttr<IntegerAttr>("vgpr_spill_count");
+    }
+
+    /// Helper function for appending metadata to a kernel attribute.
+    ROCDLKernelAttr appendMetadata(ArrayRef<NamedAttribute> attrs) const;
+  }];
+}
+
+//===----------------------------------------------------------------------===//
+// ROCDL object metadata
+//===----------------------------------------------------------------------===//
+
+def ROCDL_ObjectMDAttr :
+    ROCDL_Attr<"ROCDLObjectMD", "object_metadata"> {
+  let description = [{
+    ROCDL attribute representing a table of kernels metadata. All the attributes
+    in the dictionary must be of type `#rocdl.kernel`.
+
+    Examples:
+    ```mlir
+      #rocdl.object_metadata<{kernel0 = #rocdl.kernel<...>}>
+    ```
+  }];
+  let parameters = (ins
+    "DictionaryAttr":$kernel_table
+  );
+  let assemblyFormat = [{
+    `<` $kernel_table `>`
+  }];
+  let builders = [
+    AttrBuilderWithInferredContext<(ins "DictionaryAttr":$kernel_table), [{
+      assert(kernel_table && "invalid kernel table");
+      return $_get(kernel_table.getContext(), kernel_table);
+    }]>
+  ];
+  let skipDefaultBuilders = 1;
+  let genVerifyDecl = 1;
+  let extraClassDeclaration = [{
+    /// Helper iterator class for traversing the kernel table.
+    struct KernelIterator
+        : llvm::mapped_iterator_base<KernelIterator,
+                                    llvm::ArrayRef<NamedAttribute>::iterator,
+                                    std::pair<StringAttr, ROCDLKernelAttr>> {
+      using llvm::mapped_iterator_base<
+          KernelIterator, llvm::ArrayRef<NamedAttribute>::iterator,
+          std::pair<StringAttr, ROCDLKernelAttr>>::mapped_iterator_base;
+      /// Map the iterator to the kernel name and a KernelAttribute.
+      std::pair<StringAttr, ROCDLKernelAttr> mapElement(NamedAttribute attr) const {
+        return {attr.getName(), llvm::cast<ROCDLKernelAttr>(attr.getValue())};
+      }
+    };
+    auto begin() const {
+      return KernelIterator(getKernelTable().begin());
+    }
+    auto end() const {
+      return KernelIterator(getKernelTable().end());
+    }
+    size_t size() const {
+      return getKernelTable().size();
+    }
+  }];
+}
+
 #endif // ROCDLIR_OPS
diff --git a/mlir/include/mlir/Target/LLVM/ROCDL/Utils.h b/mlir/include/mlir/Target/LLVM/ROCDL/Utils.h
index 374fa65bd02e3..733d3919a2276 100644
--- a/mlir/include/mlir/Target/LLVM/ROCDL/Utils.h
+++ b/mlir/include/mlir/Target/LLVM/ROCDL/Utils.h
@@ -85,6 +85,12 @@ class SerializeGPUModuleBase : public LLVM::ModuleToObject {
   /// List of LLVM bitcode files to link to.
   SmallVector<std::string> fileList;
 };
+
+/// Returns a dictionary containing kernel metadata for each of the kernels in
+/// `gpuModule`. If `elfData` is valid, then the `amdhsa.kernels` ELF metadata
+/// will be added to the dictionary.
+ROCDLObjectMDAttr getAMDHSAKernelsMetadata(Operation *gpuModule,
+                                           ArrayRef<char> elfData = {});
 } // namespace ROCDL
 } // namespace mlir
 
diff --git a/mlir/lib/Dialect/LLVMIR/IR/ROCDLDialect.cpp b/mlir/lib/Dialect/LLVMIR/IR/ROCDLDialect.cpp
index 0c9c61fad1363..fde5bd1b26cf6 100644
--- a/mlir/lib/Dialect/LLVMIR/IR/ROCDLDialect.cpp
+++ b/mlir/lib/Dialect/LLVMIR/IR/ROCDLDialect.cpp
@@ -217,6 +217,7 @@ LogicalResult ROCDLDialect::verifyOperationAttribute(Operation *op,
 //===----------------------------------------------------------------------===//
 // ROCDL target attribute.
 //===----------------------------------------------------------------------===//
+
 LogicalResult
 ROCDLTargetAttr::verify(function_ref<InFlightDiagnostic()> emitError,
                         int optLevel, StringRef triple, StringRef chip,
@@ -247,6 +248,37 @@ ROCDLTargetAttr::verify(function_ref<InFlightDiagnostic()> emitError,
   return success();
 }
 
+//===----------------------------------------------------------------------===//
+// ROCDL object metadata
+//===----------------------------------------------------------------------===//
+
+StringRef mlir::ROCDL::getROCDLObjectMetadataName() {
+  return "rocdl.object_metadata";
+}
+
+ROCDLKernelAttr
+ROCDLKernelAttr::appendMetadata(ArrayRef<NamedAttribute> attrs) const {
+  if (attrs.empty())
+    return *this;
+  NamedAttrList attrList(attrs);
+  attrList.append(getMetadata());
+  return ROCDLKernelAttr::get(getFuncAttrs(),
+                              attrList.getDictionary(getContext()));
+}
+
+LogicalResult
+ROCDLObjectMDAttr::verify(function_ref<InFlightDiagnostic()> emitError,
+                          DictionaryAttr dict) {
+  if (!dict)
+    return emitError() << "table cannot be null";
+  if (llvm::any_of(dict, [](NamedAttribute attr) {
+        return !llvm::isa<ROCDLKernelAttr>(attr.getValue());
+      }))
+    return emitError()
+           << "all the dictionary values must be `#rocdl.kernel` attributes";
+  return success();
+}
+
 #define GET_OP_CLASSES
 #include "mlir/Dialect/LLVMIR/ROCDLOps.cpp.inc"
 
diff --git a/mlir/lib/Target/LLVM/CMakeLists.txt b/mlir/lib/Target/LLVM/CMakeLists.txt
index 5a3fa160850b4..08da4966e1499 100644
--- a/mlir/lib/Target/LLVM/CMakeLists.txt
+++ b/mlir/lib/Target/LLVM/CMakeLists.txt
@@ -108,9 +108,11 @@ endif()
 
 add_mlir_dialect_library(MLIRROCDLTarget
   ROCDL/Target.cpp
+  ROCDL/Utils.cpp
 
   LINK_COMPONENTS
   MCParser
+  Object
   ${AMDGPU_LIBS}
 
   LINK_LIBS PUBLIC
diff --git a/mlir/lib/Target/LLVM/ROCDL/Utils.cpp b/mlir/lib/Target/LLVM/ROCDL/Utils.cpp
new file mode 100644
index 0000000000000..fa3b49d94e414
--- /dev/null
+++ b/mlir/lib/Target/LLVM/ROCDL/Utils.cpp
@@ -0,0 +1,225 @@
+//===- Utils.cpp - MLIR ROCDL target utils ----------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines ROCDL target related utility classes and functions.
+//
+//===----------------------------------------------------------------------===//
+
+#include "mlir/Target/LLVM/ROCDL/Utils.h"
+#include "mlir/Dialect/GPU/IR/GPUDialect.h"
+#include "mlir/Dialect/LLVMIR/ROCDLDialect.h"
+
+#include "llvm/ADT/StringMap.h"
+#include "llvm/BinaryFormat/MsgPackDocument.h"
+#include "llvm/Object/ELFObjectFile.h"
+#include "llvm/Object/ObjectFile.h"
+#include "llvm/Support/AMDGPUMetadata.h"
+
+using namespace mlir;
+using namespace mlir::ROCDL;
+
+/// Search the ELF object and return an object containing the `amdhsa.kernels`
+/// metadata note. Function adapted from:
+/// llvm-project/llvm/tools/llvm-readobj/ELFDumper.cpp Also see
+/// `amdhsa.kernels`:
+/// https://llvm.org/docs/AMDGPUUsage.html#code-object-v3-metadata
+template <typename ELFT>
+static std::optional<llvm::msgpack::Document>
+getAMDHSANote(llvm::object::ELFObjectFile<ELFT> &elfObj) {
+  using namespace llvm;
+  using namespace llvm::object;
+  using namespace llvm::ELF;
+  const ELFFile<ELFT> &elf = elfObj.getELFFile();
+  auto secOrErr = elf.sections();
+  if (!secOrErr)
+    return std::nullopt;
+  ArrayRef<typename ELFT::Shdr> sections = *secOrErr;
+  for (auto section : sections) {
+    if (section.sh_type != ELF::SHT_NOTE)
+      continue;
+    size_t align = std::max(static_cast<unsigned>(section.sh_addralign), 4u);
+    Error err = Error::success();
+    for (const typename ELFT::Note note : elf.notes(section, err)) {
+      StringRef name = note.getName();
+      if (name != "AMDGPU")
+        continue;
+      uint32_t type = note.getType();
+      if (type != ELF::NT_AMDGPU_METADATA)
+        continue;
+      ArrayRef<uint8_t> desc = note.getDesc(align);
+      StringRef msgPackString =
+          StringRef(reinterpret_cast<const char *>(desc.data()), desc.size());
+      msgpack::Document msgPackDoc;
+      if (!msgPackDoc.readFromBlob(msgPackString, /*Multi=*/false))
+        return std::nullopt;
+      if (msgPackDoc.getRoot().isScalar())
+        return std::nullopt;
+      return std::optional<llvm::msgpack::Document>(std::move(msgPackDoc));
+    }
+  }
+  return std::nullopt;
+}
+
+/// Return the `amdhsa.kernels` metadata in the ELF object or std::nullopt on
+/// failure. This is a helper function that casts a generic `ObjectFile` to the
+/// appropriate `ELFObjectFile`.
+static std::optional<llvm::msgpack::Document>
+getAMDHSANote(ArrayRef<char> elfData) {
+  using namespace llvm;
+  using namespace llvm::object;
+  if (elfData.empty())
+    return std::nullopt;
+  MemoryBufferRef buffer(StringRef(elfData.data(), elfData.size()), "buffer");
+  Expected<std::unique_ptr<ObjectFile>> objOrErr =
+      ObjectFile::createELFObjectFile(buffer);
+  if (!objOrErr || !objOrErr.get()) {
+    // Drop the error.
+    llvm::consumeError(objOrErr.takeError());
+    return std::nullopt;
+  }
+  ObjectFile &elf = *(objOrErr.get());
+  std::optional<llvm::msgpack::Document> metadata;
+  if (auto *obj = dyn_cast<ELF32LEObjectFile>(&elf))
+    metadata = getAMDHSANote(*obj);
+  else if (auto *obj = dyn_cast<ELF32BEObjectFile>(&elf))
+    metadata = getAMDHSANote(*obj);
+  else if (auto *obj = dyn_cast<ELF64LEObjectFile>(&elf))
+    metadata = getAMDHSANote(*obj);
+  else if (auto *obj = dyn_cast<ELF64BEObjectFile>(&elf))
+    metadata = getAMDHSANote(*obj);
+  return metadata;
+}
+
+/// Utility functions for converting `llvm::msgpack::DocNode` nodes.
+static Attribute convertNode(Builder &builder, llvm::msgpack::DocNode &node);
+static Attribute convertNode(Builder &builder,
+                             llvm::msgpack::MapDocNode &node) {
+  NamedAttrList attrs;
+  for (auto kv : node) {
+    if (!kv.first.isString())
+      continue;
+    if (Attribute attr = convertNode(builder, kv.second)) {
+      auto key = kv.first.getString();
+      key.consume_front(".");
+      key.consume_back(".");
+      attrs.append(key, attr);
+    }
+  }
+  if (attrs.empty())
+    return nullptr;
+  return builder.getDictionaryAttr(attrs);
+}
+
+static Attribute convertNode(Builder &builder,
+                             llvm::msgpack::ArrayDocNode &node) {
+  using NodeKind = llvm::msgpack::Type;
+  // Use a `DenseI64ArrayAttr` if we know all the nodes are ints.
+  if (llvm::all_of(node, [](llvm::msgpack::DocNode &n) {
+        auto kind = n.getKind();
+        return kind == NodeKind::Int || kind == NodeKind::UInt;
+      })) {
+    SmallVector<int64_t> values;
+    for (llvm::msgpack::DocNode &n : node) {
+      auto kind = n.getKind();
+      if (kind == NodeKind::Int)
+        values.push_back(n.getInt());
+      else if (kind == NodeKind::UInt)
+        values.push_back(n.getUInt());
+    }
+    return builder.getDenseI64ArrayAttr(values);
+  }
+  // Convert the array.
+  SmallVector<Attribute> attrs;
+  for (llvm::msgpack::DocNode &n : node) {
+    if (Attribute attr = convertNode(builder, n))
+      attrs.push_back(attr);
+  }
+  if (attrs.empty())
+    return nullptr;
+  return builder.getArrayAttr(attrs);
+}
+
+static Attribute convertNode(Builder &builder, llvm::msgpack::DocNode &node) {
+  using namespace llvm::msgpack;
+  using NodeKind = llvm::msgpack::Type;
+  switch (node.getKind()) {
+  case NodeKind::Int:
+    return builder.getI64IntegerAttr(node.getInt());
+  case NodeKind::UInt:
+    return builder.getI64IntegerAttr(node.getUInt());
+  case NodeKind::Boolean:
+    return builder.getI64IntegerAttr(node.getBool());
+  case NodeKind::String:
+    return builder.getStringAttr(node.getString());
+  case NodeKind::Array:
+    return convertNode(builder, node.getArray());
+  case NodeKind::Map:
+    return convertNode(builder, node.getMap());
+  default:
+    return nullptr;
+  }
+}
+
+/// The following function should succeed for Code object V3 and above.
+static llvm::StringMap<DictionaryAttr> getELFMetadata(Builder &builder,
+                                                      ArrayRef<char> elfData) {
+  std::optional<llvm::msgpack::Document> metadata = getAMDHSANote(elfData);
+  if (!metadata)
+    return {};
+  llvm::StringMap<DictionaryAttr> kernelMD;
+  llvm::msgpack::DocNode &root = (metadata)->getRoot();
+  // Fail if `root` is not a map -it should be for AMD Obj Ver 3.
+  if (!root.isMap())
+    return kernelMD;
+  auto &kernels = root.getMap()["amdhsa.kernels"];
+  // Fail if `amdhsa.kernels` is not an array.
+  if (!kernels.isArray())
+    return kernelMD;
+  // Convert each of the kernels.
+  for (auto &kernel : kernels.getArray()) {
+    if (!kernel.isMap())
+      continue;
+    auto &kernelMap = kernel.getMap();
+    auto &name = kernelMap[".name"];
+    if (!name.isString())
+      continue;
+    NamedAttrList attrList;
+    // Convert the kernel properties.
+    for (auto kv : kernelMap) {
+      if (!kv.first.isString())
+        continue;
+      StringRef key = kv.first.getString();
+      key.consume_front(".");
+      key.consume_back(".");
+      if (key == "name")
+        continue;
+      if (Attribute attr = convertNode(builder, kv.second))
+        attrList.append(key, attr);
+    }
+    if (!attrList.empty())
+      kernelMD[name.getString()] = builder.getDictionaryAttr(attrList);
+  }
+  return kernelMD;
+}
+
+ROCDLObjectMDAttr
+mlir::ROCDL::getAMDHSAKernelsMetadata(Operation *gpuModule,
+                                      ArrayRef<char> elfData) {
+  auto module = cast<gpu::GPUModuleOp>(gpuModule);
+  Builder builder(module.getContext());
+  NamedAttrList moduleAttrs;
+  llvm::StringMap<DictionaryAttr> mdMap = getELFMetadata(builder, elfData);
+  for (auto funcOp : module.getBody()->getOps<LLVM::LLVMFuncOp>()) {
+    if (!funcOp->getDiscardableAttr("rocdl.kernel"))
+      continue;
+    moduleAttrs.append(funcOp.getName(),
+                       ROCDLKernelAttr::get(funcOp->getAttrDictionary(),
+                                            mdMap.lookup(funcOp.getName())));
+  }
+  return ROCDLObjectMDAttr::get(moduleAttrs.getDictionary(module.getContext()));
+}
diff --git a/mlir/test/Dialect/LLVMIR/rocdl.mlir b/mlir/test/Dialect/LLVMIR/rocdl.mlir
index f5dd5721c45e6..e7c3206b4d6db 100644
--- a/mlir/test/Dialect/LLVMIR/rocdl.mlir
+++ b/mlir/test/Dialect/LLVMIR/rocdl.mlir
@@ -365,3 +365,10 @@ gpu.module @module_1 [#rocdl.target<O = 1, chip = "gfx900", abi = "500", link =
 
 gpu.module @module_2 [#rocdl.target<chip = "gfx900">, #rocdl.target<chip = "gfx90a">] {
 }
+
+gpu.binary @binary [#gpu.object<#rocdl.target<chip = "gfx900">, properties = {
+    "rocdl.object_metadata" = #rocdl.object_metadata<{
+      kernel0 = #rocdl.kernel<{sym_name = "kernel0"}, metadata = {sgpr_count = 255}>,
+      kernel1 = #rocdl.kernel<{sym_name = "kernel1"}>
+    }>
+  }, bin = "BLOB">]
diff --git a/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp b/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
index 33291bc4bcaed..fa9b752daaca4 100644
--- a/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
+++ b/mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
@@ -158,3 +158,46 @@ TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(SerializeROCDLToBinary)) {
     ASSERT_FALSE(object->empty());
   }
 }
+
+// Test extracting the ELF kernel metadata from a serialized binary.
+TEST_F(MLIRTargetLLVMROCDL, SKIP_WITHOUT_AMDGPU(GetELFMetadata)) {
+  if (!hasROCMTools())
+    GTEST_SKIP() << "ROCm installation not found, skipping test.";
+
+  MLIRContext context(registry);
+
+  OwningOpRef<ModuleOp> module =
+      parseSourceString<ModuleOp>(moduleStr, &context);
+  ASSERT_TRUE(!!module);
+
+  // Create a ROCDL target.
+  ROCDL::ROCDLTargetAttr target = ROCDL::ROCDLTargetAttr::get(&context);
+
+  // Serialize the module.
+  auto serializer = dyn_cast<gpu::TargetAttrInterface>(target);
+  ASSERT_TRUE(!!serializer);
+  gpu::TargetOptions options("", {}, "", gpu::CompilationTarget::Binary);
+  for (auto gpuModule : (*module).getBody()->getOps<gpu::GPUModuleOp>()) ...
[truncated]
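The conversion step in `getELFMetadata` above strips the leading `.` from each msgpack key, uses `.name` as the kernel name, and stores every other property under its normalized key. The following standalone sketch mirrors only that key handling. It is not the patch's code: `normalizeKey`, `KernelProps`, and `convertKernel` are hypothetical names, and all values are simplified to strings instead of MLIR attributes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Strips one leading and one trailing '.' from a key, in the same spirit as
// the consume_front(".") / consume_back(".") calls in the patch.
std::string normalizeKey(std::string key) {
  if (!key.empty() && key.front() == '.')
    key.erase(key.begin());
  if (!key.empty() && key.back() == '.')
    key.pop_back();
  return key;
}

// Hypothetical result type: the kernel name plus its remaining properties.
struct KernelProps {
  std::string name;
  std::map<std::string, std::string> props;
};

// Converts one raw kernel entry (key/value pairs as found in the note).
// Returns false when the entry has no ".name" key, in which case the kernel
// is skipped, as in the patch.
bool convertKernel(
    const std::vector<std::pair<std::string, std::string>> &raw,
    KernelProps &out) {
  bool hasName = false;
  out = KernelProps();
  for (const auto &kv : raw) {
    if (kv.first == ".name") {
      out.name = kv.second;
      hasName = true;
      continue;
    }
    std::string key = normalizeKey(kv.first);
    // The name is stored separately, so it is not repeated as a property.
    if (key == "name")
      continue;
    out.props[key] = kv.second;
  }
  return hasName;
}
```

Under these assumptions, an entry with `.name = kernel0` and `.sgpr_count = 255` becomes a kernel named `kernel0` with a single property `sgpr_count`, which matches the `metadata = {sgpr_count = 255}` form used in the test file.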

@fabianmcg fabianmcg marked this pull request as draft June 12, 2024 20:01
@fabianmcg fabianmcg force-pushed the pr-rocdl-metadata branch from 4d9ab5a to 7be061e Compare June 12, 2024 20:49
@fabianmcg fabianmcg changed the title [mlir][ROCDL] Add metadata attributes for storing ELF object information [mlir][gpu] Add metadata attributes for storing kernel metadata in GPU objects Jun 12, 2024
@fabianmcg fabianmcg marked this pull request as ready for review June 12, 2024 20:53
@fabianmcg fabianmcg requested a review from joker-eph June 12, 2024 20:53
@joker-eph
Collaborator

Extracting things from the compilation process and adding them as metadata seems fine, just like keeping around parameters used for the compilation process. But I'm not sure about this though:

It also stores the attribute dictionary of the LLVM function used to generate the kernel.

This is basically a blob of random things, most of these may have no relationship to the end result.

@fabianmcg
Contributor Author

This is basically a blob of random things, most of these may have no relationship to the end result.

You're right, most of them are not useful. However, the function type, the kernel name and the argument attributes are useful. For example, inside the argument attributes you could have metadata indicating that an argument has to be prefilled with zeros.

My reasoning behind the 2 dicts is having a clear divide between function attributes and attributes that come from other metadata.

@joker-eph
Collaborator

However, the function type, the kernel name and the argument attributes are useful. For example, inside the argument attributes you could have metadata indicating that an argument has to be prefilled with zeros.

I don't believe this should be done through discardable attributes: this is a semantically important setup.

@fabianmcg
Contributor Author

fabianmcg commented Jun 13, 2024

I don't believe this should be done through discardable attributes: this is a semantically important setup.

They are not discardable, they are being stored inside #gpu.object, and those are stored as inherent attributes in the gpu.binary op.

I'll state more explicitly in the description that these attributes live inside #gpu.object.

Edit:
I think I got your point: those should be explicit parameters in #gpu.kernel. Is that what you mean?

@joker-eph
Collaborator

I think I got your point: those should be explicit parameters in #gpu.kernel. Is that what you mean?

Right, I was pointing at the "input" and I think your answer described the "output", right?
I'm trying to avoid a complete default that just passes through all discardable attributes.

@fabianmcg
Contributor Author

fabianmcg commented Jun 13, 2024

Right, I was pointing at the "input" and I think your answer described the "output", right?
I'm trying to avoid a complete default that just passes through all discardable attributes.

Right now they are just 2 dict attrs wrapped inside #gpu.kernel, so they don't default to all attributes; it just allows having as many attributes as possible, including all function attributes.

However, I agree that a better design is to make several fields, like the symbol name, type, etc., mandatory, and then have a generic dictionary with optional metadata.

My idea is to add, in a future PR, attribute interfaces to NVVM and ROCDL to extract that optional metadata depending on the target. I'm just trying to avoid creating the same attribute in NVVM and ROCDL.

@krzysz00
Contributor

Re passing through discardable attributes, I'd argue that "if you don't know what that annotation is, copy it verbatim" is a sensible behavior for function -> function translations (it's what going from gpu.func to llvm.func does).

(But there's also stuff like "this argument maps 1-to-N so I'm tossing argument attributes I don't understand because I don't know what to do with them in that case")
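The two behaviors described in this comment can be sketched in plain C++. This is an illustration only, not how any MLIR lowering is implemented: `AttrDict`, `translateFuncAttrs`, and `translateArgAttrs` are hypothetical names, attributes are modeled as string pairs, and the rename table and the set of understood attributes are assumptions supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Attributes modeled as name -> textual value (a simplification).
using AttrDict = std::map<std::string, std::string>;

// Function -> function translation: attributes with a known rename are
// renamed, everything else is copied verbatim.
AttrDict translateFuncAttrs(const AttrDict &source,
                            const std::map<std::string, std::string> &renames) {
  AttrDict result;
  for (const auto &entry : source) {
    auto it = renames.find(entry.first);
    const std::string &name =
        (it != renames.end()) ? it->second : entry.first;
    result[name] = entry.second;
  }
  return result;
}

// Argument translation: when one source argument maps to exactly one result
// argument, its attributes are copied verbatim. When it expands to several
// arguments, only attributes the translation understands are kept, because
// it is unclear where the unknown ones should go.
AttrDict translateArgAttrs(const AttrDict &argAttrs,
                           std::size_t numResultArgs,
                           const std::set<std::string> &understood) {
  assert(numResultArgs >= 1 && "an argument must map to at least one result");
  if (numResultArgs == 1)
    return argAttrs;
  AttrDict kept;
  for (const auto &entry : argAttrs)
    if (understood.count(entry.first) != 0)
      kept.insert(entry);
  return kept;
}
```

The design point is that verbatim copying is only safe while the mapping stays one-to-one; once an argument is split, dropping unknown annotations is the conservative choice.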

@@ -37,6 +37,11 @@ MLIR_CAPI_EXPORTED MlirAttribute
mlirGPUObjectAttrGet(MlirContext mlirCtx, MlirAttribute target, uint32_t format,
MlirStringRef objectStrRef, MlirAttribute mlirObjectProps);

MLIR_CAPI_EXPORTED MlirAttribute mlirGPUObjectAttrGetWithKernels(
MlirContext mlirCtx, MlirAttribute target, uint32_t format,
Member

I see this is carried from above, but uint32_t for what looks like an enum is unfortunate.

fabianmcg and others added 2 commits August 27, 2024 15:47
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
fabianmcg and others added 2 commits August 27, 2024 15:54
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
github-actions bot commented Aug 27, 2024

✅ With the latest revision this PR passed the Python code formatter.

@fabianmcg fabianmcg merged commit 016e1eb into llvm:main Aug 27, 2024
8 checks passed
@llvm-ci
Collaborator

llvm-ci commented Aug 28, 2024

LLVM Buildbot has detected a new failure on builder mlir-nvidia-gcc7 running on mlir-nvidia while building mlir at step 5 "build-check-mlir-build-only".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/2762

Here is the relevant piece of the build log for the reference
Step 5 (build-check-mlir-build-only) failure: build (failure)
...
/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/examples/transform/Ch4/lib/MyExtension.cpp:66:31: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/examples/transform/Ch4/lib/MyExtension.cpp:66:31: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
173.812 [51/16/4462] Linking CXX static library lib/libMyExtensionCh4.a
174.036 [50/16/4463] Linking CXX executable bin/mlir-minimal-opt
177.005 [49/16/4464] Linking CXX executable bin/mlir-transform-opt
177.080 [48/16/4465] Building CXX object tools/llc/CMakeFiles/llc.dir/llc.cpp.o
177.218 [47/16/4466] Building CXX object tools/llc/CMakeFiles/llc.dir/NewPMDriver.cpp.o
178.564 [46/16/4467] Linking CXX executable bin/llc
178.727 [45/16/4468] Building CXX object tools/lli/CMakeFiles/lli.dir/lli.cpp.o
180.014 [44/16/4469] Linking CXX executable bin/lli
command timed out: 1200 seconds without output running [b'ninja', b'-j', b'16', b'check-mlir-build-only'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=3019.737740

@fabianmcg fabianmcg deleted the pr-rocdl-metadata branch August 31, 2024 16:01