forked from llvm/llvm-project
RFC: Add xegpu transform ops #1
Closed
7 commits
85e6478 add xegpu transform ops (tkarna)
98d5f0a xegpu: add xegpu transform op python bindinds (tkarna)
d73ef0d xegpu: drop XeGPU prefix from transform op names (tkarna)
bf7cf0a xegpu: remove load_data argument from set_dpas_layout transform op (tkarna)
a45730c xegpu: rename set_dpas_layout to set_operand_layout (tkarna)
8b11bfd xegpu: add tests transform op python bindings (tkarna)
018491e xegpu: code formatting (tkarna)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,2 +1,3 @@ | ||
| add_subdirectory(IR) | ||
| add_subdirectory(Transforms) | ||
| add_subdirectory(TransformOps) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| set(LLVM_TARGET_DEFINITIONS XeGPUTransformOps.td) | ||
| mlir_tablegen(XeGPUTransformOps.h.inc -gen-op-decls) | ||
| mlir_tablegen(XeGPUTransformOps.cpp.inc -gen-op-defs) | ||
| add_public_tablegen_target(MLIRXeGPUTransformOpsIncGen) | ||
| | ||
| add_mlir_doc(XeGPUTransformOps XeGPUTransformOps Dialects/ -gen-op-doc) |
29 changes: 29 additions & 0 deletions
mlir/include/mlir/Dialect/XeGPU/TransformOps/XeGPUTransformOps.h
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| //===- XeGPUTransformOps.h - XeGPU transformation ops -----------*- C++ -*-===// | ||
| // | ||
| // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
| // See https://llvm.org/LICENSE.txt for license information. | ||
| // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
| // | ||
| //===----------------------------------------------------------------------===// | ||
| | ||
| #ifndef MLIR_DIALECT_XEGPU_TRANSFORMOPS_XEGPUTRANSFORMOPS_H | ||
| #define MLIR_DIALECT_XEGPU_TRANSFORMOPS_XEGPUTRANSFORMOPS_H | ||
| | ||
| #include "mlir/Bytecode/BytecodeOpInterface.h" | ||
| #include "mlir/Dialect/SCF/IR/SCF.h" | ||
| #include "mlir/Dialect/Transform/IR/TransformDialect.h" | ||
| #include "mlir/Dialect/Transform/IR/TransformTypes.h" | ||
| #include "mlir/Dialect/Transform/Interfaces/TransformInterfaces.h" | ||
| | ||
| #define GET_OP_CLASSES | ||
| #include <mlir/Dialect/XeGPU/TransformOps/XeGPUTransformOps.h.inc> | ||
| | ||
| namespace mlir { | ||
| class DialectRegistry; | ||
| | ||
| namespace xegpu { | ||
| void registerTransformDialectExtension(DialectRegistry &registry); | ||
| } // namespace xegpu | ||
| } // namespace mlir | ||
| | ||
| #endif // MLIR_DIALECT_XEGPU_TRANSFORMOPS_XEGPUTRANSFORMOPS_H |
128 changes: 128 additions & 0 deletions
mlir/include/mlir/Dialect/XeGPU/TransformOps/XeGPUTransformOps.td
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| //===- XeGPUTransformOps.td - XeGPU transformation ops -----*- tablegen -*-===// | ||
| // | ||
| // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
| // See https://llvm.org/LICENSE.txt for license information. | ||
| // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
| // | ||
| //===----------------------------------------------------------------------===// | ||
| | ||
| #ifndef XEGPU_EXTENSION | ||
| #define XEGPU_EXTENSION | ||
| | ||
| include "mlir/Dialect/Transform/IR/TransformDialect.td" | ||
| include "mlir/Dialect/Transform/Interfaces/TransformInterfaces.td" | ||
| include "mlir/Dialect/Transform/IR/TransformTypes.td" | ||
| include "mlir/IR/OpBase.td" | ||
| include "mlir/Interfaces/SideEffectInterfaces.td" | ||
| | ||
| def HoistDescOp : Op<Transform_Dialect, "xegpu.hoist_desc_ops", [ | ||
| TransformOpInterface, TransformEachOpTrait, | ||
| DeclareOpInterfaceMethods<MemoryEffectsOpInterface> | ||
| ]> { | ||
| | ||
| let summary = "Hoists xegpu tile descriptor ops outside the containing loop"; | ||
| let description = [{ | ||
| Hoists `xegpu.create_nd_tdesc` out of the loop. If the | ||
| descriptor's offset is loop dependent, a `xegpu.update_nd_offset` op is | ||
| inserted in the loop to increment the offset. | ||
| }]; | ||
| | ||
| let arguments = (ins TransformHandleTypeInterface : $loop); | ||
| let results = (outs TransformHandleTypeInterface : $transformed); | ||
| | ||
| let assemblyFormat = "$loop attr-dict `:` functional-type(operands, results)"; | ||
| | ||
| let extraClassDeclaration = [{ | ||
| ::mlir::DiagnosedSilenceableFailure applyToOne( | ||
| ::mlir::transform::TransformRewriter & rewriter, | ||
| ::mlir::Operation * target, | ||
| ::mlir::transform::ApplyToEachResultList & results, | ||
| ::mlir::transform::TransformState & state); | ||
| }]; | ||
| } | ||
| | ||
| def SetOperandLayoutOp : Op<Transform_Dialect, "xegpu.set_operand_layout", [ | ||
| TransformOpInterface, TransformEachOpTrait, | ||
| DeclareOpInterfaceMethods<MemoryEffectsOpInterface> | ||
| ]> { | ||
| | ||
| let summary = "Sets the xegpu.layout attribute on an xegpu op operand."; | ||
| let description = [{ | ||
| Given an xegpu operation, this transform adds an `xegpu.layout` | ||
| attribute to its operand's tensor descriptor. The target operand is | ||
| selected by the `operandIndex` argument. The layout is defined by the | ||
| `sg_layout`, `sg_data` and `inst_data` attributes. | ||
| }]; | ||
| | ||
| let arguments = (ins TransformHandleTypeInterface : $target, | ||
| I64Attr : $operandIndex, | ||
| DenseI32ArrayAttr : $sgLayout, | ||
| DenseI32ArrayAttr : $sgData, | ||
| DenseI32ArrayAttr : $instData); | ||
| | ||
| let results = (outs); | ||
| | ||
| let assemblyFormat = | ||
| "$target `index` `=` $operandIndex `sg_layout` `=` $sgLayout `sg_data` `=` " | ||
| "$sgData `inst_data` `=` $instData attr-dict `:` type($target)"; | ||
| | ||
| let extraClassDeclaration = [{ | ||
| ::mlir::DiagnosedSilenceableFailure applyToOne( | ||
| ::mlir::transform::TransformRewriter & rewriter, | ||
| ::mlir::Operation * target, | ||
| ::mlir::transform::ApplyToEachResultList & results, | ||
| ::mlir::transform::TransformState & state); | ||
| }]; | ||
| } | ||
| | ||
| def InsertPrefetchOp : Op<Transform_Dialect, "xegpu.insert_prefetch", | ||
| [FunctionalStyleTransformOpTrait, MemoryEffectsOpInterface, | ||
| DeclareOpInterfaceMethods<TransformOpInterface>]> { | ||
| | ||
| let summary = "Adds xegpu prefetch ops to matmul operand tiles."; | ||
| let description = [{ | ||
| Given an xegpu operation residing in a `scf.for` loop, this transform inserts cooperative `xegpu.prefetch` operations for the A (index = 0) or B (index = 1) operand. The prefetch tile size is determined by the `sg_layout` and `sg_data` attributes. | ||
| | ||
| Review comment: Do you mean the input is a xegpu DPAS op? Reply: Yes, the implementation only supports DPAS op at the moment. | ||
| }]; | ||
| | ||
| let arguments = (ins TransformHandleTypeInterface : $target, | ||
| TransformHandleTypeInterface : $loopOp, | ||
| I64Attr : $operandIndex, | ||
| DenseI32ArrayAttr : $sgLayout, | ||
| DenseI32ArrayAttr : $sgData); | ||
| | ||
| let results = (outs TransformHandleTypeInterface : $transformedTargetOp, | ||
| TransformHandleTypeInterface : $transformedLoopOp); | ||
| | ||
| let assemblyFormat = | ||
| "$target $loopOp `index` `=` $operandIndex `sg_layout` `=` $sgLayout `sg_data` `=` " | ||
| "$sgData attr-dict `:` functional-type(operands, results)"; | ||
| } | ||
| | ||
| // TODO this should be handled with gpu transform ops. | ||
| // Add gpu mapping to scf.forall op and use something like | ||
| // transform.gpu.map_forall_to_blocks to convert to gpu.launch op. | ||
| def SetGPULaunchThreadsOp | ||
| : Op<Transform_Dialect, "xegpu.set_gpu_launch_threads", [ | ||
| TransformOpInterface, TransformEachOpTrait, | ||
| DeclareOpInterfaceMethods<MemoryEffectsOpInterface> | ||
| ]> { | ||
| | ||
| let summary = "Set number of threads for a given gpu.launch operation"; | ||
| let description = [{Set number of threads for a given gpu.launch operation}]; | ||
| | ||
| let arguments = (ins TransformHandleTypeInterface : $launchOp, | ||
| DenseI32ArrayAttr : $threads); | ||
| let results = (outs); | ||
| let assemblyFormat = | ||
| "$launchOp `threads` `=` $threads attr-dict `:` type($launchOp)"; | ||
| | ||
| let extraClassDeclaration = [{ | ||
| ::mlir::DiagnosedSilenceableFailure applyToOne( | ||
| ::mlir::transform::TransformRewriter & rewriter, | ||
| ::mlir::Operation * target, | ||
| ::mlir::transform::ApplyToEachResultList & results, | ||
| ::mlir::transform::TransformState & state); | ||
| }]; | ||
| } | ||
| | ||
| #endif // XEGPU_EXTENSION | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,4 @@ | ||
| add_subdirectory(IR) | ||
| add_subdirectory(Transforms) | ||
| add_subdirectory(Utils) | ||
| add_subdirectory(TransformOps) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| add_mlir_dialect_library(MLIRXeGPUTransformOps | ||
| XeGPUTransformOps.cpp | ||
| | ||
| ADDITIONAL_HEADER_DIRS | ||
| ${PROJECT_SOURCE_DIR}/mlir/Dialect/XeGPU/TransformOps/ | ||
| | ||
| DEPENDS | ||
| MLIRXeGPUTransformOpsIncGen | ||
| | ||
| LINK_LIBS PUBLIC | ||
| MLIRXeGPUDialect | ||
| MLIRXeGPUTransforms | ||
| MLIRIR | ||
| MLIRTransformDialect | ||
| MLIRFuncDialect | ||
| MLIRSCFDialect | ||
| ) |
This pass may become unnecessary, as we are transitioning to a new create_nd_tdesc definition: the nd_tdesc is created without an offset, and the offset moves to load_nd. create_nd_tdesc would then become loop-invariant.
See these PRs:
a.1. make offset optional for create_nd_tdesc (llvm#148335)
a.2. add optional offsets for load_nd and store_nd/prefetch_nd (llvm#149424)
You may look at Imex innersource GitHub issue #1151 for more background info.
Thanks, yes, I'm aware of this planned change. It implies some changes to the transform ops, and it should in fact make the logic simpler in most cases. Hoisting the desc ops is still needed, but we might indeed be able to use existing hoist patterns instead of an xegpu-specific method. We can address this once the new load_nd-offset pipeline is complete. In the meantime, I think we could upstream these transform ops so that we can support linalg.matmul lowering.
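For readers following the hoisting discussion, here is a toy C++ model of what the `xegpu.hoist_desc_ops` description says: the descriptor is created once outside the loop, and its offset is advanced inside the loop. This is only a sketch under loud assumptions. `ToyDesc`, `offsetsBeforeHoist`, `offsetsAfterHoist` and `checkHoistPreservesOffsets` are hypothetical names, a descriptor is reduced to a single linear offset, and nothing here uses the XeGPU or MLIR APIs or reflects how the PR implements the rewrite.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a tile descriptor: just one linear offset.
// Hypothetical type, not an XeGPU or MLIR construct.
struct ToyDesc {
  std::int64_t offset;
};

// Before hoisting: a new descriptor is built in every iteration, with an
// offset that depends on the induction variable i.
std::vector<std::int64_t> offsetsBeforeHoist(std::int64_t base,
                                             std::int64_t step, int trips) {
  std::vector<std::int64_t> seen;
  for (int i = 0; i < trips; ++i) {
    ToyDesc desc{base + i * step}; // models a descriptor created in the loop
    seen.push_back(desc.offset);
  }
  return seen;
}

// After hoisting: the descriptor is built once before the loop and its
// offset is incremented at the end of each iteration.
std::vector<std::int64_t> offsetsAfterHoist(std::int64_t base,
                                            std::int64_t step, int trips) {
  std::vector<std::int64_t> seen;
  ToyDesc desc{base}; // models the hoisted descriptor creation
  for (int i = 0; i < trips; ++i) {
    seen.push_back(desc.offset);
    desc.offset += step; // models the in-loop offset update
  }
  return seen;
}

// Sanity check: both forms visit the same sequence of offsets.
void checkHoistPreservesOffsets() {
  assert(offsetsBeforeHoist(0, 32, 4) == offsetsAfterHoist(0, 32, 4));
}
```

Calling `checkHoistPreservesOffsets()` exercises both loops with the same base, step and trip count. In this model the two forms are interchangeable only because the offset is an affine function of the induction variable with a constant step, which is the case the op description refers to as a loop-dependent offset.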