[MLIR][XeGPU] Add unroll patterns and blocking pass for XeGPU (1/N) #137010
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,11 +14,67 @@ class RewritePatternSet; | |
|
||
namespace xegpu { | ||
|
||
/// Options to control the XeGPU unrolling. Its main purpose is to | ||
/// provide a way to customize the native shape of the operation. | ||
struct UnrollOptions { | ||
/// Callback function that indicates whether vector unrolling should be | ||
Review comment: nit: let's place this comment above "using" to have a uniform look :)
Reply: fixed it |
||
/// attempted on the operation. | ||
using FilterConstraintFnType = std::function<LogicalResult(Operation *op)>; | ||
FilterConstraintFnType filterConstraint = nullptr; | ||
UnrollOptions &setFilterConstraint(FilterConstraintFnType constraint) { | ||
Review comment: Is there an example/test that demonstrates usage of this option? |
||
filterConstraint = std::move(constraint); | ||
return *this; | ||
} | ||
|
||
/// Function that computes the target shape for unrolling. It returns an | ||
/// optional vector of integers representing the shape. If it returns | ||
/// `std::nullopt`, unrolling is aborted for the given operation. | ||
using NativeShapeFnType = | ||
std::function<std::optional<SmallVector<int64_t>>(Operation *op)>; | ||
NativeShapeFnType nativeShape = nullptr; | ||
UnrollOptions &setNativeShapeFn(NativeShapeFnType fn) { | ||
nativeShape = std::move(fn); | ||
return *this; | ||
} | ||
|
||
/// Function that converts a ShapedType (TensorDescType or VectorType) | ||
/// into the unrolled type based on the tileShape. It returns a vector of | ||
/// types representing the unrolled types for simplicity. | ||
using UnrolledTypeFnType = std::function<SmallVector<Type>( | ||
ShapedType type, ArrayRef<int64_t> tileShape)>; | ||
UnrolledTypeFnType getUnrolledTypes = nullptr; | ||
UnrollOptions &setUnrolledTypesFn(UnrolledTypeFnType fn) { | ||
getUnrolledTypes = std::move(fn); | ||
return *this; | ||
} | ||
}; | ||
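A review comment above asks whether there is an example that demonstrates these options. The following is only a minimal sketch of how the chained setters might be used, not code from this PR. To compile without MLIR headers it uses hypothetical stand-ins for `Operation` and `LogicalResult`, a trimmed copy of `UnrollOptions` (the `getUnrolledTypes` callback is omitted), and two made-up helpers, `makeExampleOptions` and `queryTargetShape`. The op name and the 8x16 tile are illustrative assumptions as well.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins so the sketch builds without MLIR. These are NOT
// the real mlir::Operation or mlir::LogicalResult.
struct Operation {
  std::string name;
  std::vector<int64_t> shape;
};
enum class LogicalResult { Success, Failure };

// Trimmed copy of the struct in the diff (getUnrolledTypes omitted).
struct UnrollOptions {
  using FilterConstraintFnType = std::function<LogicalResult(Operation *op)>;
  FilterConstraintFnType filterConstraint = nullptr;
  UnrollOptions &setFilterConstraint(FilterConstraintFnType constraint) {
    filterConstraint = std::move(constraint);
    return *this;
  }

  using NativeShapeFnType =
      std::function<std::optional<std::vector<int64_t>>(Operation *op)>;
  NativeShapeFnType nativeShape = nullptr;
  UnrollOptions &setNativeShapeFn(NativeShapeFnType fn) {
    nativeShape = std::move(fn);
    return *this;
  }
};

// Assumed configuration: only ops named "xegpu.load_nd" are unrolled, and
// 2-D ops get an 8x16 native shape. Both choices are illustrative.
UnrollOptions makeExampleOptions() {
  UnrollOptions options;
  options
      .setFilterConstraint([](Operation *op) {
        return op->name == "xegpu.load_nd" ? LogicalResult::Success
                                           : LogicalResult::Failure;
      })
      .setNativeShapeFn(
          [](Operation *op) -> std::optional<std::vector<int64_t>> {
            if (op->shape.size() != 2)
              return std::nullopt; // abort unrolling for this op
            return std::vector<int64_t>{8, 16};
          });
  return options;
}

// Hypothetical helper showing how a pattern could consult the options:
// the filter runs first, then the native shape callback.
std::optional<std::vector<int64_t>>
queryTargetShape(const UnrollOptions &options, Operation *op) {
  assert(op && "op must not be null");
  if (options.filterConstraint &&
      options.filterConstraint(op) == LogicalResult::Failure)
    return std::nullopt;
  if (!options.nativeShape)
    return std::nullopt;
  return options.nativeShape(op);
}
```

Because each setter returns `*this`, the two callbacks can be configured in one chained expression. An op rejected by the filter and an op for which `nativeShape` returns `std::nullopt` both end up not being unrolled in this sketch.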
|
||
/// Appends patterns for folding aliasing ops into XeGPU ops into `patterns`. | ||
void populateXeGPUFoldAliasOpsPatterns(RewritePatternSet &patterns); | ||
|
||
/// Appends patterns for XeGPU SIMT distribution into `patterns`. | ||
void populateXeGPUSubgroupDistributePatterns(RewritePatternSet &patterns); | ||
|
||
/// Collect a set of patterns to unroll XeGPU operations to smaller shapes. | |
/// Users can control whether an operation is unrolled, as well as its | |
/// target shape, via the `options` structure (by setting `filterConstraint` | |
/// and `nativeShape` respectively; both are callbacks taking `op` as | |
/// input). | |
/// An `op` is unrolled to the `targetShape` as follows, for each of its | |
/// operands: | |
/// 1. The unrolled type `unrolledType` and the number of unrolled instances | |
/// `numUnrolledInstances` are computed from the `targetShape`. | |
/// 2. Each operand is packed: ExtractStridedSlice ops are created to break up | |
/// the vector operands, and BuiltinUnrealizedCastOp ops are created to | |
/// break up the TensorDesc operands. | |
/// 3. The original op is cloned `numUnrolledInstances` times, once for each | |
/// result. | |
/// 4. The results are unpacked: InsertStridedSlice ops are inserted for | |
/// VectorType results, and BuiltinUnrealizedCastOp ops are inserted for | |
/// TensorDescType results, to re-assemble the slices into the original shape. | |
void populateXeGPUUnrollPatterns(RewritePatternSet &patterns, | ||
const UnrollOptions &options); | ||
|
||
} // namespace xegpu | ||
} // namespace mlir | ||
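Step 1 of the doc comment on `populateXeGPUUnrollPatterns` derives `numUnrolledInstances` from the `targetShape`, and steps 2 and 4 slice and re-assemble the operands at per-tile offsets. The sketch below illustrates that arithmetic on plain integer shapes. It is an assumption-laden illustration and not the PR's implementation: `Shape`, `computeNumUnrolledInstances`, and `computeTileOffsets` are hypothetical names, and requiring the target shape to divide the original shape evenly is assumed here rather than stated in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using Shape = std::vector<int64_t>;

// Number of tiles needed to cover `shape` with `targetShape`. Returns
// std::nullopt when the ranks differ or a dimension is not evenly divisible
// (an assumption of this sketch).
std::optional<int64_t> computeNumUnrolledInstances(const Shape &shape,
                                                   const Shape &targetShape) {
  if (shape.size() != targetShape.size())
    return std::nullopt;
  int64_t count = 1;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    if (targetShape[i] <= 0 || shape[i] % targetShape[i] != 0)
      return std::nullopt;
    count *= shape[i] / targetShape[i];
  }
  return count;
}

// Starting offset of every tile, in row-major order (last dimension varies
// fastest). These are the offsets at which slices would be extracted from an
// operand and later inserted back into the result.
std::vector<Shape> computeTileOffsets(const Shape &shape,
                                      const Shape &targetShape) {
  std::vector<Shape> offsets;
  std::optional<int64_t> n = computeNumUnrolledInstances(shape, targetShape);
  if (!n)
    return offsets;
  Shape current(shape.size(), 0);
  for (int64_t i = 0; i < *n; ++i) {
    offsets.push_back(current);
    // Advance like an odometer: bump the innermost dimension and carry into
    // outer dimensions on overflow.
    for (int64_t d = static_cast<int64_t>(shape.size()) - 1; d >= 0; --d) {
      current[d] += targetShape[d];
      if (current[d] < shape[d])
        break;
      current[d] = 0;
    }
  }
  return offsets;
}
```

For a 16x32 operand and an 8x16 target shape this yields four instances, at offsets (0, 0), (0, 16), (8, 0), and (8, 16). The original op would then be cloned once per offset, as in step 3.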
|
||
|