[mlir][xegpu] SIMT distribution patterns for XeGPU CreateNdTdesc, LoadNd, StoreNd and Dpas Ops. #135271

Merged Apr 30, 2025 (101 commits)

Commits
39dcf9d
save work
charithaintc Mar 18, 2025
2058773
moving all ops to region working
charithaintc Mar 20, 2025
14233fa
moving all ops to region working
charithaintc Mar 20, 2025
f599873
save work
charithaintc Mar 20, 2025
220ed1f
save work
charithaintc Mar 21, 2025
2a8070f
save work
charithaintc Mar 21, 2025
4838b52
extend sg_map from subgroup to workgroup
chencha3 Mar 21, 2025
cb26979
format code
chencha3 Mar 21, 2025
273fc40
remove changes to prefetch op
chencha3 Mar 21, 2025
504d274
refine the doc for TensorDesc
chencha3 Mar 21, 2025
90e0704
save work
charithaintc Mar 21, 2025
3abe7cb
save work
charithaintc Mar 21, 2025
7c87319
Merge branch 'main' into xegpu_simt_dist
charithaintc Mar 21, 2025
596c953
update doc
chencha3 Mar 21, 2025
2065764
save work
charithaintc Mar 21, 2025
899439b
refine docs
chencha3 Mar 24, 2025
8636d15
refine docs
chencha3 Mar 24, 2025
0190418
refine util
chencha3 Mar 24, 2025
32f9272
refine convert_layout docs
chencha3 Mar 24, 2025
fe11c79
save work
charithaintc Mar 24, 2025
6e1ef3e
save work
charithaintc Mar 24, 2025
55c272c
save work
charithaintc Mar 25, 2025
ee56a3e
Merge branch 'gpu_dialect_changes' into xegpu_simt_dist
charithaintc Mar 25, 2025
1ffe5c8
save work
charithaintc Mar 26, 2025
e5521f9
save work before merging with Chao's PR
charithaintc Mar 27, 2025
350b581
Merge branch 'users/chencha3/xegpu/extend_sg_map' into xegpu_simt_dist
charithaintc Mar 27, 2025
5700c81
merge xegpu changes
charithaintc Mar 29, 2025
1619fcf
Merge branch 'main' into xegpu_simt_dist
charithaintc Mar 31, 2025
2334a97
refactor names
charithaintc Mar 31, 2025
9bddeb6
drop ScopeAttr and refine 1D layout support
chencha3 Apr 1, 2025
784ab38
refine isEvenDistributed
chencha3 Apr 1, 2025
28cf69e
format code
chencha3 Apr 1, 2025
930f1ab
Merge branch 'main' into extend_sg_map
chencha3 Apr 1, 2025
9ed0f87
fix format issue
chencha3 Apr 1, 2025
3b389bf
add 1D layout examples
chencha3 Apr 1, 2025
589d217
refactor names
charithaintc Apr 2, 2025
8b647c4
Merge branch 'users/chencha3/xegpu/extend_sg_map' into xegpu_simt_dist
charithaintc Apr 2, 2025
c6ccef2
refactor
charithaintc Apr 2, 2025
cbd0af0
refine LayoutAttr verifier
chencha3 Apr 4, 2025
3fb4fd4
add unit test
chencha3 Apr 4, 2025
77fdfef
remove dump file
chencha3 Apr 4, 2025
2751332
fix typo
chencha3 Apr 4, 2025
2a16d11
Merge branch 'main' into extend_sg_map
chencha3 Apr 4, 2025
d281a14
fix an error after merging with main
chencha3 Apr 4, 2025
fb28ce8
new line at the end of file
chencha3 Apr 7, 2025
f464662
update doc
chencha3 Apr 8, 2025
eea3c35
Merge branch 'main' into extend_sg_map
chencha3 Apr 8, 2025
7acc56d
Merge branch 'users/chencha3/xegpu/extend_sg_map' into xegpu_simt_dist
charithaintc Apr 8, 2025
270b498
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 9, 2025
2a1d373
Switch to 1D representation for SIMT
chencha3 Apr 10, 2025
2159119
refine verifier for load_nd and store_nd
chencha3 Apr 10, 2025
21f50c0
fix issues
charithaintc Apr 10, 2025
35f9cbe
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 10, 2025
c81b2e0
fix issues
charithaintc Apr 10, 2025
03bfe08
Merge branch 'users/chencha3/xegpu/xegpu_simt_2d_to_1d' into xegpu_si…
charithaintc Apr 11, 2025
2f2ec10
fix issues
charithaintc Apr 14, 2025
2ae3543
fix issues
charithaintc Apr 14, 2025
4c63916
fix issues
charithaintc Apr 14, 2025
2d9cfa3
fix build issue
charithaintc Apr 15, 2025
775d039
refine verifier for gather/scatter
chencha3 Apr 15, 2025
5520ce1
update comments
chencha3 Apr 15, 2025
6abc12a
fix tests
charithaintc Apr 15, 2025
379e186
fix
charithaintc Apr 16, 2025
aa7dbe1
fix
charithaintc Apr 16, 2025
dce6d2a
Merge branch 'users/chencha3/xegpu/xegpu_simt_2d_to_1d' into xegpu_si…
charithaintc Apr 16, 2025
ca5c7e9
fix comments
charithaintc Apr 16, 2025
ed3119c
fix comments
charithaintc Apr 16, 2025
c898de6
fix comments
charithaintc Apr 17, 2025
55be710
fix comments
charithaintc Apr 17, 2025
6e8888a
fix
charithaintc Apr 18, 2025
6ae7aa0
fix
charithaintc Apr 18, 2025
2896b34
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 18, 2025
68b1750
fix
charithaintc Apr 18, 2025
9391696
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 22, 2025
b3e6dc5
save work
charithaintc Apr 22, 2025
5c1c908
save work
charithaintc Apr 23, 2025
7ad625d
save work
charithaintc Apr 23, 2025
d6c0722
save work
charithaintc Apr 24, 2025
d879f8c
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 24, 2025
5f0d164
save work
charithaintc Apr 24, 2025
ac0d93e
save work
charithaintc Apr 24, 2025
90543a0
add missing files
charithaintc Apr 24, 2025
a72ec25
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 28, 2025
bf9c0ab
save work
charithaintc Apr 28, 2025
9b28449
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 29, 2025
1464adb
address comments
charithaintc Apr 29, 2025
14468b5
address comments
charithaintc Apr 29, 2025
b84c2f9
address comments
charithaintc Apr 29, 2025
bff1f5e
address comments
charithaintc Apr 29, 2025
36206bb
address comments
charithaintc Apr 29, 2025
0328684
save work
charithaintc Apr 30, 2025
466712f
save work
charithaintc Apr 30, 2025
0baef66
save work
charithaintc Apr 30, 2025
cadc078
save work
charithaintc Apr 30, 2025
24635e0
add missing lib
charithaintc Apr 30, 2025
bed781b
add missing lib
charithaintc Apr 30, 2025
afdf394
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 30, 2025
84afb20
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 30, 2025
7e0d753
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 30, 2025
d4abd69
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 30, 2025
3722dec
Merge branch 'main' into xegpu_simt_dist
charithaintc Apr 30, 2025
5 changes: 0 additions & 5 deletions mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td
@@ -189,11 +189,6 @@ def XeGPU_TensorDesc: XeGPUTypeDef<"TensorDesc", "tensor_desc",
return scatter_attr.getChunkSize().getInt();
return 1;
}

// This returns a vector type that represents the fragment of data owned by
// a work item in SIMT mode if this tensor descriptor is used in a XeGPU
// load/store operation.
FailureOr<VectorType> getDistributedVectorType();
}];

let hasCustomAssemblyFormat = true;
2 changes: 2 additions & 0 deletions mlir/include/mlir/Dialect/XeGPU/Transforms/Transforms.h
@@ -16,6 +16,8 @@ namespace xegpu {

/// Appends patterns for folding aliasing ops into XeGPU ops into `patterns`.
void populateXeGPUFoldAliasOpsPatterns(RewritePatternSet &patterns);
/// Appends patterns for XeGPU SIMT distribution into `patterns`.
void populateXeGPUSubgroupDistributePatterns(RewritePatternSet &patterns);

} // namespace xegpu
} // namespace mlir
57 changes: 57 additions & 0 deletions mlir/include/mlir/Dialect/XeGPU/Utils/XeGPUUtils.h
@@ -0,0 +1,57 @@
//===- XeGPUUtils.h - Vector Utilities --------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef MLIR_DIALECT_XEGPU_UTILS_XEGPUUTILS_H_
#define MLIR_DIALECT_XEGPU_UTILS_XEGPUUTILS_H_

#include "mlir/IR/BuiltinTypes.h"
namespace mlir {

class VectorType;
namespace xegpu {
class LayoutAttr;
class TensorDescType;
} // namespace xegpu

namespace xegpu {

/// If the tensor descriptor has a layout attribute, it is used in SIMT mode.
/// In this mode, the distributed vector shape is determined as follows:
/// Definitions:
/// lane_data_size = lane_data[0] × lane_data[1]
/// subgroup_size = lane_layout[0] × lane_layout[1]
/// distribution_unit_size = subgroup_size × lane_data_size
///
/// Case 1: Regular loads/stores.
/// The following conditions must be met:
/// * tensor_desc[0] == lane_layout[0]
/// Distributed vector is a 1D vector with shape:
/// [chunk_size]
///
/// Case 2: Block loads/stores
/// Additional definitions:
/// tensor_size = tensor_desc[0] * .. * tensor_desc[r-1] * array_length
/// n_distribution_units = tensor_size / distribution_unit_size
/// fragment_size = n_distribution_units * lane_data_size
/// Given the above definitions, the following conditions must be met:
/// * tensor_desc[0] % (lane_layout[0] × lane_data[0]) == 0
/// * tensor_desc[1] % (lane_layout[1] × lane_data[1]) == 0
/// Distributed vector is a 1D vector with shape:
/// [fragment_size]
FailureOr<VectorType> getDistributedVectorType(xegpu::TensorDescType tdescTy);

/// Helper to get the distributed vector type for a given vector type according
/// to a given LayoutAttr.
FailureOr<VectorType> getDistributedVectorType(VectorType originalType,
LayoutAttr layout);

} // namespace xegpu

} // namespace mlir

#endif // MLIR_DIALECT_XEGPU_UTILS_XEGPUUTILS_H_
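
The shape rules documented in the header comment above can be checked by hand with a small standalone sketch. The code below is not the MLIR implementation: `LaneLayoutSketch`, `blockFragmentSize`, and `scatteredFragmentSize` are hypothetical names introduced only for illustration, and plain `std::vector` values stand in for `TensorDescType` and `LayoutAttr`. It assumes that `fragment_size = n_distribution_units * lane_data_size` reduces to `tensor_size / subgroup_size`, which follows from the definitions in the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the lane_layout / lane_data fields of a layout.
struct LaneLayoutSketch {
  std::vector<int64_t> laneLayout; // e.g. {1, 16}
  std::vector<int64_t> laneData;   // e.g. {1, 1}
};

// Case 2 (block loads/stores): returns fragment_size, or std::nullopt when
// the descriptor shape is not distributable under the documented conditions.
std::optional<int64_t> blockFragmentSize(const std::vector<int64_t> &tdescShape,
                                         const LaneLayoutSketch &layout,
                                         int64_t arrayLength = 1) {
  if (tdescShape.size() != layout.laneLayout.size() ||
      tdescShape.size() != layout.laneData.size())
    return std::nullopt;
  int64_t tensorSize = arrayLength; // tensor_size includes array_length
  int64_t subgroupSize = 1;
  for (std::size_t i = 0; i < tdescShape.size(); ++i) {
    int64_t unit = layout.laneLayout[i] * layout.laneData[i];
    // tensor_desc[i] % (lane_layout[i] * lane_data[i]) must be 0.
    if (unit <= 0 || tdescShape[i] % unit != 0)
      return std::nullopt;
    tensorSize *= tdescShape[i];
    subgroupSize *= layout.laneLayout[i];
  }
  // n_distribution_units * lane_data_size == tensor_size / subgroup_size.
  return tensorSize / subgroupSize;
}

// Case 1 (regular loads/stores): the per-lane vector holds chunk_size
// elements, provided tensor_desc[0] == lane_layout[0].
std::optional<int64_t>
scatteredFragmentSize(const std::vector<int64_t> &tdescShape,
                      const LaneLayoutSketch &layout, int64_t chunkSize) {
  if (tdescShape.empty() || layout.laneLayout.empty() ||
      tdescShape[0] != layout.laneLayout[0])
    return std::nullopt;
  return chunkSize;
}
```

For example, an 8x16 descriptor with `lane_layout = [1, 16]` and `lane_data = [1, 1]` gives 128 / 16 = 8 elements per lane, while an 8x8 descriptor with the same layout is rejected because 8 is not a multiple of 16.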
1 change: 1 addition & 0 deletions mlir/lib/Dialect/XeGPU/CMakeLists.txt
@@ -1,2 +1,3 @@
add_subdirectory(IR)
add_subdirectory(Transforms)
add_subdirectory(Utils)
68 changes: 0 additions & 68 deletions mlir/lib/Dialect/XeGPU/IR/XeGPUDialect.cpp
@@ -376,74 +376,6 @@ LogicalResult TensorDescType::verify(
return success();
}

// If the tensor descriptor has a layout attribute, it is used in SIMT mode.
// In this mode, the distributed vector shape is determined as follows:
// Definitions:
// lane_data_size = lane_data[0] × lane_data[1]
// subgroup_size = lane_layout[0] × lane_layout[1]
// distribution_unit_size = subgroup_size × lane_data_size
// ---------------------------------------------------------------------
// Case 1: Regular loads/stores.
// ---------------------------------------------------------------------
// The following conditions must be met:
// * tensor_desc[0] == lane_layout[0]
// Distributed vector is a 1D vector with shape:
// [chunk_size]
// ---------------------------------------------------------------------
// Case 2: Block loads/stores
// ---------------------------------------------------------------------
// Additional definitions:
// tensor_size = tensor_desc[0] * .. * tensor_desc[r-1] * array_length
// n_distribution_units = tensor_size / distribution_unit_size
// fragment_size = n_distribution_units * lane_data_size
// Given the above definitions, the following conditions must be met:
// * tensor_desc[0] % (lane_layout[0] × lane_data[0]) == 0
// * tensor_desc[1] % (lane_layout[1] × lane_data[1]) == 0
// Distributed vector is a 1D vector with shape:
// [fragment_size]
FailureOr<VectorType> TensorDescType::getDistributedVectorType() {
auto layout = llvm::dyn_cast_if_present<LayoutAttr>(getLayout());
// It only works for subgroup level layout, which only has lane_layout
// and lane_data, and is to distribute a SIMD code into SIMT code.
if (!layout || !layout.isSgLayout())
return failure();

SmallVector<int64_t> laneData(layout.getLaneData().asArrayRef());
SmallVector<int64_t> laneLayout(layout.getLaneLayout().asArrayRef());
auto tdescShape = getShape();

// compute sgSize by multiplying the elements of laneLayout
// e.g. for 2D layout, sgSize = laneLayout[0] * laneLayout[1]
// e.g. for 1D layout, sgSize = laneLayout[0]
auto sgSize = std::accumulate(laneLayout.begin(), laneLayout.end(), 1,
std::multiplies<int64_t>());

// Case 1: regular loads/stores
auto scatterAttr = getEncodingAsScatterTensorDescAttr();
if (scatterAttr) {
auto chunkSize = scatterAttr.getChunkSize().getInt();
// Verify if the first dimension of the tensor descriptor shape is
// distributable.
assert(tdescShape[0] == laneLayout[0] &&
"tensor descriptor shape is not distributable");
return VectorType::get({chunkSize}, getElementType());
}

// Case 2: block loads/stores
// Check if the tensor descriptor shape is distributable.
int64_t tensorSize = 1;
for (auto [tdescDim, laneDim, laneDataDim] :
llvm::zip_equal(tdescShape, laneLayout, laneData)) {
assert((tdescDim % (laneDim * laneDataDim) == 0) &&
"tensor descriptor shape is not distributable");
tensorSize *= tdescDim;
}
// tensorSize must be adjusted for array_length.
tensorSize *= getArrayLength();

return VectorType::get({tensorSize / sgSize}, getElementType());
}

} // namespace xegpu
} // namespace mlir
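
One detail of the removed `sgSize` computation is worth noting: `std::accumulate` accumulates in the type of its initial value, so seeding it with the literal `1` makes the running product an `int` even though `std::multiplies<int64_t>` is supplied. For realistic lane layouts (a product of 16 or 32) this is harmless. A minimal sketch of the 64-bit variant follows, with `subgroupSizeSketch` as a hypothetical helper name that is not part of the PR:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Product of all lane_layout entries, accumulated as int64_t throughout.
// The int64_t{1} seed (rather than the int literal 1) fixes the accumulator
// type, so intermediate products are not narrowed to int.
int64_t subgroupSizeSketch(const std::vector<int64_t> &laneLayout) {
  return std::accumulate(laneLayout.begin(), laneLayout.end(), int64_t{1},
                         std::multiplies<int64_t>());
}
```

With this seed, a 2D layout such as `[1, 16]` and a 1D layout such as `[16]` both yield 16, matching the examples in the removed comment.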

3 changes: 3 additions & 0 deletions mlir/lib/Dialect/XeGPU/Transforms/CMakeLists.txt
@@ -16,4 +16,7 @@ add_mlir_dialect_library(MLIRXeGPUTransforms
MLIRPass
MLIRTransforms
MLIRGPUDialect
MLIRXeGPUUtils
MLIRGPUUtils
MLIRVectorTransforms
)