-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[Matrix] Use shape info for StoreInst directly. #142664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ShapeInfo for the store operand may be dropped, e.g. because the operand got folded by transpose optimizations to another instruction w/o shape info. This was exposed by the assertion added in llvm#142416. This updates VisitStore to use the shape-info directly from the instruction, which is in line with the other Visit* functions and ensures that we won't lose shape info.
@llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesShapeInfo for the store operand may be dropped, e.g. because the operand got folded by transpose optimizations to another instruction w/o shape info. This was exposed by the assertion added in This updates VisitStore to use the shape-info directly from the instruction, which is in line with the other Visit* functions and ensures that we won't lose shape info. Full diff: https://github.com/llvm/llvm-project/pull/142664.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp b/llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp
index fb5e081acf7c5..1c3d22924795b 100644
--- a/llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp
+++ b/llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp
@@ -2117,7 +2117,7 @@ class LowerMatrixIntrinsics {
bool VisitStore(StoreInst *Inst, Value *StoredVal, Value *Ptr,
IRBuilder<> &Builder) {
- auto I = ShapeMap.find(StoredVal);
+ auto I = ShapeMap.find(Inst);
assert(I != ShapeMap.end() &&
"must only visit instructions with shape info");
LowerStore(Inst, StoredVal, Ptr, Inst->getAlign(),
diff --git a/llvm/test/Transforms/LowerMatrixIntrinsics/transpose-fold-store.ll b/llvm/test/Transforms/LowerMatrixIntrinsics/transpose-fold-store.ll
new file mode 100644
index 0000000000000..3a01c516f4b70
--- /dev/null
+++ b/llvm/test/Transforms/LowerMatrixIntrinsics/transpose-fold-store.ll
@@ -0,0 +1,45 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -p lower-matrix-intrinsics -S %s | FileCheck %s
+
+define void @redundant_transpose_of_shuffle(<4 x float> %m, ptr %dst) {
+; CHECK-LABEL: define void @redundant_transpose_of_shuffle(
+; CHECK-SAME: <4 x float> [[M:%.*]], ptr [[DST:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHUFFLE:%.*]] = shufflevector <4 x float> [[M]], <4 x float> zeroinitializer, <4 x i32> zeroinitializer
+; CHECK-NEXT: [[SPLIT:%.*]] = shufflevector <4 x float> [[SHUFFLE]], <4 x float> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT: store <4 x float> [[SPLIT]], ptr [[DST]], align 4
+; CHECK-NEXT: ret void
+;
+entry:
+ %shuffle = shufflevector <4 x float> %m, <4 x float> zeroinitializer, <4 x i32> zeroinitializer
+ %t = tail call <4 x float> @llvm.matrix.transpose.v3f32(<4 x float> %shuffle, i32 1, i32 4)
+ store <4 x float> %t, ptr %dst, align 4
+ ret void
+}
+
+define void @redundant_transpose_of_shuffle_2(<4 x float> %m, ptr %dst) {
+; CHECK-LABEL: define void @redundant_transpose_of_shuffle_2(
+; CHECK-SAME: <4 x float> [[M:%.*]], ptr [[DST:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SHUFFLE:%.*]] = shufflevector <4 x float> [[M]], <4 x float> zeroinitializer, <4 x i32> zeroinitializer
+; CHECK-NEXT: [[SPLIT:%.*]] = shufflevector <4 x float> [[SHUFFLE]], <4 x float> poison, <1 x i32> zeroinitializer
+; CHECK-NEXT: [[SPLIT1:%.*]] = shufflevector <4 x float> [[SHUFFLE]], <4 x float> poison, <1 x i32> <i32 1>
+; CHECK-NEXT: [[SPLIT2:%.*]] = shufflevector <4 x float> [[SHUFFLE]], <4 x float> poison, <1 x i32> <i32 2>
+; CHECK-NEXT: [[SPLIT3:%.*]] = shufflevector <4 x float> [[SHUFFLE]], <4 x float> poison, <1 x i32> <i32 3>
+; CHECK-NEXT: store <1 x float> [[SPLIT]], ptr [[DST]], align 4
+; CHECK-NEXT: [[VEC_GEP:%.*]] = getelementptr float, ptr [[DST]], i64 1
+; CHECK-NEXT: store <1 x float> [[SPLIT1]], ptr [[VEC_GEP]], align 4
+; CHECK-NEXT: [[VEC_GEP4:%.*]] = getelementptr float, ptr [[DST]], i64 2
+; CHECK-NEXT: store <1 x float> [[SPLIT2]], ptr [[VEC_GEP4]], align 4
+; CHECK-NEXT: [[VEC_GEP5:%.*]] = getelementptr float, ptr [[DST]], i64 3
+; CHECK-NEXT: store <1 x float> [[SPLIT3]], ptr [[VEC_GEP5]], align 4
+; CHECK-NEXT: ret void
+;
+entry:
+ %shuffle = shufflevector <4 x float> %m, <4 x float> zeroinitializer, <4 x i32> zeroinitializer
+ %t = tail call <4 x float> @llvm.matrix.transpose.v3f32(<4 x float> %shuffle, i32 4, i32 1)
+ store <4 x float> %t, ptr %dst, align 4
+ ret void
+}
+
+declare <4 x float> @llvm.matrix.transpose.v3f32(<4 x float>, i32, i32)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good find!
ShapeInfo for the store operand may be dropped, e.g. because the operand got folded by transpose optimizations to another instruction w/o shape info. This was exposed by the assertion added in llvm/llvm-project#142416. This updates VisitStore to use the shape-info directly from the instruction, which is in line with the other Visit* functions and ensures that we won't lose shape info. PR: llvm/llvm-project#142664
ShapeInfo for the store operand may be dropped, e.g. because the operand got folded by transpose optimizations to another instruction w/o shape info. This was exposed by the assertion added in
#142416.
This updates VisitStore to use the shape-info directly from the instruction, which is in line with the other Visit* functions and ensures that we won't lose shape info.