Skip to content

[mlir][vector] Support complete folding in single pass for vector.insert/vector.extract #142124

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

yangtetris
Copy link

Description

This patch improves the folding efficiency of vector.insert and vector.extract operations by not returning early after successfully converting dynamic indices to static indices.

Motivation

Since the OpBuilder::createOrFold function only calls fold once, the current fold methods of vector.insert and vector.extract may leave the op in a state that can be folded further. For example, consider the following un-folded IR:

%v1 = vector.insert %e1, %v0 [0] : f32 into vector<128xf32>
%c0 = arith.constant 0 : index
%e2 = vector.extract %v1[%c0] : f32 from vector<128xf32>

If we use createOrFold to create the vector.extract op, then the result will be:

%v1 = vector.insert %e1, %v0 [127] : f32 into vector<128xf32>
%e2 = vector.extract %v1[0] : f32 from vector<128xf32>

But this is not the optimal result. createOrFold should have returned %e1.
The reason is that the execution of fold returns immediately after extractInsertFoldConstantOp, causing subsequent folding logics to be skipped.

…ert/vector.extract

After successfully converting dynamic indices to static indices, continue folding
instead of returning early, allowing subsequent fold operations to be executed.
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented May 30, 2025

@llvm/pr-subscribers-mlir-vector

@llvm/pr-subscribers-mlir

Author: Yang Bai (yangtetris)

Changes

Description

This patch improves the folding efficiency of vector.insert and vector.extract operations by not returning early after successfully converting dynamic indices to static indices.

Motivation

Since the OpBuilder::createOrFold function only calls fold once, the current fold methods of vector.insert and vector.extract may leave the op in a state that can be folded further. For example, consider the following un-folded IR:

%v1 = vector.insert %e1, %v0 [0] : f32 into vector&lt;128xf32&gt;
%c0 = arith.constant 0 : index
%e2 = vector.extract %v1[%c0] : f32 from vector&lt;128xf32&gt;

If we use createOrFold to create the vector.extract op, then the result will be:

%v1 = vector.insert %e1, %v0 [127] : f32 into vector&lt;128xf32&gt;
%e2 = vector.extract %v1[0] : f32 from vector&lt;128xf32&gt;

But this is not the optimal result. createOrFold should have returned %e1.
The reason is that the execution of fold returns immediately after extractInsertFoldConstantOp, causing subsequent folding logics to be skipped.


Full diff: https://github.com/llvm/llvm-project/pull/142124.diff

1 Files Affected:

  • (modified) mlir/lib/Dialect/Vector/IR/VectorOps.cpp (+8-6)
diff --git a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
index 890a5e9e5c9b4..2e0c917b2139d 100644
--- a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+++ b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -2062,6 +2062,7 @@ static Value extractInsertFoldConstantOp(OpType op, AdaptorType adaptor,
   if (opChange) {
     op.setStaticPosition(staticPosition);
     op.getOperation()->setOperands(operands);
+    // Return the original result to indicate an in-place folding happened.
     return op.getResult();
   }
   return {};
@@ -2148,8 +2149,8 @@ OpFoldResult ExtractOp::fold(FoldAdaptor adaptor) {
   // Fold `arith.constant` indices into the `vector.extract` operation. Make
   // sure that patterns requiring constant indices are added after this fold.
   SmallVector<Value> operands = {getVector()};
-  if (auto val = extractInsertFoldConstantOp(*this, adaptor, operands))
-    return val;
+  auto inplaceFolded = extractInsertFoldConstantOp(*this, adaptor, operands);
+
   if (auto res = foldPoisonIndexInsertExtractOp(
           getContext(), adaptor.getStaticPosition(), kPoisonIndex))
     return res;
@@ -2171,7 +2172,8 @@ OpFoldResult ExtractOp::fold(FoldAdaptor adaptor) {
     return val;
   if (auto val = foldScalarExtractFromFromElements(*this))
     return val;
-  return OpFoldResult();
+
+  return inplaceFolded;
 }
 
 namespace {
@@ -3150,8 +3152,8 @@ OpFoldResult vector::InsertOp::fold(FoldAdaptor adaptor) {
   // Fold `arith.constant` indices into the `vector.insert` operation. Make
   // sure that patterns requiring constant indices are added after this fold.
   SmallVector<Value> operands = {getValueToStore(), getDest()};
-  if (auto val = extractInsertFoldConstantOp(*this, adaptor, operands))
-    return val;
+  auto inplaceFolded = extractInsertFoldConstantOp(*this, adaptor, operands);
+
   if (auto res = foldPoisonIndexInsertExtractOp(
           getContext(), adaptor.getStaticPosition(), kPoisonIndex))
     return res;
@@ -3161,7 +3163,7 @@ OpFoldResult vector::InsertOp::fold(FoldAdaptor adaptor) {
     return res;
   }
 
-  return {};
+  return inplaceFolded;
 }
 
 //===----------------------------------------------------------------------===//

@yangtetris
Copy link
Author

The existing Dialect/Vector/canonicalize.mlir file doesn't seem to be a good place to add test for this change, as canonicalization performs repeated folding, making it impossible to verify whether the optimization occurs in a single pass. Any suggestions on where/how to test this would be appreciated. Thanks!

Copy link
Collaborator

@joker-eph joker-eph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please provide a test for this?

@dcaballe
Copy link
Contributor

Thanks for fixing this problem!

For testing, would it make sense to add a pass similar to --canonicalize that just invokes tryToFold once for every op? That would be very helpful to separate folding testing from canonicalization testing.

@yangtetris
Copy link
Author

Can you please provide a test for this?

I just added two tests in test/Dialect/Vector/constant-fold.mlir. Please take another look.

@yangtetris
Copy link
Author

For testing, would it make sense to add a pass similar to --canonicalize that just invokes tryToFold once for every op? That would be very helpful to separate folding testing from canonicalization testing.

I happened to see an existing pass that implements what you said. Do you think it is a good choice to use TestConstantFold for folding tests?

@dcaballe
Copy link
Contributor

dcaballe commented Jun 3, 2025

I was not aware of that pass. It looks like pretty focus on folding constants only. I'm also surprised that we don't have a pass to test the op folder in isolation. @joker-eph, @River707, @jpienaar, do you know?

@joker-eph
Copy link
Collaborator

I don't see a focus on constant for this pass: it only has some special handling to cleanup constants after it's done as far as I can tell. Otherwise it is a single traversal applying the folder.

Copy link
Contributor

@dcaballe dcaballe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, LGTM!

@dcaballe dcaballe requested a review from newling June 5, 2025 17:38
Copy link
Contributor

@newling newling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! LGTM.

I've confirmed that this does as expected by running

mlir-opt input.mlir
mlir-opt input.mlir -test-constant-fold
mlir-opt input.mlir -test-constant-fold -test-constant-fold

before and after.

The 'constant' in the pass and test file name are indeed a bit confusing, but probably not worth the churn of changing at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants