
fix(mlir): loosen restrictions on folding dynamic reshapes #8


Merged

Conversation

@AGindinson commented Apr 28, 2025

The main idea behind the change is to allow expand-of-collapse folds for reshapes like `?x?xk` -> `?` (k > 1). The rationale here is that the expand op must have a coherent index/affine expression specified in its `output_shape` argument (see the example below); if it doesn't, the IR has already been invalidated at an earlier stage:

```
%c32 = arith.constant 32 : index
%div = arith.divsi %<some_index>, %c32 : index
%collapsed = tensor.collapse_shape %41#1 [[0], [1, 2], [3, 4]]
             : tensor<9x?x32x?x32xf32> into tensor<9x?x?xf32>
%affine = affine.apply affine_map<()[s0] -> (s0 * 32)> ()[%div]
%expanded = tensor.expand_shape %collapsed [[0], [1, 2], [3]] output_shape [9, %div, 32, %affine]
             : tensor<9x?x?xf32> into tensor<9x?x32x?xf32>
```

On the above assumption, adjust the routine in `getReassociationIndicesForCollapse()` to allow dynamic reshapes beyond just `?x...x?x1x...x1` -> `?`.
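
To make the loosened condition concrete, here is a minimal C++ sketch of the per-group check (hypothetical names, a reconstruction for illustration rather than the actual patch): previously a source group could only collapse into a dynamic result dim if all of its static dims were units, whereas now any group containing a dynamic dim qualifies, on the coherence assumption above.

```cpp
#include "mlir/IR/BuiltinTypes.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"

using namespace mlir;

// Previous condition (reconstruction): every static dim in a group
// collapsing into a dynamic result dim had to be a unit dim, i.e. only
// ?x...x?x1x...x1 -> ? was foldable.
static bool wasFoldableDynamicGroup(llvm::ArrayRef<int64_t> group) {
  return llvm::any_of(group, ShapedType::isDynamic) &&
         llvm::all_of(group, [](int64_t dim) {
           return ShapedType::isDynamic(dim) || dim == 1;
         });
}

// Loosened condition: static factors k > 1 are allowed as well (e.g.
// ?x?x32 -> ?); the runtime split is taken on trust from the expand op's
// output_shape operand, per the rationale above.
static bool isFoldableDynamicGroup(llvm::ArrayRef<int64_t> group) {
  return llvm::any_of(group, ShapedType::isDynamic);
}
```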

Moreover, the reassociation util was refactored to clearly distinguish between dynamic and static subshapes. A few known caveats are noted in a comment: I don't think it's possible to fold all qualifying dynamic shape patterns in a deterministic way without also looking into the affine expressions, which would be difficult to maintain in a single general utility shared by all reliant passes and would therefore require a larger refactor.
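
For a rough picture of that dynamic/static distinction, here is a hedged sketch of the static side of the walk (hypothetical helper and names, not the actual utility): static source dims are greedily accumulated until their product reaches the static target dim, and a dynamic source dim encountered mid-run fails the match, since dynamic dims may only fold into dynamic groups. The greedy choice is exactly where the ambiguity caveats come in.

```cpp
#include "mlir/Dialect/Utils/ReshapeOpsUtils.h" // ReassociationIndices
#include "mlir/IR/BuiltinTypes.h"
#include "llvm/ADT/ArrayRef.h"

using namespace mlir;

// Hypothetical helper: greedily match a run of static source dims against
// one static target dim by accumulating their product.
static bool matchStaticSubshape(llvm::ArrayRef<int64_t> srcShape,
                                unsigned &srcIdx, int64_t targetDim,
                                ReassociationIndices &group) {
  int64_t product = 1;
  do {
    if (srcIdx == srcShape.size())
      return false;
    int64_t dim = srcShape[srcIdx];
    // Dynamic dims can only fold into dynamic target groups.
    if (ShapedType::isDynamic(dim))
      return false;
    product *= dim;
    group.push_back(srcIdx++);
  } while (product < targetDim);
  // A greedy scan stops at the first exact product match; trailing unit
  // dims and ambiguous factorizations are among the caveats that can't be
  // resolved without inspecting the affine expressions.
  return product == targetDim;
}
```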

@AGindinson force-pushed the artem/expand-of-collapse branch 2 times, most recently from b7f9d88 to 21ff423, on April 29, 2025 11:27
@AGindinson (Author) commented Apr 29, 2025

This stack of pull requests is managed by Graphite.

@AGindinson marked this pull request as ready for review April 29, 2025 11:29
@AGindinson requested a review from a team April 29, 2025 11:29
@AGindinson requested a review from maxbartel April 30, 2025 08:50
@maxbartel left a comment

From my limited view the algorithm seems correct. Let's open an upstream PR for this!


The placement of the file is super weird, btw. I guess Tensor and Memref are using this, and it is in this folder for historic reasons. Nowadays people would implement an interface, I guess.

@AGindinson (Author)

Yeah, all the cases that I've debugged through were pretty much Tensor-specific. Even if it's significant for Memref, I'd have an argument or two for making the implementations dialect- or even pattern-specific, looking at how ComposeCollapseOfExpand moved to very minimal internal logic. Curious what the community thinks of it.

@AGindinson merged commit c79c852 into integrate-llvm-project-20250403 Apr 30, 2025
6 checks passed