git version: 2172a5e
system: Ubuntu 18.04.6 LTS
Description:
I get inconsistent results when executing the same MLIR program with and without --test-loop-fusion="test-loop-fusion-transformation".
Steps to Reproduce:
1. MLIR Program (a.mlir):
module {
  func.func private @printMemrefI32(tensor<*xi32>)
  func.func private @printMemrefF32(tensor<*xf32>)
  func.func @entry(%arg0: index) -> tensor<1x2xi32> {
    %11 = "tosa.const"() <{values = dense<0> : tensor<1x2x2xi32>}> : () -> tensor<1x2x2xi32>
    %12 = tosa.while_loop (%arg1 = %11) : (tensor<1x2x2xi32>) -> tensor<1x2x2xi32> {
      %51 = "tosa.const"() <{values = dense<6> : tensor<1x2x2xi32>}> : () -> tensor<1x2x2xi32>
      %52 = tosa.greater %51, %arg1 : (tensor<1x2x2xi32>, tensor<1x2x2xi32>) -> tensor<1x2x2xi1>
      %extracted = tensor.extract %52[%arg0, %arg0, %arg0] : tensor<1x2x2xi1>
      %from_elements = tensor.from_elements %extracted : tensor<i1>
      tosa.yield %from_elements : tensor<i1>
    } do {
    ^bb0(%arg1: tensor<1x2x2xi32>):
      %51 = "tosa.const"() <{values = dense<1> : tensor<1x2x2xi32>}> : () -> tensor<1x2x2xi32>
      %52 = tosa.add %arg1, %51 : (tensor<1x2x2xi32>, tensor<1x2x2xi32>) -> tensor<1x2x2xi32>
      tosa.yield %52 : tensor<1x2x2xi32>
    }
    %50 = tosa.argmax %12 {axis = 1 : i32} : (tensor<1x2x2xi32>) -> tensor<1x2xi32>
    return %50 : tensor<1x2xi32>
  }
  func.func @main() {
    %idx0 = index.constant 0
    %0 = call @entry(%idx0) : (index) -> tensor<1x2xi32>
    %cast = tensor.cast %0 : tensor<1x2xi32> to tensor<*xi32>
    call @printMemrefI32(%cast) : (tensor<*xi32>) -> ()
    return
  }
}
2. Command to run without --test-loop-fusion:
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt /data/szy/workspace/mlir-inconsistent/a.mlir -tosa-to-scf \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -tosa-to-arith -convert-scf-to-cf -convert-arith-to-llvm \
-convert-linalg-to-loops -convert-linalg-to-parallel-loops -convert-linalg-to-loops -one-shot-bufferize="bufferize-function-boundaries" \
--expand-strided-metadata -convert-linalg-to-affine-loops -finalize-memref-to-llvm \
-lower-affine -convert-scf-to-cf -convert-cf-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -convert-index-to-llvm -convert-arith-to-llvm \
-reconcile-unrealized-casts | timeout 10 /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main -entry-point-result=void \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_async_runtime.so
3. Output without --test-loop-fusion:
[[0, 0]]
4. Command to run with --test-loop-fusion:
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt /data/szy/workspace/mlir-inconsistent/a.mlir -tosa-to-scf \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -tosa-to-arith -convert-scf-to-cf -convert-arith-to-llvm \
-convert-linalg-to-loops -convert-linalg-to-parallel-loops -convert-linalg-to-loops -one-shot-bufferize="bufferize-function-boundaries" \
--expand-strided-metadata -convert-linalg-to-affine-loops -finalize-memref-to-llvm --test-loop-fusion="test-loop-fusion-transformation" \
-lower-affine -convert-scf-to-cf -convert-cf-to-llvm -finalize-memref-to-llvm -convert-func-to-llvm -convert-index-to-llvm -convert-arith-to-llvm \
-reconcile-unrealized-casts | timeout 10 /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main -entry-point-result=void \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_async_runtime.so
5. Output with --test-loop-fusion:
[[1, 1]]
6. Analysis:
I debugged this issue and found that the faulty pass is --test-loop-fusion.
The input IR (the IR before running --test-loop-fusion) can be found in input.txt.
The output IR (the IR after running --test-loop-fusion) can be found in output.txt.
Please rename the files from .txt to .mlir.
As some developers discussed in issue #118631, test passes should also preserve the semantics of the MLIR program.
This MLIR program is expected to output [[0, 0]] for %50 = tosa.argmax %12, given that all elements in %12 are equal. Instead, it outputs [[1, 1]], which corresponds to the last index along axis 1 of %12.
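The expected value can be checked with a small reference model. The Python sketch below is not part of the reproducer: it mimics the tosa.while_loop in a.mlir (starting from zeros and adding 1 while 6 is greater than element [0, 0, 0]) and then reduces along axis 1. The function names are hypothetical, and the strict ">" comparison (first index wins on ties) is an assumption that matches the output observed without --test-loop-fusion.

```python
def run_while_loop(limit=6):
    # %11: a 1x2x2 tensor of zeros, as in a.mlir.
    t = [[[0, 0], [0, 0]]]
    # Condition region: keep looping while limit > t[0][0][0]
    # (%arg0 is 0 when @entry is called from @main).
    while limit > t[0][0][0]:
        # Body region: add 1 to every element.
        t = [[[x + 1 for x in row] for row in plane] for plane in t]
    return t


def argmax_axis1_first_wins(t):
    # Reduce along axis 1. The strict '>' keeps the first index on ties
    # (assumed tie-breaking rule).
    result = []
    for plane in t:
        out = []
        for k in range(len(plane[0])):
            best_j = 0
            for j in range(1, len(plane)):
                if plane[j][k] > plane[best_j][k]:
                    best_j = j
            out.append(best_j)
        result.append(out)
    return result


final = run_while_loop()
print(final)                           # every element ends up equal
print(argmax_axis1_first_wins(final))  # index 0 is kept for every column
```

Because every element of the loop result is equal, the strict comparison never replaces index 0, which is why [[0, 0]] is the expected output.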
To debug this issue, I printed the IR after each pass and found that the IR is still correct before the --test-loop-fusion="test-loop-fusion-transformation" pass is applied. As shown in the first image, %173 is initialized before the three-level loop nest (lines 221–237). After running --test-loop-fusion="test-loop-fusion-transformation", however, %173 is initialized inside the innermost loop, so the initialization overwrites the value assigned at line 253 in the second image.
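A minimal Python model of this effect is sketched below. It rests on an assumption, since the IR itself is only in the attachments: %173 is taken to play the role of the running-maximum buffer of the argmax reduction. The names argmax_axis1 and reset_inside_loop are hypothetical and only illustrate where the initialization is placed.

```python
INT32_MIN = -(2 ** 31)


def argmax_axis1(values, reset_inside_loop=False):
    """Argmax along axis 1 of a nested list shaped [N][A][M].

    reset_inside_loop=False: the running maximum is initialized once,
    before the loop nest (the placement seen before the fusion pass).
    reset_inside_loop=True: the running maximum is re-initialized in the
    innermost loop (the placement described after the fusion pass).
    """
    n, a, m = len(values), len(values[0]), len(values[0][0])
    best_idx = [[0] * m for _ in range(n)]
    best_val = [[INT32_MIN] * m for _ in range(n)]  # assumed role of %173
    for i in range(n):
        for j in range(a):
            for k in range(m):
                if reset_inside_loop:
                    # Misplaced initialization: forgets the maximum seen so far.
                    best_val[i][k] = INT32_MIN
                if values[i][j][k] > best_val[i][k]:
                    best_val[i][k] = values[i][j][k]
                    best_idx[i][k] = j
    return best_idx


all_equal = [[[6, 6], [6, 6]]]  # value of %12 after the while loop
print(argmax_axis1(all_equal))                          # [[0, 0]]
print(argmax_axis1(all_equal, reset_inside_loop=True))  # [[1, 1]]
```

With the reset inside the innermost loop, every element compares greater than the freshly reset maximum, so the last index along axis 1 always wins. Under the stated assumption, this reproduces the [[1, 1]] result.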
Additionally, I tried the affine-loop-fusion pass on the same input IR. As shown in the third image, that pass initializes %173 in the second loop, which leads to the correct result.