Description
git version: d6c0839
system: Ubuntu 18.04.6 LTS
I get inconsistent results when executing the same MLIR program with and without --affine-data-copy-generate="generate-dma=false fast-mem-space=0 skip-non-unit-stride-loops".
Steps to Reproduce:
1. MLIR Program (a.mlir):
module {
func.func private @printMemrefI32(tensor<*xi32>)
func.func private @printMemrefF32(tensor<*xf32>)
func.func @entry(%arg0: index) -> ( tensor<1x3x5xi32>) {
%37 = "tosa.const"() <{value = dense<0> : tensor<1x3x5xi32>}> : () -> tensor<1x3x5xi32>
%38 = tosa.while_loop (%arg1 = %37) : (tensor<1x3x5xi32>) -> tensor<1x3x5xi32> {
%96 = "tosa.const"() <{value = dense<3> : tensor<1x3x5xi32>}> : () -> tensor<1x3x5xi32>
%97 = tosa.greater %96, %arg1 : (tensor<1x3x5xi32>, tensor<1x3x5xi32>) -> tensor<1x3x5xi1>
%extracted_8 = tensor.extract %97[%arg0, %arg0, %arg0] : tensor<1x3x5xi1>
%from_elements_9 = tensor.from_elements %extracted_8 : tensor<i1>
tosa.yield %from_elements_9 : tensor<i1>
} do {
^bb0(%arg1: tensor<1x3x5xi32>):
%101 = "tosa.const"() <{value = dense<1> : tensor<1x3x5xi32>}> : () -> tensor<1x3x5xi32>
%102 = tosa.add %arg1, %101 : (tensor<1x3x5xi32>, tensor<1x3x5xi32>) -> tensor<1x3x5xi32>
tosa.yield %102 : tensor<1x3x5xi32>
}
return %38 : tensor<1x3x5xi32>
}
func.func @main() {
%idx0 = index.constant 0
%0 = call @entry(%idx0) : (index) -> ( tensor<1x3x5xi32>)
%cast_0 = tensor.cast %0 : tensor<1x3x5xi32> to tensor<*xi32>
call @printMemrefI32(%cast_0) : (tensor<*xi32>) -> ()
return
}
}
2. Command to run without --affine-data-copy-generate:
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt /data/szy/workspace/mlir-inconsistent/a.mlir -tosa-to-scf \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -tosa-to-arith -tosa-to-tensor -convert-scf-to-cf --test-math-to-vcix \
-one-shot-bufferize="bufferize-function-boundaries" --expand-strided-metadata -convert-linalg-to-affine-loops -lower-affine \
-convert-scf-to-cf -finalize-memref-to-llvm -convert-arith-to-llvm -convert-func-to-llvm -convert-cf-to-llvm -convert-index-to-llvm \
-reconcile-unrealized-casts | timeout 100 /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main \
-entry-point-result=void --shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_async_runtime.so
3. Output without --affine-data-copy-generate:
[[[3],
[3]]]
4. Command to run with --affine-data-copy-generate:
/data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt /data/szy/workspace/mlir-inconsistent/a.mlir -tosa-to-scf \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-opt -tosa-to-arith -tosa-to-tensor -convert-scf-to-cf --test-math-to-vcix \
-one-shot-bufferize="bufferize-function-boundaries" --expand-strided-metadata -convert-linalg-to-affine-loops \
--affine-data-copy-generate="generate-dma=false fast-mem-space=0 skip-non-unit-stride-loops" -lower-affine -convert-scf-to-cf \
-finalize-memref-to-llvm -convert-arith-to-llvm -convert-func-to-llvm -convert-cf-to-llvm -convert-index-to-llvm \
-reconcile-unrealized-casts | timeout 100 /data/szy/MLIR/llvm-release/llvm-project/build/bin/mlir-cpu-runner -e main \
-entry-point-result=void --shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_c_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_runner_utils.so \
--shared-libs=/data/szy/MLIR/llvm-release/llvm-project/build/lib/libmlir_async_runtime.so
5. Output with --affine-data-copy-generate:
[[[109110929],
[21903]]]
6. Analysis:
This MLIR program is expected to output [[[3],[3]]] (as in step 3), because the loop runs while the condition 3 > %arg1 holds (checked at index [%arg0, %arg0, %arg0]) and terminates once it no longer does. The loop variable %arg1 is initialized to 0 and incremented by 1 in each iteration, so it reaches 3 before the loop exits. However, after applying --affine-data-copy-generate, the program outputs garbage values instead.
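To make the expected result concrete, here is a minimal reference model of @entry in plain Python. It is only a sketch of the loop semantics described above, not MLIR output: the helper names (make_tensor, entry) are hypothetical, and the tensor is assumed to be modeled as a nested list of shape 1x3x5.

```python
# Reference model of @entry in plain Python (no MLIR involved).
# Assumption: the 1x3x5 tensor is modeled as a nested list.

def make_tensor(value, shape=(1, 3, 5)):
    """Build a nested list of the given 3-D shape filled with `value`."""
    d0, d1, d2 = shape
    return [[[value for _ in range(d2)] for _ in range(d1)] for _ in range(d0)]


def entry(idx):
    """Mirror tosa.while_loop: loop while 3 > state[idx][idx][idx]."""
    state = make_tensor(0)  # %37: dense<0>
    while True:
        # Condition region: tosa.greater(dense<3>, %arg1), then
        # tensor.extract at [%arg0, %arg0, %arg0].
        if not (3 > state[idx][idx][idx]):
            break
        # Body region: tosa.add(%arg1, dense<1>) on every element.
        state = [[[v + 1 for v in row] for row in plane] for plane in state]
    return state


result = entry(0)  # @main calls @entry with index 0
```

In this model every element of the returned tensor is 3, which matches the output of the pipeline without --affine-data-copy-generate.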
I debugged this issue and found that the faulty pass is --affine-data-copy-generate.
The input IR (before running the --affine-data-copy-generate pass) can be found in input.txt.
The output IR (after running the --affine-data-copy-generate pass) can be found in output.txt.
Please rename the files from .txt to .mlir before using them.
As shown in the first image, the loop variable %1 is incremented by 1 on lines 32-33. However, after running --affine-data-copy-generate, as shown in the second image, the IR mistakenly loads from %alloc_16 (line 60), which was already deallocated on line 53, so the result is a garbage value. The load from %alloc_16 should read from %1 instead.