Skip to content

[mlir][vector][linalg] Fix incorrect in_bounds attribute calculation #142107

Open
@banach-space

Description

@banach-space

This ticket is about the in_bounds attribute from the Vectors vector.transfer_read + vector.transfer_write Ops.

There are at least two places in the Linalg vectoriser where the computation of this attribute is too weak:

  • Link to a code block in vectorizeAsInsertSliceOp.
  • Link to a code block in createWriteOrMaskedWrite.

Note that in both cases, the write/read indices are ignored when computing the attribute (you will have to open the links). This means that the following example is always inferred as "in bounds". However, that should depend on the indices:

vector.transfer_write %14, %arg8[%arg3, %arg5] {in_bounds = [true, true]} : vector<8x4xf32>, memref<8x32xf32>

Effectively, vectorizeAsInsertSliceOp and createWriteOrMaskedWrite assume that the indices are always 0, but that is not always the case (e.g. in the example above).

In #141244 I am refactoring vectorizeAsInsertSliceOp to re-use createWriteOrMaskedWrite - this way there will be only one hook to fix. Note that this issue also affects createReadOrMaskedRead, which is implemented in "VectorUtils.cpp" (the other hooks are implemented in "Vectorization.cpp").

Also in #141244, I am replacing this calculation:

    for (unsigned i = 0; i < rank; i++)
      inBoundsVal[i] = (destShape[i] == inputVecSizesForLeadingDims[i]) &&
                       !ShapedType::isDynamic(destShape[i]);

with (== --> >=):

    for (unsigned i = 0; i < rank; i++)
      inBoundsVal[i] = (destShape[i] >= inputVecSizesForLeadingDims[i]) &&
                       !ShapedType::isDynamic(destShape[i]);

If the write/read indices are 0 (and that's the only situation in which the computation above is valid), then both == and >= are valid (i.e there is not need to require eq, geq is also fine).

Addressing this might be tricky - switching from "in-bounds" to "out-of-bounds" will mean that vector.maskedload will be generated instead of vector.load in many places.

NEXT STEPS:

  • Land [mlir][linalg] Refactor vectorization hooks to improve code reuse #141244 - it maintains the current behaviour and will simplify the next steps.
  • Investigate the impact of using "out-of-bounds" instead of "in-bounds".
  • Look into value-bounds-analysis as a tool to infer that the indices are indeed 0, even when using SSA values (that will often be the case in practice).

As a specific example, this was generated from ransform-vector.mlir:

    %0 = scf.for %arg3 = %c0 to %c8 step %c8 iter_args(%arg4 = %arg2) -> (tensor<8x32xf32>) {
      %1 = scf.for %arg5 = %c0 to %c32 step %c4 iter_args(%arg6 = %arg4) -> (tensor<8x32xf32>) {
        %2 = scf.for %arg7 = %c0 to %c16 step %c2 iter_args(%arg8 = %arg6) -> (tensor<8x32xf32>) {
          %3 = vector.transfer_read %arg0[%arg3, %arg7], %cst {in_bounds = [true, true]} : tensor<8x16xf32>, vector<8x2xf32>
          %4 = vector.transfer_read %arg1[%arg7, %arg5], %cst {in_bounds = [true, true]} : tensor<16x32xf32>, vector<2x4xf32>
          %5 = vector.transfer_read %arg8[%arg3, %arg5], %cst {in_bounds = [true, true]} : tensor<8x32xf32>, vector<8x4xf32>
          %6 = vector.contract {indexing_maps = [#map, #map1, #map2], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %3, %4, %5 : vector<8x2xf32>, vector<2x4xf32> into vector<8x4xf32>
          %7 = vector.transfer_write %6, %arg8[%arg3, %arg5] {in_bounds = [false, false]} : vector<8x4xf32>, tensor<8x32xf32>
          scf.yield %7 : tensor<8x32xf32>
        }
        scf.yield %2 : tensor<8x32xf32>
      }
      scf.yield %1 : tensor<8x32xf32>
    }
    return %0 : tensor<8x32xf32>
  }

It's quite clear that the range of indices for this vector.transfer_write make it always "in-bounds":

%7 = vector.transfer_write %6, %arg8[%arg3, %arg5] {in_bounds = [false, false]} : vector<8x4xf32>, tensor<8x32xf32>

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions