[Tosa] : Fix integer overflow for computing intmax+1 in tosa.cast to linalg. #112455

Open · wants to merge 3 commits into main
Conversation

sahas3 (Contributor) commented Oct 16, 2024

This PR fixes an integer overflow that occurred when computing (intmax + 1) for i64 during the tosa-to-linalg pass for tosa.cast.

Found this issue while debugging a numerical mismatch for the deeplabv3 model from torchvision, represented in the tosa dialect via the TorchToTosa pipeline in the torch-mlir repository. torch.aten.to.dtype is converted to a tosa.cast that casts f32 to i64. Strictly by the specification, tosa.cast doesn't support casting f32 to i64, so it would be possible to add a verifier that errors out on such tosa ops instead of producing incorrect code. However, I chose to fix the overflow so that the deeplabv3 model can still be represented with tosa ops in the above-mentioned pipeline. Open to suggestions if adding the verifier is more appropriate instead.

llvmbot (Collaborator) commented Oct 16, 2024

@llvm/pr-subscribers-mlir-linalg
@llvm/pr-subscribers-mlir-tosa

@llvm/pr-subscribers-mlir

Author: Sayan Saha (sahas3)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/112455.diff

2 Files Affected:

  • (modified) mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp (+1-1)
  • (modified) mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir (+27)
diff --git a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
index c88f4db27c304e..e6b3e4b677e4f2 100644
--- a/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
+++ b/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
@@ -563,7 +563,7 @@ static Value createLinalgBodyCalculationForElementwiseOp(
                    getElementTypeOrSelf(srcTy),
                    APInt::getSignedMaxValue(dstTy.getIntOrFloatBitWidth())
                            .getSExtValue() +
-                       1));
+                       1.0));
 
       auto intMax = rewriter.create<arith::ConstantOp>(
           loc, rewriter.getIntegerAttr(
diff --git a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir
index f9d37f9427d4f4..7e2ec67d38d378 100644
--- a/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir
+++ b/mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir
@@ -1929,3 +1929,30 @@ func.func @test_dynamic_fft2d(%arg0: tensor<?x?x?xf32>, %arg1: tensor<?x?x?xf32>
   %output_real, %output_imag = "tosa.fft2d"(%arg0, %arg1) {inverse = true} : (tensor<?x?x?xf32>, tensor<?x?x?xf32>) -> (tensor<?x?x?xf32>, tensor<?x?x?xf32>)
   return %output_real, %output_imag : tensor<?x?x?xf32>, tensor<?x?x?xf32>
 }
+
+
+// -----
+
+// CHECK: #[[$MAP0:.+]] = affine_map<(d0) -> (0)>
+// CHECK: #[[$MAP1:.+]] = affine_map<(d0) -> (d0)>
+
+// CHECK-LABEL:   func.func @test_cast_fp32_i64(
+// CHECK-SAME:                                  %[[ARG0:.*]]: tensor<1xf32>) -> tensor<1xi64> {
+// CHECK:           %[[VAL_0:.*]] = tensor.empty() : tensor<1xi64>
+// CHECK:           %[[RESULT:.*]] = linalg.generic {indexing_maps = [#[[$MAP0]], #[[$MAP1]]], iterator_types = ["parallel"]} ins(%[[ARG0]] : tensor<1xf32>) outs(%[[VAL_0]] : tensor<1xi64>) {
+// CHECK:           ^bb0(%[[VAL_2:.*]]: f32, %[[VAL_3:.*]]: i64):
+// CHECK:             %[[VAL_4:.*]] = math.roundeven %[[VAL_2]] : f32
+// CHECK:             %[[VAL_5:.*]] = arith.constant -9.22337203E+18 : f32
+// CHECK:             %[[VAL_6:.*]] = arith.constant 9.22337203E+18 : f32
+// CHECK:             %[[VAL_7:.*]] = arith.constant 9223372036854775807 : i64
+// CHECK:             %[[VAL_8:.*]] = arith.maximumf %[[VAL_4]], %[[VAL_5]] : f32
+// CHECK:             %[[VAL_9:.*]] = arith.fptosi %[[VAL_8]] : f32 to i64
+// CHECK:             %[[VAL_10:.*]] = arith.cmpf uge, %[[VAL_4]], %[[VAL_6]] : f32
+// CHECK:             %[[VAL_11:.*]] = arith.select %[[VAL_10]], %[[VAL_7]], %[[VAL_9]] : i64
+// CHECK:             linalg.yield %[[VAL_11]] : i64
+// CHECK:           } -> tensor<1xi64>
+// CHECK:           return %[[RESULT]] : tensor<1xi64>
+func.func @test_cast_fp32_i64(%arg0: tensor<1xf32>) -> (tensor<1xi64>) {
+  %0 = tosa.cast %arg0 : (tensor<1xf32>) -> tensor<1xi64>
+  return %0: tensor<1xi64>
+}

sahas3 (Contributor, Author) commented Oct 16, 2024

Without the fix, the IR after tosa-to-linalg for

func.func @test_cast_fp32_i64(%arg0: tensor<1xf32>) -> (tensor<1xi64>) {
  %0 = tosa.cast %arg0 : (tensor<1xf32>) -> tensor<1xi64>
  return %0: tensor<1xi64>
}

was

#map = affine_map<(d0) -> (0)>
#map1 = affine_map<(d0) -> (d0)>
module {
  func.func @test_cast_fp32_i64(%arg0: tensor<1xf32>) -> tensor<1xui64> {
    %0 = tensor.empty() : tensor<1xi64>
    %1 = linalg.generic {indexing_maps = [#map, #map1], iterator_types = ["parallel"]} ins(%arg0 : tensor<1xf32>) outs(%0 : tensor<1xi64>) {
    ^bb0(%in: f32, %out: i64):
      %3 = math.roundeven %in : f32
      %cst = arith.constant -9.22337203E+18 : f32
      %cst_0 = arith.constant -9.22337203E+18 : f32
      %c9223372036854775807_i64 = arith.constant 9223372036854775807 : i64
      %4 = arith.maximumf %3, %cst : f32
      %5 = arith.fptosi %4 : f32 to i64
      %6 = arith.cmpf uge, %3, %cst_0 : f32
      %7 = arith.select %6, %c9223372036854775807_i64, %5 : i64
      linalg.yield %7 : i64
    } -> tensor<1xi64>
    %2 = builtin.unrealized_conversion_cast %1 : tensor<1xi64> to tensor<1xui64>
    return %2 : tensor<1xui64>
  }
}

@CoTinker CoTinker requested a review from RoboTux October 16, 2024 01:30
GeorgeARM (Contributor) commented Oct 16, 2024

So it's possible to add a verifier to error out for such tosa ops instead of producing incorrect code. However, I chose to fix the overflow issue to still be able to represent the deeplabv3 model with tosa ops in the above-mentioned pipeline. Open to suggestions if adding the verifier is more appropriate instead.

@sahas3 IMHO it is better to fix it; my expectation is that at some point TOSA will get i64 support. Moreover, as you mentioned, it is better to get models legalizing than to explicitly restrict them.

Please also wait for @RoboTux to review the patch before merging.


github-actions bot commented Oct 16, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.
