[mlir][arith] fix wrong floordivsi fold (#83079)#83248
Conversation
|
@llvm/pr-subscribers-llvm-adt @llvm/pr-subscribers-mlir-arith Author: long.chen (lipracer) ChangesFixs #83079 Full diff: https://github.com/llvm/llvm-project/pull/83248.diff 2 Files Affected:
diff --git a/mlir/lib/Dialect/Arith/IR/ArithOps.cpp b/mlir/lib/Dialect/Arith/IR/ArithOps.cpp
index 0f71c19c23b654..d370b7d04dea7e 100644
--- a/mlir/lib/Dialect/Arith/IR/ArithOps.cpp
+++ b/mlir/lib/Dialect/Arith/IR/ArithOps.cpp
@@ -709,19 +709,25 @@ OpFoldResult arith::FloorDivSIOp::fold(FoldAdaptor adaptor) {
}
if (!aGtZero && !bGtZero) {
// Both negative, return -a / -b.
- APInt posA = zero.ssub_ov(a, overflowOrDiv0);
- APInt posB = zero.ssub_ov(b, overflowOrDiv0);
- return posA.sdiv_ov(posB, overflowOrDiv0);
+ return a.sdiv_ov(b, overflowOrDiv0);
}
if (!aGtZero && bGtZero) {
// A is negative, b is positive, return - ceil(-a, b).
APInt posA = zero.ssub_ov(a, overflowOrDiv0);
+ if (overflowOrDiv0)
+ return a;
APInt ceil = signedCeilNonnegInputs(posA, b, overflowOrDiv0);
+ if (overflowOrDiv0)
+ return a;
return zero.ssub_ov(ceil, overflowOrDiv0);
}
// A is positive, b is negative, return - ceil(a, -b).
APInt posB = zero.ssub_ov(b, overflowOrDiv0);
+ if (overflowOrDiv0)
+ return a;
APInt ceil = signedCeilNonnegInputs(a, posB, overflowOrDiv0);
+ if (overflowOrDiv0)
+ return a;
return zero.ssub_ov(ceil, overflowOrDiv0);
});
diff --git a/mlir/test/Transforms/canonicalize.mlir b/mlir/test/Transforms/canonicalize.mlir
index 2cf86b50d432f6..d2c2c12d323892 100644
--- a/mlir/test/Transforms/canonicalize.mlir
+++ b/mlir/test/Transforms/canonicalize.mlir
@@ -989,6 +989,15 @@ func.func @tensor_arith.floordivsi_by_one(%arg0: tensor<4x5xi32>) -> tensor<4x5x
return %res : tensor<4x5xi32>
}
+// CHECK-LABEL: func @arith.floordivsi_by_one_overflow
+func.func @arith.floordivsi_by_one_overflow() -> i64 {
+ %neg_one = arith.constant -1 : i64
+ %min_int = arith.constant -9223372036854775808 : i64
+ // CHECK: arith.floordivsi
+ %poision = arith.floordivsi %min_int, %neg_one : i64
+ return %poision : i64
+}
+
// -----
// CHECK-LABEL: func @arith.ceildivsi_by_one
|
kuhar
left a comment
There was a problem hiding this comment.
Looks fine upon a quick scan of the patch. Given the number or error code paths, I think it could be made more readable with something like constFoldBinaryOpConditional IIRC.
|
Not sure this is the right patch, please see this comment |
|
@kuhar There are indeed many overflow states that need to be checked here, but I haven't come up with a good method yet. Can you give me a specific suggestion? Thank you. |
Without knowing the exact details of the floordivsi algorithm, the way I'd think about it is that we only need to detect if the final overflow happened or not. And of course fold correctly when there is no overflow. If there are intermediate overflows, I'd think that they can fold into one of the two buckets: (a) those that indicate that the floordivsi overflows for these arguments, and (b) implementation bugs. It might be helpful to extract this fold code to a unit test (say one of the APInt test files) and go from there. Pick a few inputs that are known to overflow and trace the values and intermediate overflows. Maybe write a separate check to decide if overflow happens for the given input instead of relying on some intermediate checks. |
|
@lipracer Another idea: we could perform the division on APInt with more bits than the bitwidth of the folded integer type. Maybe this way it'd be easier to detect overflow. And then once we are confident there's no overflow, truncate it back to the desired bitwidth. |
|
Ok. |
f231e71 to
0939c26
Compare
|
submit a APInt change PR. |
c241567 to
fcaa9e5
Compare
APInt change has already merged. |
kuhar
left a comment
There was a problem hiding this comment.
Looks fine upon a quick scan but I won't have the time to understand the details of the expansion in near future. Please get a second approval before landing.
|
Thank you very much for your valuable suggestion. |
|
@joker-eph I'd like someone to review my code, could you help me with that? |
Mogball
left a comment
There was a problem hiding this comment.
This change almost certainly has performance implications. Do we have some way to measure or even guess at that? If this is fixing a correctness bug, I am also not the right person to verify that, but I trust the author in this
|
@pingshiyu would you be able to review the correctness of this change? |
Yes, this may indeed affect performance. This implementation is based on tensorflow xla expansion. For performance considerations, I think we can have a compilation option similar to ‘fast-expand’ for expansion. This change is only consistent with the behavior of llvm. |
sure, I've just taken some time to check through the cases, lgtm re correctness. thanks @lipracer! maybe some extra tests would be nice for the edge cases, e.g. |
|
@pingshiyu Thanks, I have added more corner tests. |
|
The CI error seems unrelated to this change. If the performance drops after merging, I will submit another change to enable "fast expand". |
|
The premerge is green at HEAD: can you rebase? https://lab.llvm.org/buildbot/#/builders/271 |
|
Nevermind I misread the log, it's failing on building flang by running out of memory: I concur that it is unrelated, and all the MLIR tests are passing. |
|
Thanks, I will rebase on head. |
f3bcea4 to
1217ab5
Compare
1) fix floordivsi error expand logic 2) fix floordivsi fold did't check overflow stat Fixs llvm#83079
Fixs #83079