Optimize fptrunc(x)>=C1 --> x>=C2 #99475

kissholic · 2024-07-18T11:55:36Z

github-actions · 2024-07-18T11:55:53Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2024-07-18T11:56:26Z

@llvm/pr-subscribers-llvm-transforms

Author: None (kissholic)

Changes

Fix #85265 (comment)

Full diff: https://github.com/llvm/llvm-project/pull/99475.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+31)
(added) llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll (+11)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index abadf54a96767..2af3e92213f13 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -22,10 +22,13 @@
 #include "llvm/Analysis/Utils/Local.h"
 #include "llvm/Analysis/VectorUtils.h"
 #include "llvm/IR/ConstantRange.h"
+#include "llvm/IR/Constants.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/InstrTypes.h"
+#include "llvm/IR/Instruction.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/PatternMatch.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/KnownBits.h"
 #include "llvm/Transforms/InstCombine/InstCombiner.h"
 #include <bitset>
@@ -7882,6 +7885,30 @@ static Instruction *foldFCmpReciprocalAndZero(FCmpInst &I, Instruction *LHSI,
   return new FCmpInst(Pred, LHSI->getOperand(1), RHSC, "", &I);
 }
 
+// Fold trunc(x) < constant --> x < constant if possible.
+static Instruction *foldFCmpFpTrunc(FCmpInst &I, Instruction *LHSI,
+                                    Constant *RHSC) {
+  //
+  FCmpInst::Predicate Pred = I.getPredicate();
+
+  // Check that predicates are valid.
+  if ((Pred != FCmpInst::FCMP_OGT) && (Pred != FCmpInst::FCMP_OLT) &&
+      (Pred != FCmpInst::FCMP_OGE) && (Pred != FCmpInst::FCMP_OLE))
+    return nullptr;
+
+  auto *LType = LHSI->getOperand(0)->getType();
+  auto *RType = RHSC->getType();
+
+  if (!(LType->isFloatingPointTy() && RType->isFloatingPointTy() &&
+        LType->getTypeID() >= RType->getTypeID()))
+    return nullptr;
+
+  auto *ROperand = llvm::ConstantFP::get(
+      LType, dyn_cast<ConstantFP>(RHSC)->getValue().convertToDouble());
+
+  return new FCmpInst(Pred, LHSI->getOperand(0), ROperand, "", &I);
+}
+
 /// Optimize fabs(X) compared with zero.
 static Instruction *foldFabsWithFcmpZero(FCmpInst &I, InstCombinerImpl &IC) {
   Value *X;
@@ -8244,6 +8271,10 @@ Instruction *InstCombinerImpl::visitFCmpInst(FCmpInst &I) {
                   cast<LoadInst>(LHSI), GEP, GV, I))
             return Res;
       break;
+    case Instruction::FPTrunc:
+      if (Instruction *NV = foldFCmpFpTrunc(I, LHSI, RHSC))
+        return NV;
+      break;
   }
   }
 
diff --git a/llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll b/llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll
new file mode 100644
index 0000000000000..446111a60dd6c
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll
@@ -0,0 +1,11 @@
+; RUN: opt -passes=instcombine -S < %s | FileCheck %s
+
+
+;CHECK-LABEL: @src(
+;CHECK: %result = fcmp oge double %0, 1.000000e+02
+;CHECK-NEXT: ret i1 %result
+define i1 @src(double %0) {
+    %trunc = fptrunc double %0 to float
+    %result = fcmp oge float %trunc, 1.000000e+02
+    ret i1 %result
+}
\ No newline at end of file

dtcxzyw · 2024-07-18T11:59:23Z

Please read the guideline https://llvm.org/docs/InstCombineContributorGuide.html.

dtcxzyw · 2024-07-18T11:59:44Z

llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll

+    %trunc = fptrunc double %0 to float
+    %result = fcmp oge float %trunc, 1.000000e+02
+    ret i1 %result
+}


Missing newline.

arsenm · 2024-07-18T12:17:55Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+  auto *LType = LHSI->getOperand(0)->getType();
+  auto *RType = RHSC->getType();
+
+  if (!(LType->isFloatingPointTy() && RType->isFloatingPointTy() &&


You shouldn't need to check isFloatingPointTy, that's implied by the operations in use already. Also not sure what a >= means when comparing type IDs but you probably don't need that either

arsenm · 2024-07-18T12:18:10Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+    return nullptr;
+
+  auto *ROperand = llvm::ConstantFP::get(
+      LType, dyn_cast<ConstantFP>(RHSC)->getValue().convertToDouble());


Don't use convertToDouble, keep this entirely in APFloat

arsenm · 2024-07-18T12:18:29Z

llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll

+    %trunc = fptrunc double %0 to float
+    %result = fcmp oge float %trunc, 1.000000e+02
+    ret i1 %result
+}


Test more combinations of types, including vectors. Also test flag preservation behavior

A test of fp128, x86_fp80, or ppc_fp128 in particular would be helpful.

Test more combinations of types, including vectors. Also test flag preservation behavior
@arsenm Sorry, could you give me a hint what 'flag preservation behavior' means in IR?

I mean the flags on the compare should be preserved, and you don't have any tests using fast math flags. e.g. https://alive2.llvm.org/ce/z/uQr4-J

A test of fp128, x86_fp80, or ppc_fp128 in particular would be helpful.

Sorry for the late reply.

I tried to test these types, but parsing fails with the error "floating point constant does not have type 'fp128'".

The error is emitted at the parse stage, so more changes might be required.

Should I ignore this problem, or is there a better solution?

The IR syntax for floating point constants is unnecessarily painful. You need to use a different prefix and have the correct hex length depending on the format. For this I usually just write fpext from a reasonable FP type to the long one, and run it through instsimplify to see how it should be printed
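
As a rough illustration of the "prefix plus hex length" point, here is a minimal standalone C++ sketch for the simplest case only: a `double` constant, which LLVM IR can spell as `0x` followed by the 16 hex digits of its IEEE-754 bit pattern. The helper name `doubleToIRHex` is hypothetical and not part of LLVM. Per the LLVM Language Reference, the longer formats use other prefixes and lengths (`0xK` plus 20 digits for x86_fp80, `0xL` plus 32 digits for fp128, `0xM` plus 32 digits for ppc_fp128); those are not covered here, and the fpext-plus-instsimplify approach described above remains the practical way to obtain them.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper (not an LLVM API): spell a double as "0x" followed by
// the 16 uppercase hex digits of its IEEE-754 bit pattern.
std::string doubleToIRHex(double Value) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "this sketch assumes a 64-bit double");
  std::uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits)); // bit copy without aliasing UB
  char Buffer[19];                          // "0x" + 16 digits + NUL
  std::snprintf(Buffer, sizeof(Buffer), "0x%016llX",
                static_cast<unsigned long long>(Bits));
  return Buffer;
}
```

For example, `doubleToIRHex(100.0)` yields `0x4059000000000000`.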

aengelke · 2024-07-18T15:41:21Z

This appears to be incorrect w.r.t. round-to-nearest rounding of fptrunc (alive2 counterexample). The constant needs adjustment.

kissholic · 2024-07-25T01:01:22Z

This appears to be incorrect w.r.t. round-to-nearest rounding of fptrunc (alive2 counterexample). The constant needs adjustment.

It seems that the double type the FP constant is converted to can represent the same value as the float type without losing accuracy. rmNearestTiesToEven has already been applied, and no difference appeared.🫣

arsenm · 2024-07-25T08:55:10Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+
+  if (RHSC->getType()->isVectorTy()) {
+    Type *LVecType = LHSI->getOperand(0)->getType();
+    Type *LEleType = dyn_cast<VectorType>(LVecType)->getElementType();


unchecked dyn_cast. Use the dyn_cast in the if expression instead of using isVectorTy

arsenm · 2024-07-25T08:55:56Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

-  auto *ROperand = llvm::ConstantFP::get(
-      LType, dyn_cast<ConstantFP>(RHSC)->getValue().convertToDouble());
+    std::vector<Constant *> EleVec(EleNum);
+    for (uint64_t Idx = 0; Idx < EleNum; ++Idx) {


This is trying too hard to handle the non-splat case. Just use m_APFloat which handles splat vectors and scalars at the same time

So I should split the splat-vector case from the other vector cases and handle it together with scalars, if I understand correctly?😖

You shouldn't have to really worry about the vector case. If you use m_APFloat, it should just work. It will handle scalars and splat vectors

You shouldn't have to really worry about the vector case. If you use m_APFloat, it should just work. It will handle scalars and splat vectors

Sorry, I still don't get the point... It seems that m_APFloat can't cover the non-splat vector cases.

I also read the icmp-trunc optimization code and ran an int vector test case, but it doesn't seem to work in the int vector case.

I came up with the idea that an FPExtInst could be applied to the constant, leaving the optimization to the constant-extension folding 😋 (joking).

Could you give some more hints? Thank you <3

It seems that m_APFloat can't cover the non-splat vector cases.

Correct. I'm saying it's a waste of time, and will multiply the patch size, to handle non-splat cases. If you really want to handle non-splat cases, it should be a follow up after the simple patch

arsenm · 2024-07-25T08:56:16Z

llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll

+  ret <4 x i1> %cmp
+}
+


Add a scalable vector test

aengelke · 2024-07-25T09:05:59Z

It seems that the double type the FP constant is converted to can represent the same value as the float type without losing accuracy. rmNearestTiesToEven has already been applied, and no difference appeared.🫣

It's the opposite direction that is problematic. Consider input 99.99999999. After the fptrunc, it will be value 100.0f, which is >= 100.0f. But 99.99999999 is not >= 100.0. You need to find the smallest(/largest) value of the larger floating-point type which, after truncation, satisfies the condition.

The contributor guide also says that you should provide alive2 proofs that your transformation is correct. I provided a proof above that the transformation as implemented/tested now is not correct.
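
The rounding issue described above can be reproduced outside LLVM. The following is a minimal sketch in standalone C++, assuming IEEE-754 `float`/`double` and the default round-to-nearest-even mode; the helper names (`cmpAfterTrunc`, `cmpNaive`, `thresholdGE`) are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// Models `fcmp oge (fptrunc double X to float), C`.
bool cmpAfterTrunc(double X, float C) { return static_cast<float>(X) >= C; }

// Models the fold as first proposed: compare X against C widened to double.
bool cmpNaive(double X, float C) { return X >= static_cast<double>(C); }

// Smallest double whose truncation to float is still >= C.
// Assumption: C is finite and strictly positive.
double thresholdGE(float C) {
  assert(std::isfinite(C) && C > 0.0f);
  float Prev = std::nextafter(C, 0.0f); // largest float below C
  // The midpoint of two adjacent floats is exactly representable as a double
  // and is the point where round-to-nearest switches from Prev to C.
  double Mid = (static_cast<double>(Prev) + static_cast<double>(C)) / 2.0;
  // On a tie the conversion picks the float with an even significand, so the
  // midpoint itself belongs to C only when C is the even neighbour.
  return static_cast<float>(Mid) >= C
             ? Mid
             : std::nextafter(Mid, static_cast<double>(C));
}
```

With these helpers, `cmpAfterTrunc(99.99999999, 100.0f)` is true while `cmpNaive(99.99999999, 100.0f)` is false, and `thresholdGE(100.0f)` returns 100 - 2^-18. Under the stated assumptions, `fptrunc(x) >= 100.0f` therefore corresponds to `x >= 100 - 2^-18` in double rather than `x >= 100.0`.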

arsenm

Can you add the alive2 link proof to the description

arsenm · 2024-08-16T13:30:57Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

@@ -7882,6 +7892,79 @@ static Instruction *foldFCmpReciprocalAndZero(FCmpInst &I, Instruction *LHSI,
  return new FCmpInst(Pred, LHSI->getOperand(1), RHSC, "", &I);
 }

+// Fold trunc(x) < constant --> x < constant if possible.


fptrunc, not trunc

arsenm · 2024-08-16T13:31:33Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+  if ((Pred == FCmpInst::FCMP_OGE) || (Pred == FCmpInst::FCMP_UGE) ||
+      (Pred == FCmpInst::FCMP_OLT) || (Pred == FCmpInst::FCMP_ULT))
+    RoundDown = true;
+  else if ((Pred == FCmpInst::FCMP_OGT) || (Pred == FCmpInst::FCMP_UGT) ||
+           (Pred == FCmpInst::FCMP_OLE) || (Pred == FCmpInst::FCMP_ULE))


Don't need parentheses around all these == expressions

arsenm · 2024-08-16T13:33:39Z

llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll

+  %result = fcmp fast oge float %trunc, 1.000000e+02
+  ret i1 %result
+}
+


The set of tested constants seems too simple for the complexity of the loop testing constant validity. Should have negative tests for off by one bit in each direction. Also test with the edge case constants (inf, nan) and some denormal values?

arsenm · 2024-08-16T13:36:24Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+      APFloat DupUpBound = UpBound;
+      DupUpBound.next(true);


Name UpBoundNext, or similar?

arsenm · 2024-08-16T13:37:10Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+    APFloat LowBound = RoundDown ? ExtNextRValue : ExtRValue;
+    APFloat UpBound = RoundDown ? ExtRValue : ExtNextRValue;
+
+    while (true) {


This loop is the hard to review part and needs some comments explaining what constants are legal

Sorry for the late reply; I have been occupied by work.

The alive proof will be posted soon.

The edge cases (inf, nan) have been folded before entering this optimization (e.g. fcmp oge x inf --> fcmp oeq x inf). Maybe filtering these cases is a good idea?

It's not always safe to rely on those folds happening beforehand

kissholic · 2024-10-20T07:16:55Z

Exclude NaN and infinity from the optimization. These two 'numbers' require special comparison rules and are already optimized well by other folds, which generate a bool literal directly.

Also add special treatment for the maximum (and minimum) representable float value, because their next value is infinity.

alive2 proof:
https://alive2.llvm.org/ce/z/rphqKP
https://alive2.llvm.org/ce/z/UhU5nG
https://alive2.llvm.org/ce/z/v3zu93
https://alive2.llvm.org/ce/z/fQxsaA
https://alive2.llvm.org/ce/z/UYnaSb
https://alive2.llvm.org/ce/z/5VRZGg

arsenm · 2024-10-20T23:00:22Z

llvm/test/Transforms/InstCombine/fold-fcmp-trunc.ll

+  %result = fcmp olt float %trunc, -3.4028234663852885981170418348451692544e38
+  ret i1 %result
+}
+


Test the literal nan and inf cases

arsenm · 2024-10-20T23:05:52Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+  if (Pred == FCmpInst::FCMP_OGE || Pred == FCmpInst::FCMP_UGE ||
+      Pred == FCmpInst::FCMP_OLT || Pred == FCmpInst::FCMP_ULT)
+    RoundDown = true;
+  else if (Pred == FCmpInst::FCMP_OGT || Pred == FCmpInst::FCMP_UGT ||
+           Pred == FCmpInst::FCMP_OLE || Pred == FCmpInst::FCMP_ULE)
+    RoundDown = false;
+  else


uge, ule cases not tested. Plus negative test for others

arsenm · 2024-10-20T23:07:22Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+    // Set the limit of ExtNextRValue.
+    if (NextRValue.isInfinity()) {
+      ExtNextRValue = ExtRValue * Two;
+    }


No braces. Can also defer construction of the 2 constant (or avoid it by using scalbn instead)

arsenm · 2024-10-20T23:08:21Z

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

+  ExtNextRValue.convert(LEleType->getFltSemantics(),
+                        APFloat::rmNearestTiesToEven, &lossInfo);
+
+  // Binary search to find the maximal (or minimal) value after RValue promotion.


Can you write this with std::lower_bound/std::upper_bound?

std::lower_bound/std::upper_bound (or similar algorithms like std::binary_search) seem to accept only a ForwardIt type, and there is likely no suitable substitute in LLVM either. Using them might require constructing a new, complex wrapper struct around APFloat: implementing the iteration methods the named requirement demands, calculating the mean of two APFloats (without constructing a new APFloat divisor), defining a comparison function, and so on.

Considering the internal complexity of such an APFloat wrapper, I think it is simpler to keep the original loop.
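
One way to see what such a search has to do, without an iterator wrapper, is to run a plain binary search over the integer bit patterns of the wider type, since non-negative finite IEEE-754 values are ordered like their bit patterns. The following is a minimal sketch on native `float`/`double`, not on APFloat, assuming IEEE-754 and round-to-nearest-even; `toBits`, `fromBits`, `firstTrue` and `smallestDoubleTruncatingToAtLeast` are hypothetical names, and this is not the loop from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

std::uint64_t toBits(double D) {
  std::uint64_t B;
  std::memcpy(&B, &D, sizeof(B));
  return B;
}

double fromBits(std::uint64_t B) {
  double D;
  std::memcpy(&D, &B, sizeof(D));
  return D;
}

// Returns the smallest double in (Lo, Hi] for which P holds.
// Assumptions: Lo is non-negative, Lo < Hi, both are finite, P is monotone
// (false, then true) on the interval, P(Lo) is false and P(Hi) is true.
template <typename Pred> double firstTrue(double Lo, double Hi, Pred P) {
  assert(!std::signbit(Lo) && Lo < Hi && !P(Lo) && P(Hi));
  std::uint64_t L = toBits(Lo), H = toBits(Hi);
  // Invariant: P(fromBits(L)) is false and P(fromBits(H)) is true.
  while (H - L > 1) {
    std::uint64_t Mid = L + (H - L) / 2;
    if (P(fromBits(Mid)))
      H = Mid;
    else
      L = Mid;
  }
  return fromBits(H);
}

// Smallest double whose truncation to float is >= C, for finite C > 0.
double smallestDoubleTruncatingToAtLeast(float C) {
  assert(std::isfinite(C) && C > 0.0f);
  float Prev = std::nextafter(C, 0.0f); // largest float below C
  return firstTrue(static_cast<double>(Prev), static_cast<double>(C),
                   [C](double D) { return static_cast<float>(D) >= C; });
}
```

Two adjacent floats are 2^29 doubles apart, so the loop runs about 29 iterations. An APFloat version would need its own way to take the midpoint of two values, which is the complexity mentioned above.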

github-actions · 2024-10-20T23:09:00Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff 4a19be5d45e4b1e02c2512023151be5d56ef5744 5cc33ac5e033690481505cb722695fbf3d345478 --extensions cpp -- llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

View the diff from clang-format here.

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 8894a337edd..adaac13a2ad 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -7932,15 +7932,15 @@ static Instruction *foldFCmpFpTrunc(FCmpInst &I, Instruction *LHSI,
   ExtNextRValue.convert(LEleType->getFltSemantics(),
                         APFloat::rmNearestTiesToEven, &lossInfo);
 
-  // Binary search to find the maximal (or minimal) value after RValue promotion.
-  // RValue can't have special comparison rules, which means nan or inf is not
-  // allowed here.
+  // Binary search to find the maximal (or minimal) value after RValue
+  // promotion. RValue can't have special comparison rules, which means nan or
+  // inf is not allowed here.
   APFloat RoundValue{LEleType->getFltSemantics()};
   {
     APFloat Two{LEleType->getFltSemantics(), 2};
 
-    // The (negative) maximum of RValue will become infinity when rounded up (down).
-    // Set the limit of ExtNextRValue.
+    // The (negative) maximum of RValue will become infinity when rounded up
+    // (down). Set the limit of ExtNextRValue.
     if (NextRValue.isInfinity()) {
       ExtNextRValue = ExtRValue * Two;
     }

kissholic · 2025-01-23T02:00:02Z

Added more test cases and refactored part of the logic of 'foldFCmpFpTrunc'.

https://alive2.llvm.org/ce/z/nodhVp
https://alive2.llvm.org/ce/z/PtGyK-
https://alive2.llvm.org/ce/z/QN4uVV
https://alive2.llvm.org/ce/z/GTD7r2
https://alive2.llvm.org/ce/z/JUPPTp
https://alive2.llvm.org/ce/z/yubkuX

arsenm · 2025-03-16T05:15:20Z

Why was this closed?

kissholic · 2025-03-16T08:40:44Z

Why was this closed?

Sorry, the old commits seem to have been blocked by a merge operation. I tried to discard those commits and push again.

kissholic · 2025-05-30T01:13:59Z

ping

kissholic requested a review from nikic as a code owner July 18, 2024 11:55

llvmbot added the llvm:transforms label Jul 18, 2024

dtcxzyw requested a review from arsenm July 18, 2024 11:58

dtcxzyw reviewed Jul 18, 2024

arsenm reviewed Jul 18, 2024

arsenm added llvm:instcombine and removed llvm:transforms labels Jul 18, 2024

llvmbot added the llvm:transforms label Jul 25, 2024

arsenm reviewed Jul 25, 2024

arsenm reviewed Aug 16, 2024

kissholic added a commit to kissholic/llvm-project that referenced this pull request Oct 20, 2024

llvm#99475 Exclude nan and inf from fcmp fold

5cc33ac

arsenm added the floating-point Floating-point math label Oct 20, 2024

arsenm reviewed Oct 20, 2024

kissholic closed this Mar 16, 2025

kissholic force-pushed the main branch from 4df7b93 to e30a5d6 Compare March 16, 2025 03:01

Optimize fptrunc(x)>=C1 --> x>=C2

5fabe63

kissholic reopened this Mar 16, 2025
