Skip to content

[ValueTracking] Compute knownbits from known fp classes #86409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 13, 2024

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Mar 23, 2024

This patch calculates knownbits from fp instructions/dominating fcmp conditions. It will enable more optimizations with signbit idioms.

@llvmbot
Copy link
Member

llvmbot commented Mar 23, 2024

@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch calculates knownbits from fp instructions/dominating fcmp conditions. It will enable more optimizations with signbit idioms.


Full diff: https://github.com/llvm/llvm-project/pull/86409.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Analysis/ValueTracking.h (+5)
  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+31)
  • (modified) llvm/test/Transforms/InstCombine/known-bits.ll (+140)
diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index 3970efba18cc8c..697482caed6c4a 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -263,6 +263,11 @@ struct KnownFPClass {
     return (KnownFPClasses & Mask) == fcNone;
   }
 
+  /// Return true if it's known this can only be one of the mask entries.
+  bool isKnownOnly(FPClassTest Mask) const {
+    return (KnownFPClasses & ~Mask) == fcNone;
+  }
+
   bool isUnknown() const {
     return KnownFPClasses == fcAllFlags && !SignBit;
   }
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 797665cf06c875..a2bcf1d278829c 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1105,6 +1105,37 @@ static void computeKnownBitsFromOperator(const Operator *I,
       break;
     }
 
+    Value *V;
+    // Handle bitcast from floating point to integer.
+    if (match(const_cast<Operator *>(I), m_ElementWiseBitCast(m_Value(V))) &&
+        V->getType()->isFPOrFPVectorTy()) {
+      KnownFPClass Result = computeKnownFPClass(V, fcAllFlags, Depth + 1, Q);
+      if (Result.SignBit) {
+        if (*Result.SignBit)
+          Known.makeNegative();
+        else
+          Known.makeNonNegative();
+      }
+
+      Type *FPType = V->getType()->getScalarType();
+      int MantissaWidth = FPType->getFPMantissaWidth();
+      if (MantissaWidth != -1) {
+        if (Result.isKnownOnly(fcInf)) {
+          Known.Zero.setLowBits(MantissaWidth);
+          Known.One.setBits(MantissaWidth, BitWidth - 1);
+        } else if (Result.isKnownOnly(fcZero))
+          Known.Zero.setLowBits(BitWidth - 1);
+        else if (Result.isKnownOnly(fcInf | fcNan))
+          Known.One.setBits(MantissaWidth, BitWidth - 1);
+        else if (Result.isKnownOnly(fcSubnormal | fcZero))
+          Known.Zero.setBits(MantissaWidth, BitWidth - 1);
+        else if (Result.isKnownOnly(fcInf | fcZero))
+          Known.Zero.setLowBits(MantissaWidth);
+      }
+
+      break;
+    }
+
     // Handle cast from vector integer type to scalar or vector integer.
     auto *SrcVecTy = dyn_cast<FixedVectorType>(SrcTy);
     if (!SrcVecTy || !SrcVecTy->getElementType()->isIntegerTy() ||
diff --git a/llvm/test/Transforms/InstCombine/known-bits.ll b/llvm/test/Transforms/InstCombine/known-bits.ll
index 58c283815cf910..60c94f99298235 100644
--- a/llvm/test/Transforms/InstCombine/known-bits.ll
+++ b/llvm/test/Transforms/InstCombine/known-bits.ll
@@ -483,5 +483,145 @@ if.else:
   ret i64 13
 }
 
+define i1 @test_sign_pos(float %x) {
+; CHECK-LABEL: @test_sign_pos(
+; CHECK-NEXT:    ret i1 true
+;
+  %fabs = call float @llvm.fabs.f32(float %x)
+  %y = bitcast float %fabs to i32
+  %sign = icmp sgt i32 %y, -1
+  ret i1 %sign
+}
+
+define i1 @test_sign_neg(float %x) {
+; CHECK-LABEL: @test_sign_neg(
+; CHECK-NEXT:    ret i1 true
+;
+  %fabs = call float @llvm.fabs.f32(float %x)
+  %fnabs = fneg float %fabs
+  %y = bitcast float %fnabs to i32
+  %sign = icmp slt i32 %y, 0
+  ret i1 %sign
+}
+
+define <2 x i1> @test_sign_pos_vec(<2 x float> %x) {
+; CHECK-LABEL: @test_sign_pos_vec(
+; CHECK-NEXT:    ret <2 x i1> zeroinitializer
+;
+  %fabs = call <2 x float> @llvm.fabs.v2f32(<2 x float> %x)
+  %y = bitcast <2 x float> %fabs to <2 x i32>
+  %sign = icmp slt <2 x i32> %y, zeroinitializer
+  ret <2 x i1> %sign
+}
+
+define i32 @test_inf_only(float nofpclass(nan sub norm zero) %x) {
+; CHECK-LABEL: @test_inf_only(
+; CHECK-NEXT:    ret i32 2130706432
+;
+  %y = bitcast float %x to i32
+  %and = and i32 %y, 2147483647
+  ret i32 %and
+}
+
+define i32 @test_zero_only(float nofpclass(nan sub norm inf) %x) {
+; CHECK-LABEL: @test_zero_only(
+; CHECK-NEXT:    ret i32 0
+;
+  %y = bitcast float %x to i32
+  %and = and i32 %y, 2147483647
+  ret i32 %and
+}
+
+define i32 @test_inf_nan_only(float nofpclass(sub norm zero) %x) {
+; CHECK-LABEL: @test_inf_nan_only(
+; CHECK-NEXT:    ret i32 2130706432
+;
+  %y = bitcast float %x to i32
+  %and = and i32 %y, 2130706432
+  ret i32 %and
+}
+
+define i32 @test_sub_zero_only(float nofpclass(nan norm inf) %x) {
+; CHECK-LABEL: @test_sub_zero_only(
+; CHECK-NEXT:    ret i32 0
+;
+  %y = bitcast float %x to i32
+  %and = and i32 %y, 2130706432
+  ret i32 %and
+}
+
+define i32 @test_inf_zero_only(float nofpclass(nan norm sub) %x) {
+; CHECK-LABEL: @test_inf_zero_only(
+; CHECK-NEXT:    ret i32 0
+;
+  %y = bitcast float %x to i32
+  %and = and i32 %y, 16777215
+  ret i32 %and
+}
+
+define i1 @test_simplify_icmp(i32 %x) {
+; CHECK-LABEL: @test_simplify_icmp(
+; CHECK-NEXT:    ret i1 false
+;
+  %conv.i.i = uitofp i32 %x to double
+  %3 = bitcast double %conv.i.i to i64
+  %shr.i.mask.i = and i64 %3, -140737488355328
+  %cmp.i = icmp eq i64 %shr.i.mask.i, -1970324836974592
+  ret i1 %cmp.i
+}
+
+define i16 @test_simplify_mask(i32 %ui, float %x) {
+; CHECK-LABEL: @test_simplify_mask(
+; CHECK-NEXT:    [[CONV:%.*]] = uitofp i32 [[UI:%.*]] to float
+; CHECK-NEXT:    [[CMP:%.*]] = fcmp ogt float [[CONV]], [[X:%.*]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
+; CHECK:       if.end:
+; CHECK-NEXT:    ret i16 31744
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i16 0
+;
+  %conv = uitofp i32 %ui to float
+  %cmp = fcmp olt float %x, %conv
+  br i1 %cmp, label %if.else, label %if.end
+
+if.end:
+  %cast = bitcast float %conv to i32
+  %shr = lshr i32 %cast, 16
+  %trunc = trunc i32 %shr to i16
+  %and = and i16 %trunc, -32768
+  %or = or disjoint i16 %and, 31744
+  ret i16 %or
+
+if.else:
+  ret i16 0
+}
+
+; TODO: %cmp always evaluates to false
+
+define i1 @test_simplify_icmp2(double %x) {
+; CHECK-LABEL: @test_simplify_icmp2(
+; CHECK-NEXT:    [[ABS:%.*]] = tail call double @llvm.fabs.f64(double [[X:%.*]])
+; CHECK-NEXT:    [[COND:%.*]] = fcmp oeq double [[ABS]], 0x7FF0000000000000
+; CHECK-NEXT:    br i1 [[COND]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    [[CAST:%.*]] = bitcast double [[X]] to i64
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i64 [[CAST]], 3458764513820540928
+; CHECK-NEXT:    ret i1 [[CMP]]
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i1 false
+;
+  %abs = tail call double @llvm.fabs.f64(double %x)
+  %cond = fcmp oeq double %abs, 0x7FF0000000000000
+  br i1 %cond, label %if.then, label %if.else
+
+if.then:
+  %cast = bitcast double %x to i64
+  %cmp = icmp eq i64 %cast, 3458764513820540928
+  ret i1 %cmp
+
+if.else:
+  ret i1 false
+}
+
 declare void @use(i1)
 declare void @sink(i8)

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 23, 2024
Copy link

✅ With the latest revision this PR passed the Python code formatter.

Copy link

✅ With the latest revision this PR passed the C/C++ code formatter.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Mar 24, 2024

Failed Tests (3):
LLVM :: CodeGen/AMDGPU/amdgpu-simplify-libcall-pow.ll
LLVM :: CodeGen/AMDGPU/amdgpu-simplify-libcall-pown.ll
LLVM :: CodeGen/AMDGPU/simplify-libcalls.ll

@dtcxzyw dtcxzyw force-pushed the perf/fpclass-knownbits branch 2 times, most recently from c488c79 to 459641e Compare April 12, 2024 08:24
; CHECK-NEXT: ret i1 [[SIGN]]
;
%fabs = call <2 x half> @llvm.fabs.v2f16(<2 x half> %x)
%y = bitcast <2 x half> %fabs to i32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a potential todo, we could handle vec of smaller type -> bigger type. The only case that is properly tricky is bigger type -> vec of small type.

@goldsteinn
Copy link
Contributor

LGTM. Wait on one more.

@dtcxzyw dtcxzyw force-pushed the perf/fpclass-knownbits branch from 459641e to d163f9c Compare April 13, 2024 17:45
@dtcxzyw dtcxzyw force-pushed the perf/fpclass-knownbits branch from d163f9c to 937a405 Compare April 22, 2024 07:21
Comment on lines 1126 to 1142
if (FPClasses & fcSNan) {
APInt Payload = APInt::getAllOnes(FPType->getScalarSizeInBits());
Known = Known.intersectWith(KnownBits::makeConstant(
APFloat::getSNaN(FPType->getFltSemantics()).bitcastToAPInt()));
Known = Known.intersectWith(KnownBits::makeConstant(
APFloat::getSNaN(FPType->getFltSemantics(), &Payload)
.bitcastToAPInt()));
}

if (FPClasses & fcQNan) {
APInt Payload = APInt::getAllOnes(FPType->getScalarSizeInBits());
Known = Known.intersectWith(KnownBits::makeConstant(
APFloat::getQNaN(FPType->getFltSemantics()).bitcastToAPInt()));
Known = Known.intersectWith(KnownBits::makeConstant(
APFloat::getQNaN(FPType->getFltSemantics(), &Payload)
.bitcastToAPInt()));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the logic for signaling nans and quiet nans can be the same. Assuming 2008 nans, any non-0 bit in the significant indicates quietness

Can you add more tests for the quiet bit set in non-standard positions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. This approach didn't work (see 57b3d5e), so I removed them.

@dtcxzyw dtcxzyw merged commit d03a1a6 into llvm:main May 13, 2024
3 of 4 checks passed
@dtcxzyw dtcxzyw deleted the perf/fpclass-knownbits branch May 13, 2024 17:58
@mstorsjo
Copy link
Member

This change causes failed assertions, reproducible like this:

typedef union {
  double a;
  struct {
    int b
  } c
} d;
int e;
d f;
int g(double k) {
  f.a = k;
  e = f.c.b;
}
double h();
void i() {
  double j = h(), a = -__builtin_huge_val();
  j != a || g(j);
}
$ clang -target x86_64-linux-gnu -c repro.c -O2
clang: ../lib/Analysis/ValueTracking.cpp:1152: void computeKnownBitsFromOperator(const llvm::Operator*, const llvm::APInt&, llvm::KnownBits&, unsigned int, const llvm::SimplifyQuery&): Assertion `!Known.hasConflict() && "Bits known to be one AND zero?"' failed.

I'll push a revert shortly.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented May 14, 2024

This change causes failed assertions, reproducible like this:

typedef union {
  double a;
  struct {
    int b
  } c
} d;
int e;
d f;
int g(double k) {
  f.a = k;
  e = f.c.b;
}
double h();
void i() {
  double j = h(), a = -__builtin_huge_val();
  j != a || g(j);
}
$ clang -target x86_64-linux-gnu -c repro.c -O2
clang: ../lib/Analysis/ValueTracking.cpp:1152: void computeKnownBitsFromOperator(const llvm::Operator*, const llvm::APInt&, llvm::KnownBits&, unsigned int, const llvm::SimplifyQuery&): Assertion `!Known.hasConflict() && "Bits known to be one AND zero?"' failed.

I'll push a revert shortly.

Thank you for reporting this!
I will fix it later.

mstorsjo added a commit that referenced this pull request May 14, 2024
…)"

This reverts commit d03a1a6.

This change caused failed assertions, see
#86409 (comment)
for details.
@dtcxzyw
Copy link
Member Author

dtcxzyw commented May 14, 2024

Reduced test case:

define i32 @test(double %x) {
  %cmp = fcmp oeq double %x, 0xFFF0000000000000
  br i1 %cmp, label %if.then, label %if.else

if.then:
  %cast = bitcast double %x to i64
  %trunc = trunc i64 %cast to i32
  ret i32 %trunc

if.else:
  ret i32 0
}

I will post a fix.

@mstorsjo
Copy link
Member

Reduced test case:

define i32 @test(double %x) {
  %cmp = fcmp oeq double %x, 0xFFF0000000000000
  br i1 %cmp, label %if.then, label %if.else

if.then:
  %cast = bitcast double %x to i64
  %trunc = trunc i64 %cast to i32
  ret i32 %trunc

if.else:
  ret i32 0
}

I will post a fix.

Thanks! I pushed the revert already in 2e165a2 though, so I guess it mostly makes sense as a new PR for relanding this change.

dtcxzyw added a commit to dtcxzyw/llvm-project that referenced this pull request May 14, 2024
dtcxzyw added a commit that referenced this pull request May 14, 2024
)

This patch relands #86409.

I mistakenly thought that `Known.makeNegative()` clears the sign bit of
`Known.Zero`. This patch fixes the assertion failure by explicitly
clearing the sign bit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants