Skip to content

Commit 81d46eb

Browse files
authored
Fix and test Float16 Rational comparison (#58217)
Found and fix recommended by Eric Demer. Previously, we had this non-transitivity: ```julia-repl julia> Float16(6.0e-8) == big(1//16777216) == 1//16777216 true julia> Float16(6.0e-8) == 1//16777216 false ``` From Eric, > This is presumably not the preferred way to inform you of such issues, but > like last time, the other ways I found have Terms that I am not willing to agree to and > this time, I imagine it is in fact a bug, rather than just a foot-gun > . > Multiplication of Float16 s by the built-in BitInteger types hits the generic > ::Number fallback, which just promotes the inputs and then multiplies the results. > These pairs of inputs promote to Float16, so for Int32,UInt32 and larger, > multiplying this way can give Inf16 even when [[the Float16 input is neither > NaN nor infinite] and [the exact product has absolute value at most 1]]. > While the decision for the above specifically might be > "That's just what one gets for doing arithmetic with Float16." > , a current consequence of this is that > == between Float16 s and Rationals for the wide-enough built-in BitInteger types > gives wrong output, and as a result, isequal is not transitive: > https://paste.ee/p/LfxRlI7F has a paste of a REPL session demonstrating this. > Ideally, one would define arithmetic operations between > built-in BitFloats and built-in BitIntegers so that when there > are no nans and no infs, the output is == to the result of > convert both inputs to Rational{BigInt}, multiply those Rationals, and then round. > > The much-easier-to-implement patch, on the other hand, > would be just defining ==(x::Float16,q::Rational) = ==(Float32(x),q) > , since the built-in BitInteger types are not large-enough > to cause the comparison issue with Float32: > If q is not a power of 2, then the count_ones part will be false, > else for the built-in BitInteger types, q.den is at most UInt128(2)^127. > This is strictly less than floatmax(Float32) . > When q.den is a power of 2 and less than floatmax(Float32), > q.den can be represented exactly and the only way for > multiplication by it to be inexact is exceeding the exponent range. > In such cases, the exact product is at least big"2"^128, so the Float product is Inf32, > and q is at most typemax(UInt128), since q.den is non-zero so q is finite. > typemax(UInt128) < big"2"^128 < Inf32 , so for Float32, > the x*q.den == q.num part still works even in these cases. > > (Those who are curious about _why_ I didn't agree to the Terms for the > other ways, can view my explanation at https://paste.ee/p/Y325uIJL .) Fixes #59622
2 parents 573db77 + 1af81e0 commit 81d46eb

File tree

2 files changed

+6
-1
lines changed

2 files changed

+6
-1
lines changed

base/rational.jl

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -443,7 +443,7 @@ fma(x::Rational, y::Rational, z::Rational) = x*y+z
443443

444444
function ==(x::AbstractFloat, q::Rational)
445445
if isfinite(x)
446-
(count_ones(q.den) == 1) & (x*q.den == q.num)
446+
(count_ones(q.den) == 1) & (ldexp(x, top_set_bit(q.den-1)) == q.num)
447447
else
448448
x == q.num/q.den
449449
end

test/rational.jl

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -852,3 +852,8 @@ end
852852
@test_throws OverflowError numerator(Int8(1)//Int8(31) + Int8(8)im//Int8(3))
853853
end
854854
end
855+
856+
@testset "Float16 comparison" begin
857+
@test Float16(6.0e-8) == big(1//16777216) == 1//16777216
858+
@test Float16(6.0e-8) == 1//16777216
859+
end

0 commit comments

Comments
 (0)