Description
When optimizations are enabled, we miscompile this function on aarch64 (definitely aarch64-apple-darwin
, but linux seems impacted too):
// https://rust.godbolt.org/z/YMbEWMvao
pub fn clamp0(f: f32) -> f32 {
// Carefully written check that maps `NaN` to `0.0`
if !(f > 0.0) { 0.0 } else { f }
}
This function is carefully written so that NaN
inputs are mapped to zero1. Sadly, the way we compile it, this check only works for quiet NaN -- signalling NaN are returned verbatim. For example, clamp0(black_box(f32::from_bits(0x7f800001)))
incorrectly returns its input, a NaN
. (The black_box
is only needed because we get it right if it's evaluated via const-propagatation).
The generated code seems to use the fmaxnm
instruction, which is based on IEEE-754 2008's maxNum function. This function is wrong here for signaling NaN, since it returns NaN for signalling NaN arguments.
While experimenting, I noticed that when f > 0.0
is changed to f >= 0.0
the miscompilation no longer happens. That said, the semantics change too2.
I think this is an LLVM issue, but it's also possible that we're telling LLVM something that is incorrect for signalling NaN. I don't know which is the case.
This could cause real problems in the wild. I've written code before which passes the result of similarly "clever" clamp implementations to to_int_unchecked
(UB, but likely harmless3 and only because of our miscompilation) before using them as an unchecked array index... which now turns that "harmless" UB into an out of bounds access4 (not really harmless). So I think this is a potentially serious issue.
Meta
rustc --version --verbose
:
rustc 1.70.0-nightly (cf7ada217 2023-04-03)
binary: rustc
commit-hash: cf7ada217c8ac63367b184afd9fffaff30f6ed44
commit-date: 2023-04-03
host: aarch64-apple-darwin
release: 1.70.0-nightly
LLVM version: 16.0.0
This has happened at least back to 1.55.0-aarch64-apple-darwin
, so I think we've always been broken here. The fact that it only happens for sNaN (rare) and doesn't happen if a >=
condition is used probably explains why nobody noticed it -- I only noticed when doing exhaustive testing of some larger function (clamp0
is minimized) for all f32
s.
Footnotes
-
Recall that all comparisons with
NaN
return false, so iff
is NaN thenf > 0.0
will be false. ↩ -
With
if !(f > 0.0) { 0.0 } else { f }
,clamp0(-0.0)
will become a positive zero, which is not the case forif !(f >= 0.0) { 0.0 } else { f }
. ↩ -
We lived with the UB being possible in all
f as int
casts for a very long time. ↩ -
Such code is spooky, but it is often easy to verify it behaves properly for every possible f32 input. ↩