
Commit 363b059

LangRef: Clarify llvm.minnum and llvm.maxnum about sNaN and signed zero (#112852)
The documentation claims that llvm.minnum and llvm.maxnum ignore sNaN, but the current code may behave differently:

- As the final fallback, they lower to the libc calls fmin(3)/fmax(3), and C23 clarifies that fmin(3)/fmax(3) should return NaN for sNaN vs NUM.
- On some architectures, such as aarch64, they lower to `fmaxnm`, which returns qNaN for sNaN vs NUM.
- On RISC-V (SPEC 2019+), they lower to `fmax`, which returns NUM for sNaN vs NUM.

Now that llvm.minimumnum and llvm.maximumnum have been introduced, following IEEE 754-2019's minimumNumber/maximumNumber, it is time to clarify llvm.minnum and llvm.maxnum.

Since the final fallback of llvm.minnum and llvm.maxnum is fmin(3)/fmax(3), it is reasonable to follow the behavior of fmin(3)/fmax(3).

C23 clarified the behavior for sNaN and +0.0/-0.0:

    (NUM or NaN) vs sNaN -> qNaN
    +0.0 vs -0.0 -> either one of +0.0/-0.0

This is the same as IEEE754-2008's maxNum and minNum, although not all implementations work as expected.

Since some architectures, such as aarch64/MIPSr6/LoongArch, have instructions that implement +0.0 > -0.0, let's define llvm.minnum and llvm.maxnum as IEEE754-2008 with +0.0 > -0.0.

Architectures without such instructions can implement the `nsz` flavor to speed things up, and frontends such as clang can call the intrinsics with the `nsz` attribute.
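For readers who want the resulting rules in executable form, here is a minimal C sketch of the semantics this commit describes: any sNaN operand gives qNaN, a single qNaN operand gives the other operand, and -0.0 is ordered below +0.0. It is a reference model for illustration only, not LLVM's lowering. The names `model_minnum` and `model_maxnum` are hypothetical, operands are passed as raw binary64 bit patterns so that no sNaN is quieted in transit, and returning one canonical qNaN (rather than a payload-preserving one) is a simplifying assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* IEEE-754 binary64 field masks. Assumption: the quiet bit is the top
   fraction bit, as on x86, AArch64 and RISC-V. */
#define SIGN_BIT   UINT64_C(0x8000000000000000)
#define EXP_MASK   UINT64_C(0x7ff0000000000000)
#define FRAC_MASK  UINT64_C(0x000fffffffffffff)
#define QUIET_BIT  UINT64_C(0x0008000000000000)
#define CANON_QNAN UINT64_C(0x7ff8000000000000)

static int is_nan(uint64_t x) {
    return (x & EXP_MASK) == EXP_MASK && (x & FRAC_MASK) != 0;
}

static int is_snan(uint64_t x) {
    return is_nan(x) && (x & QUIET_BIT) == 0;
}

static double to_double(uint64_t x) {
    double d;
    assert(sizeof d == sizeof x);
    memcpy(&d, &x, sizeof d);
    return d;
}

/* Shared NaN handling. Returns 1 and stores the result in *out when at
   least one operand is a NaN, otherwise returns 0. */
static int nan_case(uint64_t a, uint64_t b, uint64_t *out) {
    if (is_snan(a) || is_snan(b)) { *out = CANON_QNAN; return 1; }
    if (is_nan(a)) { *out = is_nan(b) ? CANON_QNAN : b; return 1; }
    if (is_nan(b)) { *out = a; return 1; }
    return 0;
}

/* Hypothetical model of llvm.minnum without the nsz flag. */
uint64_t model_minnum(uint64_t a, uint64_t b) {
    uint64_t r;
    if (nan_case(a, b, &r)) return r;
    double x = to_double(a), y = to_double(b);
    /* Equal values differ at most in the sign of zero: prefer -0.0. */
    if (x == y) return (a & SIGN_BIT) ? a : b;
    return x < y ? a : b;
}

/* Hypothetical model of llvm.maxnum without the nsz flag. */
uint64_t model_maxnum(uint64_t a, uint64_t b) {
    uint64_t r;
    if (nan_case(a, b, &r)) return r;
    double x = to_double(a), y = to_double(b);
    /* Equal values differ at most in the sign of zero: prefer +0.0. */
    if (x == y) return (a & SIGN_BIT) ? b : a;
    return x > y ? a : b;
}
```

With `nsz`, the two equal-value branches would be free to return either operand, which is what lets targets without a sign-aware instruction use a cheaper sequence.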
1 parent 354eb88 commit 363b059

2 files changed: 69 additions, 54 deletions

llvm/docs/LangRef.rst

Lines changed: 54 additions & 49 deletions
@@ -16782,7 +16782,7 @@ versions of the intrinsics respect the exception behavior.
      - qNaN, invalid exception

    * - ``+0.0 vs -0.0``
-     - either one
+     - +0.0(max)/-0.0(min)
      - +0.0(max)/-0.0(min)
      - +0.0(max)/-0.0(min)

@@ -16826,21 +16826,30 @@ type.

 Semantics:
 """"""""""
+Follows the semantics of minNum in IEEE-754-2008, except that -0.0 < +0.0 for the purposes
+of this intrinsic. As for signaling NaNs, per the minNum semantics, if either operand is sNaN,
+the result is qNaN. This matches the recommended behavior for the libm
+function ``fmin``, although not all implementations have implemented these recommended behaviors.
+
+If either operand is a qNaN, returns the other non-NaN operand. Returns NaN only if both operands are
+NaN or if either operand is sNaN. Note that arithmetic on an sNaN doesn't consistently produce a qNaN,
+so arithmetic feeding into a minnum can produce inconsistent results. For example,
+``minnum(fadd(sNaN, -0.0), 1.0)`` can produce qNaN or 1.0 depending on whether ``fadd`` is folded.

-Follows the IEEE-754 semantics for minNum, except for handling of
-signaling NaNs. This match's the behavior of libm's fmin.
+IEEE-754-2008 defines minNum, and it was removed in IEEE-754-2019. As the replacement, IEEE-754-2019
+defines :ref:`minimumNumber <i_minimumnum>`.

-If either operand is a NaN, returns the other non-NaN operand. Returns
-NaN only if both operands are NaN. If the operands compare equal,
-returns either one of the operands. For example, this means that
-fmin(+0.0, -0.0) returns either operand.
+If the intrinsic is marked with the nsz attribute, then the effect is as in the definition in C
+and IEEE-754-2008: the result of ``minnum(-0.0, +0.0)`` may be either -0.0 or +0.0.

-Unlike the IEEE-754 2008 behavior, this does not distinguish between
-signaling and quiet NaN inputs. If a target's implementation follows
-the standard and returns a quiet NaN if either input is a signaling
-NaN, the intrinsic lowering is responsible for quieting the inputs to
-correctly return the non-NaN input (e.g. by using the equivalent of
-``llvm.canonicalize``).
+Some architectures, such as ARMv8 (FMINNM), LoongArch (fmin), MIPSr6 (min.fmt), PowerPC/VSX (xsmindp),
+have instructions that match these semantics exactly; thus it is quite simple for these architectures.
+Some architectures have similiar ones while they are not exact equivalent. Such as x86 implements ``MINPS``,
+which implements the semantics of C code ``a<b?a:b``: NUM vs qNaN always return qNaN. ``MINPS`` can be used
+if ``nsz`` and ``nnan`` are given.
+
+For existing libc implementations, the behaviors of fmin may be quite different on sNaN and signed zero behaviors,
+even in the same release of a single libm implemention.

 .. _i_maxnum:

@@ -16877,20 +16886,30 @@ type.

 Semantics:
 """"""""""
-Follows the IEEE-754 semantics for maxNum except for the handling of
-signaling NaNs. This matches the behavior of libm's fmax.
+Follows the semantics of maxNum in IEEE-754-2008, except that -0.0 < +0.0 for the purposes
+of this intrinsic. As for signaling NaNs, per the maxNum semantics, if either operand is sNaN,
+the result is qNaN. This matches the recommended behavior for the libm
+function ``fmax``, although not all implementations have implemented these recommended behaviors.
+
+If either operand is a qNaN, returns the other non-NaN operand. Returns NaN only if both operands are
+NaN or if either operand is sNaN. Note that arithmetic on an sNaN doesn't consistently produce a qNaN,
+so arithmetic feeding into a maxnum can produce inconsistent results. For example,
+``maxnum(fadd(sNaN, -0.0), 1.0)`` can produce qNaN or 1.0 depending on whether ``fadd`` is folded.

-If either operand is a NaN, returns the other non-NaN operand. Returns
-NaN only if both operands are NaN. If the operands compare equal,
-returns either one of the operands. For example, this means that
-fmax(+0.0, -0.0) returns either -0.0 or 0.0.
+IEEE-754-2008 defines maxNum, and it was removed in IEEE-754-2019. As the replacement, IEEE-754-2019
+defines :ref:`maximumNumber <i_maximumnum>`.

-Unlike the IEEE-754 2008 behavior, this does not distinguish between
-signaling and quiet NaN inputs. If a target's implementation follows
-the standard and returns a quiet NaN if either input is a signaling
-NaN, the intrinsic lowering is responsible for quieting the inputs to
-correctly return the non-NaN input (e.g. by using the equivalent of
-``llvm.canonicalize``).
+If the intrinsic is marked with the nsz attribute, then the effect is as in the definition in C
+and IEEE-754-2008: the result of maxnum(-0.0, +0.0) may be either -0.0 or +0.0.
+
+Some architectures, such as ARMv8 (FMAXNM), LoongArch (fmax), MIPSr6 (max.fmt), PowerPC/VSX (xsmaxdp),
+have instructions that match these semantics exactly; thus it is quite simple for these architectures.
+Some architectures have similiar ones while they are not exact equivalent. Such as x86 implements ``MAXPS``,
+which implements the semantics of C code ``a>b?a:b``: NUM vs qNaN always return qNaN. ``MAXPS`` can be used
+if ``nsz`` and ``nnan`` are given.
+
+For existing libc implementations, the behaviors of fmin may be quite different on sNaN and signed zero behaviors,
+even in the same release of a single libm implemention.

 .. _i_minimum:

@@ -19769,12 +19788,8 @@ The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
 matches the element-type of the vector input.

 This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
-intrinsic. That is, the result will always be a number unless all elements of
-the vector are NaN. For a vector with maximum element magnitude 0.0 and
-containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
-
-If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
-assume that NaNs are not present in the input vector.
+intrinsic. If the intrinsic call has the ``nnan`` fast-math flag, then the
+operation can assume that NaNs are not present in the input vector.

 Arguments:
 """"""""""
@@ -19802,12 +19817,8 @@ The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
 matches the element-type of the vector input.

 This instruction has the same comparison semantics as the '``llvm.minnum.*``'
-intrinsic. That is, the result will always be a number unless all elements of
-the vector are NaN. For a vector with minimum element magnitude 0.0 and
-containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
-
-If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
-assume that NaNs are not present in the input vector.
+intrinsic. If the intrinsic call has the ``nnan`` fast-math flag, then the
+operation can assume that NaNs are not present in the input vector.

 Arguments:
 """"""""""
@@ -22086,7 +22097,7 @@ This is an overloaded intrinsic.
 Overview:
 """""""""

-Predicated floating-point IEEE-754 minNum of two vectors of floating-point values.
+Predicated floating-point IEEE-754-2008 minNum of two vectors of floating-point values.


 Arguments:
@@ -22135,7 +22146,7 @@ This is an overloaded intrinsic.
 Overview:
 """""""""

-Predicated floating-point IEEE-754 maxNum of two vectors of floating-point values.
+Predicated floating-point IEEE-754-2008 maxNum of two vectors of floating-point values.


 Arguments:
@@ -23434,10 +23445,7 @@ result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.

 This instruction has the same comparison semantics as the
 :ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
-'``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
-unless all elements of the vector and the starting value are ``NaN``. For a
-vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
-``-0.0`` elements, the sign of the result is unspecified.
+'``llvm.maxnum.*``' intrinsic).

 To ignore the start value, the neutral value can be used.

@@ -23504,10 +23512,7 @@ result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.

 This instruction has the same comparison semantics as the
 :ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
-'``llvm.minnum.*``' intrinsic). That is, the result will always be a number
-unless all elements of the vector and the starting value are ``NaN``. For a
-vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
-``-0.0`` elements, the sign of the result is unspecified.
+'``llvm.minnum.*``' intrinsic).

 To ignore the start value, the neutral value can be used.

@@ -28179,7 +28184,7 @@ The third argument specifies the exception behavior as described above.
 Semantics:
 """"""""""

-This function follows the IEEE-754 semantics for maxNum.
+This function follows the IEEE-754-2008 semantics for maxNum.


 '``llvm.experimental.constrained.minnum``' Intrinsic
@@ -28211,7 +28216,7 @@ The third argument specifies the exception behavior as described above.
 Semantics:
 """"""""""

-This function follows the IEEE-754 semantics for minNum.
+This function follows the IEEE-754-2008 semantics for minNum.


 '``llvm.experimental.constrained.maximum``' Intrinsic

llvm/include/llvm/CodeGen/ISDOpcodes.h

Lines changed: 15 additions & 5 deletions
@@ -1021,13 +1021,20 @@ enum NodeType {
   LRINT,
   LLRINT,

-  /// FMINNUM/FMAXNUM - Perform floating-point minimum or maximum on two
-  /// values.
+  /// FMINNUM/FMAXNUM - Perform floating-point minimum maximum on two values,
+  /// following IEEE-754 definitions except for signed zero behavior.
   ///
-  /// In the case where a single input is a NaN (either signaling or quiet),
-  /// the non-NaN input is returned.
+  /// If one input is a signaling NaN, returns a quiet NaN. This matches
+  /// IEEE-754 2008's minNum/maxNum behavior for signaling NaNs (which differs
+  /// from 2019).
   ///
-  /// The return value of (FMINNUM 0.0, -0.0) could be either 0.0 or -0.0.
+  /// These treat -0 as ordered less than +0, matching the behavior of IEEE-754
+  /// 2019's minimumNumber/maximumNumber.
+  ///
+  /// Note that that arithmetic on an sNaN doesn't consistently produce a qNaN,
+  /// so arithmetic feeding into a minnum/maxnum can produce inconsistent
+  /// results. FMAXIMUN/FMINIMUM or FMAXIMUMNUM/FMINIMUMNUM may be better choice
+  /// for non-distinction of sNaN/qNaN handling.
   FMINNUM,
   FMAXNUM,

@@ -1041,6 +1048,9 @@ enum NodeType {
   ///
   /// These treat -0 as ordered less than +0, matching the behavior of IEEE-754
   /// 2019's minimumNumber/maximumNumber.
+  ///
+  /// Deprecated, and will be removed soon, as FMINNUM/FMAXNUM have the same
+  /// semantics now.
   FMINNUM_IEEE,
   FMAXNUM_IEEE,
