Skip to content

_mm512_reduce_add_ps and friends are setting fast-math flags they should not set #1533

Closed
@RalfJung

Description

@RalfJung

Today I learned about the existence of the simd_reduce_add_unordered intrinsic. When called on a float, this compiles to LLVM's vector.reduce.fadd with the "fast" flag set, which means that passing in NAN or INF is UB and optimizations are allowed "to treat the sign of a zero argument or zero result as insignificant" (which I think means the sign of input zeros is non-deterministically swapped and returned zeros have non-deterministic sign).

This intrinsic is not used a lot in stdarch, but it has a total of 8 uses (all in avx512f.rs). 4 of these are integer intrinsics, where this should be entirely equivalent to simd_reduce_add; not sure why the "unordered" version is used. The other 4 are float intrinsics, _mm512_reduce_add_ps being the first:

https://github.com/rust-lang/stdarch/blob/4d9c0bb591336792c4c4baf293d0acc944e57e28/crates/core_arch/src/x86/avx512f.rs#L31262-L31270

/// Reduce the packed single-precision (32-bit) floating-point elements in a by addition. Returns the sum of all elements in a.
///
/// [Intel's documentation](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm512_reduce_add_ps&expand=4562)
#[inline]
#[target_feature(enable = "avx512f")]
#[unstable(feature = "stdarch_x86_avx512", issue = "111137")]
pub unsafe fn _mm512_reduce_add_ps(a: __m512) -> f32 {
    simd_reduce_add_unordered(a.as_f32x16())
}

Neither the docs here nor Intel's docs mention that this is UB on NAN or INF, and the concerns around signed zeros and doing the addition in an unspecified order. Given that the Intel docs should be the authoritative docs (since this is a vendor intrinsic), why is it even correct to use fast-math flags here? Either the docs need to be updated to state the fast-math preconditions, or the implementation needs to be updated to avoid the fast-math flag. Maybe it should only use "reassoc", not the full but unsafe "fast" flag? But even that should probably be mentioned in the docs.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions