This repository was archived by the owner on Dec 22, 2021. It is now read-only.
f32x4 = roundXX(f32x4)? #177
Closed
I'm surprised to see the various f32x4 = roundXX(f32x4) functions specified by IEEE not included yet.
Was there some prior discussion about them, or just no need so far?
round (roundToIntegralTiesToEven)
ceil (roundToIntegralTowardPositive)
floor (roundToIntegralTowardNegative)
trunc (roundToIntegralTowardZero)
JPEG XL uses round() for quantization. We find it useful to keep the rounded values in floating point rather than rounding as part of a conversion to integer. This allows a subsequent FP subtract without the (fairly expensive) FP -> int -> FP conversions. I also see uses in Embree and GLM.
These operations are widely supported: on x86, we have _mm_round_ps; on ARM it's vrndn; on PPC it's xvrspi*. Are there any concerns about including them?