✨ support f16 + 🧹 some minor refactoring #1
Merged
This PR does the following:
- [...] `half` package)

P.S. `f16` is supported through converting it to (what I call) `i16ord`, which is an ordinal (i.e., monotonic) mapping of `f16` to `i16`.

The ord transformation: [...]
(to apply this on an `f16`, just transmute `f16` to `i16` first)

Some useful properties of this transformation
- [...] `i16` (SIMD) instructions for comparison.
- [...] `i16` (ord) values - transform the outcome back to `f16` without needing a lookup table.

=> these operations can easily be implemented in SIMD instructions 🎉
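
To make the two properties concrete, here is a minimal sketch of one well-known way to build such an ordinal mapping on the raw 16-bit pattern. The exact expression used in this PR is not reproduced in the description above, so the body of `ord_transform` below is an assumption; the helpers `f16_bits_to_ord` and `ord_to_f16_bits` are hypothetical names, and plain `u16` bit patterns stand in for `half::f16` to keep the sketch dependency-free.

```rust
/// Assumed form of the transformation (a common sign-magnitude trick),
/// not necessarily the exact expression used in this PR.
/// Non-negative inputs are returned unchanged; for negative inputs the
/// lower 15 bits are flipped while the sign bit is kept.
pub fn ord_transform(v: i16) -> i16 {
    // `v >> 15` is an arithmetic shift: 0 for non-negative, -1 (all ones) for negative.
    // Masking with 0x7FFF leaves the sign bit out of the XOR.
    ((v >> 15) & 0x7FFF) ^ v
}

/// Hypothetical helper: reinterpret raw f16 bits as i16, then map to the ordinal value.
pub fn f16_bits_to_ord(bits: u16) -> i16 {
    ord_transform(bits as i16)
}

/// Hypothetical helper: map an ordinal value back to raw f16 bits.
/// The same function is reused because applying it twice is the identity.
pub fn ord_to_f16_bits(ord: i16) -> u16 {
    ord_transform(ord) as u16
}
```

With this form, the f16 bit patterns for -2.0, -1.0, 0.0, 1.0 and 2.0 (`0xC000`, `0xBC00`, `0x0000`, `0x3C00`, `0x4000`) map to strictly increasing `i16` values, and `ord_to_f16_bits(f16_bits_to_ord(b))` returns `b` for every bit pattern, so no lookup table is needed to recover the original value.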
Visualization of the transformation

Illustration of `ord_transform` on all possible `float16` numbers. You can observe the monotonically rising slope 🥳

Illustration of the symmetry property. When applying the `ord_transform` twice on the same value, we get back the original value!!

Limitations
`NaN` values and `inf`s are not supported in this transformation

Benchmarks
The `f16` support that leverages the `ord_transform`:
- `f16` SIMD ~ 2x faster than `f32` SIMD 🔥
- `f16` scalar ~ 1.25x slower than `f32` scalar 😮‍💨
- [...] `f32` uses) on `half::f16`
- [...] `f32` upcasting (i.e., replacing `ord_transform` with `to_f32` in the implementation)
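
For orientation, a scalar argmin that leans on the ordinal mapping instead of upcasting could look roughly like the sketch below. This is an illustration under stated assumptions, not the code benchmarked above: `ord_transform` uses an assumed, commonly seen expression, `argmin_f16_bits` is a hypothetical name, the input is given as raw `u16` bit patterns rather than `half::f16`, and the data is assumed to contain no `NaN` or `inf`.

```rust
/// Assumed ordinal mapping (see the limitations above: no NaN / inf handling).
fn ord_transform(v: i16) -> i16 {
    ((v >> 15) & 0x7FFF) ^ v
}

/// Hypothetical scalar argmin over raw f16 bit patterns.
/// Returns the index of the first minimum and its original f16 bits,
/// or `None` for an empty slice.
pub fn argmin_f16_bits(bits: &[u16]) -> Option<(usize, u16)> {
    let mut best: Option<(usize, i16)> = None;
    for (i, &b) in bits.iter().enumerate() {
        // Compare plain i16 values; no conversion to f32 is involved.
        let ord = ord_transform(b as i16);
        match best {
            // Keep the current minimum when it is not larger (first minimum wins).
            Some((_, min_ord)) if min_ord <= ord => {}
            _ => best = Some((i, ord)),
        }
    }
    // Map the winning ordinal value back to f16 bits with the same function.
    best.map(|(i, ord)| (i, ord_transform(ord) as u16))
}
```

For the bit patterns of 1.0, -2.0, 0.5 and -1.0 (`[0x3C00, 0xC000, 0x3800, 0xBC00]`) this returns `Some((1, 0xC000))`, i.e. index 1 and the bits of -2.0.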