Vectorize TensorPrimitives.CosineSimilarity<Half> #116898
Conversation
Vectorize for Half by processing it as shorts, using the existing widening routine to produce two vectors of floats, and operating on those floats. Even for the non-vectorized path, this improves throughput, as each intermediate operation operates on floats rather than constantly converting back to Half.
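The scalar half of that claim can be sketched as follows. This is a minimal illustration of the "accumulate in float, convert once per element" idea, not the PR's actual code; the method and variable names are placeholders:

```csharp
// Minimal scalar sketch: convert each Half to float once and keep all
// accumulators as float, instead of round-tripping through Half per op.
static Half CosineSimilarityScalarSketch(ReadOnlySpan<Half> x, ReadOnlySpan<Half> y)
{
    float dot = 0f, xSumSq = 0f, ySumSq = 0f;
    for (int i = 0; i < x.Length; i++)
    {
        float xi = (float)x[i]; // one widening conversion per element
        float yi = (float)y[i];
        dot    += xi * yi;
        xSumSq += xi * xi;
        ySumSq += yi * yi;
    }
    // Single narrowing conversion at the very end.
    return (Half)(dot / (MathF.Sqrt(xSumSq) * MathF.Sqrt(ySumSq)));
}
```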
Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
Pull Request Overview
This PR adds explicit vectorization support for Half inputs in TensorPrimitives.CosineSimilarity, refactors the core implementation to use common Update/Finalize helpers, and introduces a specialized CosineSimilarityHalfCore that processes Half as widened floats.
- Adds a generic wrapper for CosineSimilarity<T> that dispatches to a new Half-specific path
- Refactors existing vector-and-scalar loops into shared Update and Finalize methods
- Implements CosineSimilarityHalfCore with 128/256/512-bit vector and scalar fallbacks for Half
Comments suppressed due to low confidence (2)
src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.CosineSimilarity.cs:184
- A new specialized path for Half has been added, but no tests for TensorPrimitives.CosineSimilarity on Half arrays appear in this PR. Please add unit tests covering both vectorized and scalar code paths to validate correctness.
private static Half CosineSimilarityHalfCore(ReadOnlySpan<Half> x, ReadOnlySpan<Half> y)
src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.CosineSimilarity.cs:31
- The XML doc for CosineSimilarity<T> does not mention the new Half-specialized path. Please update the summary to note that Half inputs are now vectorized via Half ⇒ short ⇒ float widening.
public static T CosineSimilarity<T>(ReadOnlySpan<T> x, ReadOnlySpan<T> y)
LGTM.
It's a bit unfortunate we need to duplicate the CosineSimilarityCore function here. I expect we could have a general operate-with-m-to-n-intermediate helper, but that would be a larger refactoring (and I don't think it's worth blocking this on making that happen).
I have such a helper in another PR I'll put up for other methods, but applying it to CosineSimilarity (which doesn't use any of the shared helpers or operators) results in round-tripping between Half and float for each operation, which is measurably worse than staying with float as the accumulator. We can subsequently look at a larger refactoring around our aggregations to enable a) making the accumulation configurable and b) getting CosineSimilarity onto the same helpers (which is desirable anyway, as it's not currently as robust in its optimizations as the shared helpers are).
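The Half-as-shorts widening loop the PR describes might be shaped roughly like this. This is a sketch of the loop structure only; WidenHalfToSingle is a hypothetical stand-in for the runtime's existing widening routine (which does the conversion with vector bit manipulation rather than per-element calls):

```csharp
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

// Sketch: reinterpret Half as short, load a vector of shorts, widen to
// two float vectors, and accumulate in float. Only the x-squared-sum
// accumulator is shown; dot and y-squared-sum follow the same pattern.
static void AccumulateSketch(ReadOnlySpan<Half> x)
{
    ReadOnlySpan<short> xBits = MemoryMarshal.Cast<Half, short>(x);
    Vector256<float> xSumSq = Vector256<float>.Zero;
    int i = 0;
    for (; i <= xBits.Length - Vector256<short>.Count; i += Vector256<short>.Count)
    {
        Vector256<short> v = Vector256.Create(xBits.Slice(i, Vector256<short>.Count));
        (Vector256<float> lo, Vector256<float> hi) = WidenHalfToSingle(v);
        xSumSq += lo * lo + hi * hi;
    }
    // ... scalar tail for the remaining elements elided ...
}

// Naive per-element reference implementation, just to make the sketch
// self-contained; hypothetical, not the runtime's actual routine.
static (Vector256<float> Lower, Vector256<float> Upper) WidenHalfToSingle(Vector256<short> v)
{
    Span<float> lo = stackalloc float[Vector256<float>.Count];
    Span<float> hi = stackalloc float[Vector256<float>.Count];
    for (int j = 0; j < Vector256<float>.Count; j++)
    {
        lo[j] = (float)BitConverter.Int16BitsToHalf(v.GetElement(j));
        hi[j] = (float)BitConverter.Int16BitsToHalf(v.GetElement(j + Vector256<float>.Count));
    }
    return (Vector256.Create<float>(lo), Vector256.Create<float>(hi));
}
```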
Before / After benchmark results (tables not captured).