-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vector128.ShuffleUnsafe #88559
Vector128.ShuffleUnsafe #88559
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,6 +5,8 @@ | |
using System.Numerics; | ||
using System.Runtime.CompilerServices; | ||
using System.Runtime.InteropServices; | ||
using System.Runtime.Intrinsics.Arm; | ||
using System.Runtime.Intrinsics.Wasm; | ||
using System.Runtime.Intrinsics.X86; | ||
|
||
namespace System.Runtime.Intrinsics | ||
|
@@ -2351,6 +2353,45 @@ public static Vector256<sbyte> Shuffle(Vector256<sbyte> vector, Vector256<sbyte> | |
return result; | ||
} | ||
|
||
/// <summary>Creates a new vector by selecting values from an input vector using a set of indices. | ||
/// Behavior is platform-dependent for out-of-range indices.</summary> | ||
/// <param name="vector">The input vector from which values are selected.</param> | ||
/// <param name="indices">The per-element indices used to select a value from <paramref name="vector" />.</param> | ||
/// <returns>A new vector containing the values from <paramref name="vector" /> selected by the given <paramref name="indices" />.</returns> | ||
/// <remarks>Unlike Shuffle, this method delegates to the underlying hardware intrinsic without ensuring that <paramref name="indices"/> are normalized to [0, 31]. | ||
/// On hardware with <see cref="Avx2"/> support, indices are treated as modulo 32, and if the high bit is set, the result will be set to 0 for that element. | ||
/// On hardware with <see cref="AdvSimd.Arm64"/> support, this method behaves the same as Shuffle.</remarks> | ||
/// On hardware with <see cref="PackedSimd"/> support, indices are treated as modulo 32. | ||
Comment on lines
+2361
to
+2364
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: This remarks is confusing.
We can probably exclude the Arm64/Wasm callouts for V256/V512 as they aren't important. They can be combined into a general statement that other platforms will treat this the same as |
||
[MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
[CompExactlyDependsOn(typeof(Avx2))] | ||
[CompExactlyDependsOn(typeof(AdvSimd))] | ||
[CompExactlyDependsOn(typeof(AdvSimd.Arm64))] | ||
[CompExactlyDependsOn(typeof(PackedSimd))] | ||
public static Vector256<byte> ShuffleUnsafe(Vector256<byte> vector, Vector256<byte> indices) | ||
{ | ||
if (Avx2.IsSupported) | ||
{ | ||
return Avx2.Shuffle(vector, indices); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These hardware intrinsics have different semantics than Vector256/512 Shuffles. Avx2.Shuffle is effectively two Vector128.Shuffles on each half. The indices are still [0, 15]. Similar for Avx512. Vector256.Shuffle on the other hand works with the whole vector and allows you to pick and index [0, 31]. |
||
} | ||
|
||
if (AdvSimd.Arm64.IsSupported) | ||
{ | ||
(Vector128<byte>, Vector128<byte>) table = (vector.GetLower(), vector.GetUpper()); | ||
return Vector256.Create( | ||
AdvSimd.Arm64.VectorTableLookup(table, indices.GetLower()), | ||
AdvSimd.Arm64.VectorTableLookup(table, indices.GetUpper())); | ||
} | ||
|
||
if (PackedSimd.IsSupported) | ||
{ | ||
return Vector256.Create( | ||
PackedSimd.Shuffle(vector.GetLower(), vector.GetUpper(), indices.GetLower()), | ||
PackedSimd.Shuffle(vector.GetLower(), vector.GetUpper(), indices.GetUpper())); | ||
} | ||
Comment on lines
+2377
to
+2390
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We shouldn't need this additional handling for It's just causing the JIT to spend more time doing dead code elimination, managing the inliner budget, etc. |
||
|
||
return Shuffle(vector, indices); | ||
} | ||
|
||
/// <summary>Creates a new vector by selecting values from an input vector using a set of indices.</summary> | ||
/// <param name="vector">The input vector from which values are selected.</param> | ||
/// <param name="indices">The per-element indices used to select a value from <paramref name="vector" />.</param> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anywhere else this should be consumed in dotnet/runtime as part of making it public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I updated all such places when adding the internal helper as part of #80963
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I know you updated places in corelib. I was wondering if there were places outside of corelib you couldn't update. Sounds like no.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two places I saw where we still use the platform-specific instructions directly (BitArray and OptimizedInboxTextEncoder), but both of those are in larger platform-specific blocks of logic, so there was no reason to replace them.