-
Notifications
You must be signed in to change notification settings - Fork 5k
Add Support for Vector512 in BinaryPrimitives.ReverseEndianness (#108204) #108205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@dotnet-policy-service agree |
@EgorBot -intel -perf using System.Buffers.Binary;
using BenchmarkDotNet.Attributes;
public class Bench
{
int[] arr1 = new int[1024];
int[] arr2 = new int[1024];
[Benchmark]
public void Reverse() =>
BinaryPrimitives.ReverseEndianness(arr1, arr2);
} |
@EgorBot -intel -aws_amd -perf using System.Buffers.Binary;
using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;
public unsafe class Bench
{
static void* src;
static void* dst;
[GlobalSetup]
public void Setup()
{
src = NativeMemory.AlignedAlloc(1024 * 4, 64);
dst = NativeMemory.AlignedAlloc(1024 * 4, 64);
}
[GlobalCleanup]
public void Cleanup()
{
NativeMemory.AlignedFree(src);
NativeMemory.AlignedFree(dst);
}
[Benchmark]
public void Reverse() =>
BinaryPrimitives.ReverseEndianness(
new ReadOnlySpan<int>(src, 1024),
new Span<int>(dst, 1024));
} |
So, as the benchmarks state, the Avx512 acceleration for such a trivial routine (load->store) only makes sense when data is aligned. Data is 64 bytes aligned (~2x improvement as expected):Data is 8 bytes aligned (regressed by ~30%):so this change has to make sure data is aligned before doing the main loop. |
This is pending the cleanup suggested by Egor above to ensure we don't regress performance unnecessarily. More ideally this would be rewritten to use the internal |
Should I close this PR then? |
Fixes #108204. I add support Vector512 in BinaryPrimitives.ReverseEndianness.