Skip to content

Add Support for Vector512 in BinaryPrimitives.ReverseEndianness (#108204) #108205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

masterworgen
Copy link

@masterworgen masterworgen commented Sep 24, 2024

Fixes #108204. I add support Vector512 in BinaryPrimitives.ReverseEndianness.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Sep 24, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Sep 24, 2024
@masterworgen
Copy link
Author

@dotnet-policy-service agree

@dotnet-policy-service agree

@huoyaoyuan huoyaoyuan added area-System.Memory and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Sep 24, 2024
@EgorBo
Copy link
Member

EgorBo commented Sep 24, 2024

@EgorBot -intel -perf

using System.Buffers.Binary;
using BenchmarkDotNet.Attributes;

public class Bench
{
    int[] arr1 = new int[1024];
    int[] arr2 = new int[1024];

    [Benchmark]
    public void Reverse() => 
        BinaryPrimitives.ReverseEndianness(arr1, arr2);
}

@EgorBo
Copy link
Member

EgorBo commented Sep 24, 2024

@EgorBot -intel -aws_amd -perf

using System.Buffers.Binary;
using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;

public unsafe class Bench
{
    static void* src;
    static void* dst;

    [GlobalSetup]
    public void Setup()
    {
        src = NativeMemory.AlignedAlloc(1024 * 4, 64);
        dst = NativeMemory.AlignedAlloc(1024 * 4, 64);
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        NativeMemory.AlignedFree(src);
        NativeMemory.AlignedFree(dst);
    }

    [Benchmark]
    public void Reverse() => 
        BinaryPrimitives.ReverseEndianness(
            new ReadOnlySpan<int>(src, 1024), 
            new Span<int>(dst, 1024));
}

@EgorBo
Copy link
Member

EgorBo commented Sep 24, 2024

So, as the benchmarks state, the Avx512 acceleration for such a trivial routine (load->store) only makes sense when data is aligned.

Data is 64 bytes aligned (~2x improvement as expected):

image

Data is 8 bytes aligned (regressed by ~30%):

image

so this change has to make sure data is aligned before doing the main loop.

@tannergooding tannergooding added the needs-author-action An issue or pull request that requires more info or actions from the author. label Jan 10, 2025
@tannergooding
Copy link
Member

This is pending the cleanup suggested by Egor above to ensure we don't regress performance unnecessarily.

More ideally this would be rewritten to use the internal ISimdVector and potentially have a setup closer to what TensorPrimitives uses internally, which handles opportunistic alignment, large and small inputs, etc. However, that may be work that is better done more holistically and not incrementally, picking up and sharing the existing logic that TensorPrimitives has done; so may be better driven explicitly by the .NET team when we're ready to complete such work.

@masterworgen
Copy link
Author

This is pending the cleanup suggested by Egor above to ensure we don't regress performance unnecessarily.

More ideally this would be rewritten to use the internal ISimdVector and potentially have a setup closer to what TensorPrimitives uses internally, which handles opportunistic alignment, large and small inputs, etc. However, that may be work that is better done more holistically and not incrementally, picking up and sharing the existing logic that TensorPrimitives has done; so may be better driven explicitly by the .NET team when we're ready to complete such work.

Should I close this PR then?

@dotnet-policy-service dotnet-policy-service bot removed the needs-author-action An issue or pull request that requires more info or actions from the author. label Jan 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Memory community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Support for Vector512 in BinaryPrimitives.ReverseEndianness.cs
4 participants