[API Proposal]: AVX-512 IFMA Intrinsics #96476

Open
@MineCake147E

Description

Background and motivation

AVX-512 IFMA is supported by Intel in Cannon Lake and newer architectures, and by AMD in Zen 4.
These instructions are useful for cryptography and big-number arithmetic. They also offer a faster, reduced-width (52-bit) alternative to the VPMULLQ instruction, which is about 5x slower on Intel CPUs than on AMD Zen 4, whereas VPMADD52LUQ completes in only 4 clock cycles.

API Proposal

namespace System.Runtime.Intrinsics.X86
{
    public abstract class Avx512Ifma : Avx512F
    {
        public static new bool IsSupported { get; }
        public static Vector512<ulong> MultiplyAdd52Low(Vector512<ulong> a, Vector512<ulong> b, Vector512<ulong> c);
        public static Vector512<ulong> MultiplyAdd52High(Vector512<ulong> a, Vector512<ulong> b, Vector512<ulong> c);
        public abstract class VL : Avx512F.VL
        {
            public static new bool IsSupported { get; }
            public static Vector256<ulong> MultiplyAdd52Low(Vector256<ulong> a, Vector256<ulong> b, Vector256<ulong> c);
            public static Vector256<ulong> MultiplyAdd52High(Vector256<ulong> a, Vector256<ulong> b, Vector256<ulong> c);
            public static Vector128<ulong> MultiplyAdd52Low(Vector128<ulong> a, Vector128<ulong> b, Vector128<ulong> c);
            public static Vector128<ulong> MultiplyAdd52High(Vector128<ulong> a, Vector128<ulong> b, Vector128<ulong> c);
        }
    }
}
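
To make the intended per-lane behavior concrete, here is a minimal scalar sketch of what the two proposed methods would compute for a single 64-bit lane, following the documented semantics of VPMADD52LUQ and VPMADD52HUQ. The class name `Ifma52Reference` and the helper `Product52` are hypothetical and not part of the proposal; the sketch assumes .NET 7 or later for `UInt128`.

```csharp
using System;

// Hypothetical scalar model of one 64-bit lane; not part of the proposed API.
public static class Ifma52Reference
{
    private const ulong Mask52 = (1UL << 52) - 1;

    // Full 104-bit product of the low 52 bits of b and c (upper 12 bits are ignored).
    private static UInt128 Product52(ulong b, ulong c)
        => (UInt128)(b & Mask52) * (c & Mask52);

    // Models VPMADD52LUQ: a + bits [51:0] of the product, wrapping modulo 2^64.
    public static ulong MultiplyAdd52Low(ulong a, ulong b, ulong c)
        => unchecked(a + (ulong)(Product52(b, c) & Mask52));

    // Models VPMADD52HUQ: a + bits [103:52] of the product, wrapping modulo 2^64.
    public static ulong MultiplyAdd52High(ulong a, ulong b, ulong c)
        => unchecked(a + (ulong)(Product52(b, c) >> 52));
}
```

Note that the first argument is the accumulator and the last two are the multiplicands, matching the operand order of the C++ intrinsics `_mm512_madd52lo_epu64` and `_mm512_madd52hi_epu64`.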

API Usage

zmm0 = Avx512Ifma.MultiplyAdd52Low(zmm0, zmm2, zmm3);
zmm1 = Avx512Ifma.MultiplyAdd52High(zmm1, zmm2, zmm3);
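
The two calls above, taken together, accumulate each lane's 104-bit product as a pair of 52-bit limbs. The following scalar sketch models that pattern with plain arrays standing in for the vector registers. The names `Ifma52UsageSketch`, `AccumulateProducts`, and `Combine` are illustrative assumptions, not part of the proposal, and `UInt128` assumes .NET 7 or later.

```csharp
using System;

// Hypothetical scalar model of the two-call pattern; all names are illustrative only.
public static class Ifma52UsageSketch
{
    private const ulong Mask52 = (1UL << 52) - 1;

    // For each lane i, adds the low 52 bits of b[i] * c[i] to lo[i] and the
    // high 52 bits to hi[i], as the MultiplyAdd52Low/High pair would.
    public static void AccumulateProducts(ulong[] lo, ulong[] hi, ulong[] b, ulong[] c)
    {
        for (int i = 0; i < lo.Length; i++)
        {
            UInt128 product = (UInt128)(b[i] & Mask52) * (c[i] & Mask52);
            lo[i] = unchecked(lo[i] + (ulong)(product & Mask52));
            hi[i] = unchecked(hi[i] + (ulong)(product >> 52));
        }
    }

    // Recombines a limb pair into the value lo + hi * 2^52.
    public static UInt128 Combine(ulong lo, ulong hi)
        => ((UInt128)hi << 52) + lo;
}
```

Because each addend is below 2^52 and the lanes are 64 bits wide, accumulators that start at zero have 12 bits of headroom, so carry propagation between limbs can be deferred across many accumulations.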

An example of a vectorized Montgomery reduction implementation using the equivalent C++ intrinsics:

https://github.com/intel/hexl/blob/2d196fdd71f24511bd7e0e23dc07d37c888f53e7/hexl/util/avx512-util.hpp#L384-L411

Alternative Designs

Risks

None
