Closed
Description
Background and motivation
Constant vectors quite often repeat the same values in every 128-bit lane, which makes them especially verbose with AVX-512. Consider these constants: https://github.com/dotnet/runtime/blob/d29d4d04d20252c283b76fa50f04b6ebf5dc9d91/src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Base64Decoder.cs#L665-L681 It would be nice to have a Create API that automatically broadcasts a lane across the whole vector, to keep the code smaller (and the data section too, but that is irrelevant here).
API Proposal
namespace System.Runtime.Intrinsics
{
    public static partial class Vector128
    {
        public static Vector128<T> Create<T>(Vector64<T> value) where T : struct;
    }

    public static partial class Vector256
    {
        public static Vector256<T> Create<T>(Vector128<T> value) where T : struct;
    }

    public static partial class Vector512
    {
        public static Vector512<T> Create<T>(Vector128<T> value) where T : struct;
        public static Vector512<T> Create<T>(Vector256<T> value) where T : struct;
    }
}
API Usage
Vector512<sbyte> lutShift = Vector512.Create(
    0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0);

// becomes:
Vector512<sbyte> lutShift = Vector512.Create(
    Vector128.Create(0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0));
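For comparison, the same broadcast can be approximated with the existing two-argument `Create(lower, upper)` overloads. The sketch below is only an illustration of the intended semantics, assuming .NET 8 or later (where `Vector512` is available); `LaneBroadcast` and `ToVector512` are hypothetical names that are not part of this proposal, and the helper is limited to `sbyte` to keep it short.

```csharp
using System.Runtime.Intrinsics;

// Hypothetical helper (not part of the proposed API): emulates the proposed
// Vector512.Create(Vector128<sbyte>) using the existing (lower, upper) overloads.
public static class LaneBroadcast
{
    public static Vector512<sbyte> ToVector512(Vector128<sbyte> lane)
    {
        // Duplicate the 128-bit lane into both halves of a 256-bit vector...
        Vector256<sbyte> half = Vector256.Create(lane, lane);

        // ...then duplicate that into both halves of a 512-bit vector,
        // so all four 128-bit lanes hold the same values.
        return Vector512.Create(half, half);
    }
}
```

With such a helper the constant above could be written as `LaneBroadcast.ToVector512(Vector128.Create((sbyte)0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0))`. The proposed overloads would make the helper unnecessary and would let the JIT recognize the pattern directly.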
Alternative Designs
No strong opinion on whether the Vector64 overload should be included.
Risks
No response