-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Expose the x86 HWIntrinsics via a set of class hierarchies matching the underlying ISA hierarchies #19186
Conversation
@@ -11,10 +11,10 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel AES hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Aes | |||
public abstract class Aes : Sse2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12.13.4 Checking for AESNI Support
Before an application attempts to use AESNI instructions or PCLMULQDQ, the application should follow the steps
illustrated in Section 11.6.2, “Checking for SSE/SSE2 Support.” Next, use the additional step provided below:
Check that the processor supports AESNI (if CPUID.01H:ECX.AESNI[bit 25] = 1); check that the processor
supports PCLMULQDQ (if CPUID.01H:ECX.PCLMULQDQ[bit 1] = 1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with Intel that there is not a co-dependency between AESNI and PCLMULQDQ.
CC. @fiigii
@@ -11,10 +11,10 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel AVX hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Avx | |||
public abstract class Avx : Sse42 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
14.3 DETECTION OF AVX INSTRUCTIONS
Intel AVX instructions operate on the 256-bit YMM register state. Application detection of new instruction extensions
operating on the YMM state follows the general procedural flow in Figure 14-2.
Prior to using AVX, the application must identify that the operating system supports the XGETBV instruction, the
YMM register state, in addition to processor’s support for YMM state management using XSAVE/XRSTOR and AVX
instructions. The following simplified sequence accomplishes both and is strongly recommended.
- Detect CPUID.1:ECX.OSXSAVE[bit 27] = 1 (XGETBV enabled for application use1)
- Issue XGETBV and verify that XCR0[2:1] = ‘11b’ (XMM state and YMM state are enabled by OS).
- detect CPUID.1:ECX.AVX[bit 28] = 1 (AVX instructions supported).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with Intel that AVX has an implicit dependency on Sse42 (and prior) due to extending the existing instructions.
CC. @fiigii
@@ -11,9 +11,9 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel AVX2 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Avx2 | |||
public abstract class Avx2 : Avx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
14.7.1 Detection of AVX2
Hardware support for AVX2 is indicated by CPUID.(EAX=07H, ECX=0H):EBX.AVX2[bit 5]=1.
Application Software must identify that hardware supports AVX, after that it must also detect support for AVX2 by
checking CPUID.(EAX=07H, ECX=0H):EBX.AVX2[bit 5].
@@ -455,7 +455,7 @@ public static class Avx2 | |||
/// __m128i _mm256_extracti128_si256 (__m256i a, const int imm8) | |||
/// VEXTRACTI128 m128, ymm, imm8 | |||
/// </summary> | |||
public static unsafe void ExtractVector128(sbyte* address, Vector256<sbyte> value, byte index) { throw new PlatformNotSupportedException(); } | |||
new public static unsafe void ExtractVector128(sbyte* address, Vector256<sbyte> value, byte index) { throw new PlatformNotSupportedException(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We expose both AVX and AVX2 versions of these functions...
The AVX version uses VEXTRACTF128
(@fiigii, does this incur any kind of stall?) while the AVX2 version uses VEXTRACTI128
.
We should determine if we want to:
- Expose as is
- Expose only on AVX and document that it will use
VEXTRACTI128
on AVX2 hardware - Don't expose these overloads on AVX, and require users reinterpret-cast to
float
ordouble
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this incur any kind of stall?
Yes, but just a little bit. Usually, that is just one cycle.
@@ -11,7 +11,7 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel BMI1 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Bmi1 | |||
public abstract class Bmi1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5.1.16.1 Detection of VEX-encoded GPR Instructions, LZCNT and TZCNT, PREFETCHW
VEX-encoded general-purpose instructions do not operate on any vector registers.
There are separate feature flags for the following subsets of instructions that operate on general purpose registers,
and the detection requirements for hardware support are:
CPUID.(EAX=07H, ECX=0H):EBX.BMI1[bit 3]: if 1 indicates the processor supports the first group of advanced bit
manipulation extensions (ANDN, BEXTR, BLSI, BLSMSK, BLSR, TZCNT);
CPUID.(EAX=07H, ECX=0H):EBX.BMI2[bit 8]: if 1 indicates the processor supports the second group of advanced
bit manipulation extensions (BZHI, MULX, PDEP, PEXT, RORX, SARX, SHLX, SHRX);
CPUID.EAX=80000001H:ECX.LZCNT[bit 5]: if 1 indicates the processor supports the LZCNT instruction.
CPUID.EAX=80000001H:ECX.PREFTEHCHW[bit 8]: if 1 indicates the processor supports the PREFTEHCHW instruction.
CPUID.(EAX=07H, ECX=0H):ECX.PREFTEHCHWT1[bit 0]: if 1 indicates the processor supports the
PREFTEHCHWT1 instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intel is following up to confirm that Bmi1
and Bmi2
, despite using the VEX-encoding for several instructions, does not have a dependency on Avx
. The Celeron G4920 is one such processor that appears to support BMI1/BMI2, but not AVX.
CC. @fiigii
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still waiting for the hardware to experiment, sorry for the delay.
@@ -11,7 +11,7 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel BMI2 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Bmi2 | |||
public abstract class Bmi2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5.1.16.1 Detection of VEX-encoded GPR Instructions, LZCNT and TZCNT, PREFETCHW
VEX-encoded general-purpose instructions do not operate on any vector registers.
There are separate feature flags for the following subsets of instructions that operate on general purpose registers,
and the detection requirements for hardware support are:
CPUID.(EAX=07H, ECX=0H):EBX.BMI1[bit 3]: if 1 indicates the processor supports the first group of advanced bit
manipulation extensions (ANDN, BEXTR, BLSI, BLSMSK, BLSR, TZCNT);
CPUID.(EAX=07H, ECX=0H):EBX.BMI2[bit 8]: if 1 indicates the processor supports the second group of advanced
bit manipulation extensions (BZHI, MULX, PDEP, PEXT, RORX, SARX, SHLX, SHRX);
CPUID.EAX=80000001H:ECX.LZCNT[bit 5]: if 1 indicates the processor supports the LZCNT instruction.
CPUID.EAX=80000001H:ECX.PREFTEHCHW[bit 8]: if 1 indicates the processor supports the PREFTEHCHW instruction.
CPUID.(EAX=07H, ECX=0H):ECX.PREFTEHCHWT1[bit 0]: if 1 indicates the processor supports the
PREFTEHCHWT1 instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intel is following up to confirm that Bmi1 and Bmi2, despite using the VEX-encoding for several instructions, does not have a dependency on Avx. The Celeron G4920 is one such processor that appears to support BMI1/BMI2, but not AVX.
CC. @fiigii
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the names, there is no dependency for Bmi2
on Bmi1
@@ -10,9 +10,9 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// <summary> | |||
/// This class provides access to Intel FMA hardware instructions via intrinsics | |||
/// </summary> | |||
public static class Fma | |||
public abstract class Fma : Avx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
14.5.3 Detection of FMA
Hardware support for FMA is indicated by CPUID.1:ECX.FMA[bit 12]=1.
Application Software must identify that hardware supports AVX, after that it must also detect support for FMA by
CPUID.1:ECX.FMA[bit 12].
@@ -11,7 +11,7 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel LZCNT hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Lzcnt | |||
public abstract class Lzcnt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5.1.16.1 Detection of VEX-encoded GPR Instructions, LZCNT and TZCNT, PREFETCHW
VEX-encoded general-purpose instructions do not operate on any vector registers.
There are separate feature flags for the following subsets of instructions that operate on general purpose registers,
and the detection requirements for hardware support are:
CPUID.(EAX=07H, ECX=0H):EBX.BMI1[bit 3]: if 1 indicates the processor supports the first group of advanced bit
manipulation extensions (ANDN, BEXTR, BLSI, BLSMSK, BLSR, TZCNT);
CPUID.(EAX=07H, ECX=0H):EBX.BMI2[bit 8]: if 1 indicates the processor supports the second group of advanced
bit manipulation extensions (BZHI, MULX, PDEP, PEXT, RORX, SARX, SHLX, SHRX);
CPUID.EAX=80000001H:ECX.LZCNT[bit 5]: if 1 indicates the processor supports the LZCNT instruction.
CPUID.EAX=80000001H:ECX.PREFTEHCHW[bit 8]: if 1 indicates the processor supports the PREFTEHCHW instruction.
CPUID.(EAX=07H, ECX=0H):ECX.PREFTEHCHWT1[bit 0]: if 1 indicates the processor supports the
PREFTEHCHWT1 instruction.
@@ -11,9 +11,9 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel PCLMULQDQ hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Pclmulqdq | |||
public abstract class Pclmulqdq : Sse2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12.13.4 Checking for AESNI Support
Before an application attempts to use AESNI instructions or PCLMULQDQ, the application should follow the steps
illustrated in Section 11.6.2, “Checking for SSE/SSE2 Support.” Next, use the additional step provided below:
Check that the processor supports AESNI (if CPUID.01H:ECX.AESNI[bit 25] = 1); check that the processor
supports PCLMULQDQ (if CPUID.01H:ECX.PCLMULQDQ[bit 1] = 1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with Intel that there is not a co-dependency between AESNI and PCLMULQDQ.
CC. @fiigii
@@ -10,9 +10,9 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel POPCNT hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Popcnt | |||
public abstract class Popcnt : Sse42 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12.12.3 Checking for SSE4.2 Support
Before an application attempts to use the following SSE4.2 instructions: PCMPESTRI/PCMPESTRM/PCMPISTRI/
PCMPISTRM, PCMPGTQ;the application should follow the steps illustrated in Section 11.6.2, “Checking for
SSE/SSE2 Support.” Next, use the additional step provided below:
Check that the processor supports SSE4.2 (if CPUID.01H:ECX.SSE4_2[bit 20] = 1), SSE4.1 (if
CPUID.01H:ECX.SSE4_1[bit 19] = 1), and SSSE3 (if CPUID.01H:ECX.SSSE3[bit 9] = 1).
Before an application attempts to use the CRC32 instruction, it must check that the processor supports SSE4.2 (if
CPUID.01H:ECX.SSE4_2[bit 20] = 1).
Before an application attempts to use the POPCNT instruction, it must check that the processor supports SSE4.2 (if
CPUID.01H:ECX.SSE4_2[bit 20] = 1) and POPCNT (if CPUID.01H:ECX.POPCNT[bit 23] = 1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with Intel that the SSE4.2 dependency includes SSE-SSE4.1.
CC. @fiigii
@@ -11,7 +11,7 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel SSE hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Sse | |||
public abstract class Sse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
11.6.2 Checking for SSE/SSE2 Support
Before an application attempts to use the SSE and/or SSE2 extensions, it should check that they are present on the
processor:
-
Check that the processor supports the CPUID instruction. Bit 21 of the EFLAGS register can be used to check
processor’s support the CPUID instruction. -
Check that the processor supports the SSE and/or SSE2 extensions (true if CPUID.01H:EDX.SSE[bit 25] = 1
and/or CPUID.01H:EDX.SSE2[bit 26] = 1).
Operating system must provide system level support for handling SSE state, exceptions before an application can
use the SSE and/or SSE2 extensions (see Chapter 13 in the Intel® 64 and IA-32 Architectures Software Developer’s
Manual, Volume 3A).
@@ -11,10 +11,10 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel SSE2 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Sse2 | |||
public abstract class Sse2 : Sse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
11.6.2 Checking for SSE/SSE2 Support
Before an application attempts to use the SSE and/or SSE2 extensions, it should check that they are present on the
processor:
-
Check that the processor supports the CPUID instruction. Bit 21 of the EFLAGS register can be used to check
processor’s support the CPUID instruction. -
Check that the processor supports the SSE and/or SSE2 extensions (true if CPUID.01H:EDX.SSE[bit 25] = 1
and/or CPUID.01H:EDX.SSE2[bit 26] = 1).
Operating system must provide system level support for handling SSE state, exceptions before an application can
use the SSE and/or SSE2 extensions (see Chapter 13 in the Intel® 64 and IA-32 Architectures Software Developer’s
Manual, Volume 3A).
@@ -11,10 +11,10 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel SSE3 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Sse3 | |||
public abstract class Sse3 : Sse2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12.4.2 Checking for SSE3 Support
Before an application attempts to use the SIMD subset of SSE3 extensions, the application should follow the steps
illustrated in Section 11.6.2, “Checking for SSE/SSE2 Support.” Next, use the additional step provided below:
• Check that the processor supports the SIMD and x87 SSE3 extensions (if CPUID.01H:ECX.SSE3[bit 0] = 1).
An operating systems that provides application support for SSE, SSE2 also provides sufficient application support
for SSE3. To use FISTTP, software only needs to check support for SSE3.
In the initial implementation of MONITOR and MWAIT, these two instructions are available to ring 0 and conditionally
available at ring level greater than 0. Before an application attempts to use the MONITOR and MWAIT instructions,
the application should use the following steps:
- Check that the processor supports MONITOR and MWAIT. If CPUID.01H:ECX.MONITOR[bit 3] = 1, MONITOR
and MWAIT are available at ring 0. - Query the smallest and largest line size that MONITOR uses. Use CPUID.05H:EAX.smallest[bits
15:0];EBX.largest[bits15:0]. Values are returned in bytes in EAX and EBX. - Ensure the memory address range(s) that will be supplied to MONITOR meets memory type requirements.
MONITOR and MWAIT are targeted for system software that supports efficient thread synchronization, See Chapter
13 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A for details.
@@ -11,9 +11,9 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel SSE4.1 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Sse41 | |||
public abstract class Sse41 : Ssse3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12.12.2 Checking for SSE4.1 Support
Before an application attempts to use SSE4.1 instructions, the application should follow the steps illustrated in
Section 11.6.2, “Checking for SSE/SSE2 Support.” Next, use the additional step provided below:
Check that the processor supports SSE4.1 (if CPUID.01H:ECX.SSE4_1[bit 19] = 1), SSE3 (if
CPUID.01H:ECX.SSE3[bit 0] = 1), and SSSE3 (if CPUID.01H:ECX.SSSE3[bit 9] = 1).
@@ -11,9 +11,9 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel SSE4.2 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Sse42 | |||
public abstract class Sse42 : Sse41 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12.12.3 Checking for SSE4.2 Support
Before an application attempts to use the following SSE4.2 instructions: PCMPESTRI/PCMPESTRM/PCMPISTRI/
PCMPISTRM, PCMPGTQ;the application should follow the steps illustrated in Section 11.6.2, “Checking for
SSE/SSE2 Support.” Next, use the additional step provided below:
Check that the processor supports SSE4.2 (if CPUID.01H:ECX.SSE4_2[bit 20] = 1), SSE4.1 (if
CPUID.01H:ECX.SSE4_1[bit 19] = 1), and SSSE3 (if CPUID.01H:ECX.SSSE3[bit 9] = 1).
Before an application attempts to use the CRC32 instruction, it must check that the processor supports SSE4.2 (if
CPUID.01H:ECX.SSE4_2[bit 20] = 1).
Before an application attempts to use the POPCNT instruction, it must check that the processor supports SSE4.2 (if
CPUID.01H:ECX.SSE4_2[bit 20] = 1) and POPCNT (if CPUID.01H:ECX.POPCNT[bit 23] = 1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with Intel that Sse4.2
requires Ssse3
, despite that it is not explicitly listed.
CC. @fiigii
@@ -11,10 +11,10 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel SSSE3 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Ssse3 | |||
public abstract class Ssse3 : Sse2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12.7.2 Checking for SSSE3 Support
Before an application attempts to use the SSSE3 extensions, the application should follow the steps illustrated in
Section 11.6.2, “Checking for SSE/SSE2 Support.” Next, use the additional step provided below:
• Check that the processor supports SSSE3 (if CPUID.01H:ECX.SSSE3[bit 9] = 1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with Intel that Ssse3
requires Sse3
, despite that it is not explicitly listed.
CC. @fiigii
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
12.7.1 Guidelines for Using SSSE3 Extensions
The following guidelines describe how to maximize the benefits of using SSSE3 extensions:
• Check that the processor supports SSSE3 extensions.
• Ensure that your operating system supports SSE/SSE2/SSE3/SSSE3 extensions. (Operating system support
for the SSE extensions implies sufficient support for SSE2, SSE3, and SSSE3.)
• Employ the optimization and scheduling techniques described in the Intel® 64 and IA-32 Architectures Optimization
Reference Manual (see Section 1.4, “Related Literature”).
The |
AVX-512 support, if/when it comes, is a good point to discuss now. It exposes multiple "levels", where subsequent levels depend on the "base" (AVX-512F) layer.
However, AVX-512F itself depends on both |
There is some additional API cleanup we should do, but that should be done separately as it is independent of making the intrinsic classes use inheritance. |
There is some additional work in the runtime (both VM and JIT) to ensure that the ISA checks are accurate (https://github.com/dotnet/coreclr/issues/18445). In the case of the JIT, the CC. @CarolEidt, @AndyAyersMS |
;TL;DR; This is a starting point for reasoning about the change in the proposal https://github.com/dotnet/corefx/issues/29247
if (SSE3.IsSupported)
{
var v1 = SSE3.SetAllVector128(0.5f); // comes from SSE
var v2 = SSE3.HorizontalAdd(v1, v1); // comes from SSE3
}
public class Sse
{ ...
}
public class Sse2 : Sse
{ ...
}
public class Sse3 : Sse2
{ ...
} IMHO we should look not from the perspective of elegant engineering but from perspective of devs using intrinsics. Having a possibility to do Otherwise the above work seems to me a bit of over engineering since assumptions made at the start of design process are not necessarily valid. |
@tannergooding Thanks for the work. However, I still doubt the necessity of ISA inherent design that introduces so much complexity (e.g., the runtime/compiler have to specially treat this |
I believe the |
Yes, this is what I was saying. The problem is that this is not obvious and it would be better to fix these flags now, before we ship 3.0 |
@fiigii, yes there were several other ideas passed around, but the API reviewers discussed the options presented and, ultimately, it was determined that we should move forward with this approach and see how it works out. This was because exposing the ISAs via the proper inheritance hierarchy addresses many of the root concerns that were raised. This approach means that:
|
@4creators, could you elaborate? What assumptions would no longer be valid here? |
@@ -11,10 +11,10 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel AES hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Aes | |||
public abstract class Aes : Sse2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make private
or internal
constructors on all of them as well? That way no one tries creating the class, or a class inheriting from this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
{ | ||
public static bool IsSupported { get => IsSupported; } | ||
|
||
new public static bool IsSupported { get => IsSupported; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe public new
is the "typical" order. See dotnet/machinelearning#557 (comment) for the same discussion from last week in dotnet/machinelearning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed to match the IDE default.
@@ -11,7 +11,7 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel BMI2 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Bmi2 | |||
public abstract class Bmi2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(for my knowledge) Bmi2
shouldn't inherit from Bmi1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -11,9 +11,9 @@ namespace System.Runtime.Intrinsics.X86 | |||
/// This class provides access to Intel SSE4.1 hardware instructions via intrinsics | |||
/// </summary> | |||
[CLSCompliant(false)] | |||
public static class Sse41 | |||
public abstract class Sse41 : Ssse3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the place where things get a little hairy with inheritance, right?
Sse4.1
needs to check for both Sse3
and Ssse3
. But Ssse3
and Sse3
have no direct releationship. So really, we would need multiple inheritance here to correctly model this hierarchy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ssse3
has a direct dependency on Sse3
and this was confirmed with Intel. This is a case where the manual is not self-consistent (12.7.1 lists the Sse3 dependency, even though 12.7.2 does not).
The only case where multiple inheritance is currently needed is for AVX-512F (which has not been reviewed/etc). Because of a dependency on both AVX2 and FMA, where they are siblings (both inherit from AVX). -- I called this bit out here: #19186 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ssse3 has a direct dependency on Sse3
I'm not seeing that modeled here:
public abstract class Ssse3 : Sse2
{
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mistake on my end. Fixed. -- I also went through the other ISAs and ensures they looked correct.
Responded to all feedback so far. |
…he underlying ISA hierarchies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good from my end.
Thanks, @tannergooding.
@CarolEidt, are there any concerns on your end for this? |
LGTM - thanks for all the work, and the detailed notes. |
This implements https://github.com/dotnet/corefx/issues/29247 -- It was indicated that we would like to move forward with this.