Now that the encoding of all SVE instructions is complete (#94549), it is time to expose these instructions through .NET APIs. Below is the categorized list of APIs, with links to the issues where they were approved.
.NET 9 Goal: We aim to complete SVE APIs in .NET 9. SVE2 APIs will be pushed out to .NET 10.
SVE APIs
High Priority SVE APIs
Sve mask (Complete)
Full list
- AbsoluteCompareGreaterThan Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- AbsoluteCompareGreaterThanOrEqual Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- AbsoluteCompareLessThan Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- AbsoluteCompareLessThanOrEqual Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- Compact JIT ARM64-SVE: Add Compact API #102992
- CompareEqual Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- CompareGreaterThan Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- CompareGreaterThanOrEqual Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- CompareLessThan Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- CompareLessThanOrEqual Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- CompareNotEqualTo Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- CompareUnordered Arm64/Sve: Implement AbsoluteCompare* and Compare* APIs #104464
- ConditionalExtractAfterLastActiveElement JIT ARM64-SVE: Add Sve.ConditionalExtract* APIs #104150
- ConditionalExtractAfterLastActiveElementAndReplicate JIT ARM64-SVE: Add Sve.ConditionalExtract* APIs #104150
- ConditionalExtractLastActiveElement JIT ARM64-SVE: Add Sve.ConditionalExtract* APIs #104150
- ConditionalExtractLastActiveElementAndReplicate JIT ARM64-SVE: Add Sve.ConditionalExtract* APIs #104150
- ConditionalSelect Arm64/Sve: Predicated Abs, Predicated/UnPredicated Add, Conditional Select #100743
- CreateBreakAfterMask JIT: Added four SVE CreateBreak* APIs #104184 (Future work) Add optimization for CndSel
- CreateBreakAfterPropagateMask JIT: Added four SVE CreateBreak* APIs #104184 (Future work) Add optimization for CndSel
- CreateBreakBeforeMask JIT: Added four SVE CreateBreak* APIs #104184 (Future work) Add optimization for CndSel
- CreateBreakBeforePropagateMask JIT: Added four SVE CreateBreak* APIs #104184 (Future work) Add optimization for CndSel
- CreateBreakPropagateMask JIT: Added Sve.CreateBreakPropagateMask #104704
- CreateFalseMaskByte JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateFalseMaskDouble JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateFalseMaskInt16 JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateFalseMaskInt32 JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateFalseMaskInt64 JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateFalseMaskSByte JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateFalseMaskSingle JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateFalseMaskUInt16 JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateFalseMaskUInt32 JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateFalseMaskUInt64 JIT ARM64-SVE: Add Sve.CreateFalseMask*() #102076
- CreateMaskForFirstActiveElement JIT: Added SVE APIs CreateMaskForFirstActiveElement and CreateMaskForNextActiveElement #104002
- CreateMaskForNextActiveElement JIT: Added SVE APIs CreateMaskForFirstActiveElement and CreateMaskForNextActiveElement #104002
- CreateTrueMaskByte JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateTrueMaskDouble JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateTrueMaskInt16 JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateTrueMaskInt32 JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateTrueMaskInt64 JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateTrueMaskSByte JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateTrueMaskSingle JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateTrueMaskUInt16 JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateTrueMaskUInt32 JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateTrueMaskUInt64 JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- CreateWhileLessThanMask16Bit JIT ARM64-SVE: Add CreateWhileLessThan* #100949
- CreateWhileLessThanMask32Bit JIT ARM64-SVE: Add CreateWhileLessThan* #100949
- CreateWhileLessThanMask64Bit JIT ARM64-SVE: Add CreateWhileLessThan* #100949
- CreateWhileLessThanMask8Bit JIT ARM64-SVE: Add CreateWhileLessThan* #100949
- CreateWhileLessThanOrEqualMask16Bit JIT ARM64-SVE: Add CreateWhileLessThan* #100949
- CreateWhileLessThanOrEqualMask32Bit JIT ARM64-SVE: Add CreateWhileLessThan* #100949
- CreateWhileLessThanOrEqualMask64Bit JIT ARM64-SVE: Add CreateWhileLessThan* #100949
- CreateWhileLessThanOrEqualMask8Bit JIT ARM64-SVE: Add CreateWhileLessThan* #100949
- (Future item) ExtractAfterLastScalar JIT: ARM64 - Added SVE APIs ExtractLastVector, ExtractLastScalar, ExtractAfterLastVector, ExtractAfterLastScalar #103847
- (Future item) ExtractAfterLastVector JIT: ARM64 - Added SVE APIs ExtractLastVector, ExtractLastScalar, ExtractAfterLastVector, ExtractAfterLastScalar #103847
- (Future item) ExtractLastScalar JIT: ARM64 - Added SVE APIs ExtractLastVector, ExtractLastScalar, ExtractAfterLastVector, ExtractAfterLastScalar #103847
- (Future item) ExtractLastVector JIT: ARM64 - Added SVE APIs ExtractLastVector, ExtractLastScalar, ExtractAfterLastVector, ExtractAfterLastScalar #103847
- ExtractVector JIT: Added SVE APIs - Test*, ExtractVector #103739
- TestAnyTrue JIT: Added SVE APIs - Test*, ExtractVector #103739
- TestFirstTrue JIT: Added SVE APIs - Test*, ExtractVector #103739
- TestLastTrue JIT: Added SVE APIs - Test*, ExtractVector #103739
Sve bitwise (Complete)
Full list
- And JIT ARM64-SVE: Add simple bitwise ops #101762
- AndAcross JIT ARM64-SVE: Add simple bitwise ops #101762
- (Future work) AndNot Need to fix Arm64/Sve: Mirror some of the morph optimizations from AVX512 #101933
- BitwiseClear JIT ARM64-SVE: Add BitwiseClear and BooleanNot #101853
- BooleanNot JIT ARM64-SVE: Add BitwiseClear and BooleanNot #101853
- InsertIntoShiftedVector ARM64-SVE: Add Not, InsertIntoShiftedVector #103725
- Not ARM64-SVE: Add Not, InsertIntoShiftedVector #103725
- Or JIT ARM64-SVE: Add simple bitwise ops #101762
- OrAcross JIT ARM64-SVE: Add simple bitwise ops #101762
- (Future work) OrNot Need to fix Arm64/Sve: Mirror some of the morph optimizations from AVX512 #101933
- ShiftLeftLogical ARM64-SVE: Implement ShiftLeftLogical, ShiftRightArithmetic, ShiftRightLogical #104119
- ShiftRightArithmetic ARM64-SVE: Implement ShiftLeftLogical, ShiftRightArithmetic, ShiftRightLogical #104119
- ShiftRightArithmeticForDivide ARM64-SVE: Add ShiftRightArithmeticForDivide #104279
- ShiftRightLogical ARM64-SVE: Implement ShiftLeftLogical, ShiftRightArithmetic, ShiftRightLogical #104119
- Xor JIT ARM64-SVE: Add simple bitwise ops #101762
- XorAcross JIT ARM64-SVE: Add simple bitwise ops #101762
Sve bitmanipulate (Complete)
Full list
- DuplicateSelectedScalarToVector Add support for Sve.DuplicateSelectedScalarToVector() #103228
- ReverseBits Add support for Sve.ReverseBits() #103806
- ReverseElement Add support for Sve.ReverseElementX() #102991
- ReverseElement16 Add support for Sve.ReverseElementX() #102991
- ReverseElement32 Add support for Sve.ReverseElementX() #102991
- ReverseElement8 Add support for Sve.ReverseElementX() #102991
- Splice Add support for Sve.Splice() #103567
- TransposeEven Add support for Sve.TransposeEven/Odd() #103068
- TransposeOdd Add support for Sve.TransposeEven/Odd() #103068
- UnzipEven Add support for Sve.UnzipEven/Odd & Sve.ZipHighLow #101294
- UnzipOdd Add support for Sve.UnzipEven/Odd & Sve.ZipHighLow #101294
- VectorTableLookup Add support for Sve.VectorTableLookup() #103989
- ZipHigh Add support for Sve.UnzipEven/Odd & Sve.ZipHighLow #101294
- ZipLow Add support for Sve.UnzipEven/Odd & Sve.ZipHighLow #101294
Sve loads (Complete)
Full list
- Compute16BitAddresses Arm64/Sve: Implement Compute*BitAddresses APIs #103040
- Compute32BitAddresses Arm64/Sve: Implement Compute*BitAddresses APIs #103040
- Compute64BitAddresses Arm64/Sve: Implement Compute*BitAddresses APIs #103040
- Compute8BitAddresses Arm64/Sve: Implement Compute*BitAddresses APIs #103040
- LoadVector JIT ARM64-SVE: Add TrueMask and LoadVector #98218
- LoadVector128AndReplicateToVector JIT: Added Sve.LoadVectorNonTemporal/NonFaulting/128AndReplicateToVector APIs #103392
- LoadVectorByteNonFaultingZeroExtendToInt16 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorByteNonFaultingZeroExtendToInt32 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorByteNonFaultingZeroExtendToInt64 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorByteNonFaultingZeroExtendToUInt16 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorByteNonFaultingZeroExtendToUInt32 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorByteNonFaultingZeroExtendToUInt64 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorByteZeroExtendToInt16 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorByteZeroExtendToInt32 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorByteZeroExtendToInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorByteZeroExtendToUInt16 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorByteZeroExtendToUInt32 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorByteZeroExtendToUInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorInt16NonFaultingSignExtendToInt32 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorInt16NonFaultingSignExtendToInt64 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorInt16NonFaultingSignExtendToUInt32 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorInt16NonFaultingSignExtendToUInt64 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorInt16SignExtendToInt32 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorInt16SignExtendToInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorInt16SignExtendToUInt32 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorInt16SignExtendToUInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorInt32NonFaultingSignExtendToInt64 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorInt32NonFaultingSignExtendToUInt64 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorInt32SignExtendToInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorInt32SignExtendToUInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorNonFaulting JIT: Added Sve.LoadVectorNonTemporal/NonFaulting/128AndReplicateToVector APIs #103392
- LoadVectorNonTemporal JIT: Added Sve.LoadVectorNonTemporal/NonFaulting/128AndReplicateToVector APIs #103392
- LoadVectorSByteNonFaultingSignExtendToInt16 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorSByteNonFaultingSignExtendToInt32 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorSByteNonFaultingSignExtendToInt64 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorSByteNonFaultingSignExtendToUInt16 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorSByteNonFaultingSignExtendToUInt32 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorSByteNonFaultingSignExtendToUInt64 Arm64/Sve: Implement LoadVector*NonFaultingSignExtendTo* APIs #102903
- LoadVectorSByteSignExtendToInt16 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorSByteSignExtendToInt32 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorSByteSignExtendToInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorSByteSignExtendToUInt16 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorSByteSignExtendToUInt32 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorSByteSignExtendToUInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorUInt16NonFaultingZeroExtendToInt32 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorUInt16NonFaultingZeroExtendToInt64 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorUInt16NonFaultingZeroExtendToUInt32 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorUInt16NonFaultingZeroExtendToUInt64 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorUInt16ZeroExtendToInt32 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorUInt16ZeroExtendToInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorUInt16ZeroExtendToUInt32 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorUInt16ZeroExtendToUInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorUInt32NonFaultingZeroExtendToInt64 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorUInt32NonFaultingZeroExtendToUInt64 JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
- LoadVectorUInt32ZeroExtendToInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorUInt32ZeroExtendToUInt64 JIT ARM64-SVE: Add Sve.LoadVector*ZeroExtendTo*() #101291
- LoadVectorx2 SVE: Added Load2xVectorAndUnzip, Load3xVectorAndUnzip, Load4xVectorAndUnzip APIs #102180
- LoadVectorx3 SVE: Added Load2xVectorAndUnzip, Load3xVectorAndUnzip, Load4xVectorAndUnzip APIs #102180
- LoadVectorx4 SVE: Added Load2xVectorAndUnzip, Load3xVectorAndUnzip, Load4xVectorAndUnzip APIs #102180
- PrefetchBytes JIT: Added SVE Prefetch* APIs. #103094
- PrefetchInt16 JIT: Added SVE Prefetch* APIs. #103094
- PrefetchInt32 JIT: Added SVE Prefetch* APIs. #103094
- PrefetchInt64 JIT: Added SVE Prefetch* APIs. #103094
Sve stores (Complete)
Full list
- Store Add support for Sve.Store() #102262
- StoreNarrowing Add support for Sve.StoreNarrowing() #102605
- StoreNonTemporal Add support for Sve.StoreNonTemporal() #102769
Sve maths (Complete)
Full list
- Abs Arm64/Sve: Predicated Abs, Predicated/UnPredicated Add, Conditional Select #100743
- AbsoluteDifference Arm64/Sve: Implement some more Math APIs #102170
- Add Arm64/Sve: Predicated Abs, Predicated/UnPredicated Add, Conditional Select #100743
- AddAcross JIT ARM64-SVE: Add AddAcross #101674
- AddSaturate Arm64/Sve: Implement some more Math APIs #102170
- Divide Arm64/Sve: Implement divide/multiply/subtract Math APIs #101578
- DotProduct Arm64/Sve: Implement Math's DotProduct* APIs #102218
- DotProductBySelectedScalar Arm64/Sve: Implement Math's DotProduct* APIs #102218
- FusedMultiplyAdd Arm64/Sve: Implement SVE Math *Multiply* APIs #102007
- FusedMultiplyAddBySelectedScalar Arm64/Sve: Implement SVE Math *Multiply* APIs #102007
- FusedMultiplyAddNegated Arm64/Sve: Implement SVE Math *Multiply* APIs #102007
- FusedMultiplySubtract Arm64/Sve: Implement SVE Math *Multiply* APIs #102007
- FusedMultiplySubtractBySelectedScalar Arm64/Sve: Implement SVE Math *Multiply* APIs #102007
- FusedMultiplySubtractNegated Arm64/Sve: Implement SVE Math *Multiply* APIs #102007
- Max Arm64/Sve: Implement SVE Math Min*/Max* APIs #101859
- MaxAcross Arm64/Sve: Implement SVE Math Min*/Max* APIs #101859
- MaxNumber Arm64/Sve: Implement SVE Math Min*/Max* APIs #101859
- MaxNumberAcross Arm64/Sve: Implement SVE Math Min*/Max* APIs #101859
- Min Arm64/Sve: Implement SVE Math Min*/Max* APIs #101859
- MinAcross Arm64/Sve: Implement SVE Math Min*/Max* APIs #101859
- MinNumber Arm64/Sve: Implement SVE Math Min*/Max* APIs #101859
- MinNumberAcross Arm64/Sve: Implement SVE Math Min*/Max* APIs #101859
- Multiply Arm64/Sve: Implement divide/multiply/subtract Math APIs #101578
- MultiplyAdd Arm64/Sve: Implement SVE Math *Multiply* APIs #102007
- MultiplyBySelectedScalar Arm64/Sve: Implement SVE Math *Multiply* APIs #102007
- MultiplyExtended Arm64/Sve: Implement some more Math APIs #102170
- MultiplySubtract Arm64/Sve: Implement SVE Math *Multiply* APIs #102007
- Negate Arm64/Sve: Implement some more Math APIs #102170
- SignExtend16 Arm64/Sve: Add SignExtend* and ZeroExtend* math APIs #101702
- SignExtend32 Arm64/Sve: Add SignExtend* and ZeroExtend* math APIs #101702
- SignExtend8 Arm64/Sve: Add SignExtend* and ZeroExtend* math APIs #101702
- SignExtendWideningLower Arm64/Sve: Add SignExtendWidening* and ZeroExtendWidening* math APIs #101743
- SignExtendWideningUpper Arm64/Sve: Add SignExtendWidening* and ZeroExtendWidening* math APIs #101743
- Subtract Arm64/Sve: Implement divide/multiply/subtract Math APIs #101578
- SubtractSaturate Arm64/Sve: Implement some more Math APIs #102170
- ZeroExtend16 Arm64/Sve: Add SignExtend* and ZeroExtend* math APIs #101702
- ZeroExtend32 Arm64/Sve: Add SignExtend* and ZeroExtend* math APIs #101702
- ZeroExtend8 Arm64/Sve: Add SignExtend* and ZeroExtend* math APIs #101702
- ZeroExtendWideningLower Arm64/Sve: Add SignExtendWidening* and ZeroExtendWidening* math APIs #101743
- ZeroExtendWideningUpper Arm64/Sve: Add SignExtendWidening* and ZeroExtendWidening* math APIs #101743
Sve counting (Complete)
Full list
- Count16BitElements JIT ARM64-SVE: Add Count*BitElements #101188
- Count32BitElements JIT ARM64-SVE: Add Count*BitElements #101188
- Count64BitElements JIT ARM64-SVE: Add Count*BitElements #101188
- Count8BitElements JIT ARM64-SVE: Add Count*BitElements #101188
- GetActiveElementCount ARM64-SVE: GetActiveElementCount #102813
- LeadingSignCount ARM64-SVE: LeadingSignCount, LeadingZeroCount, PopCount #102548
- LeadingZeroCount ARM64-SVE: LeadingSignCount, LeadingZeroCount, PopCount #102548
- PopCount ARM64-SVE: LeadingSignCount, LeadingZeroCount, PopCount #102548
- SaturatingDecrementBy16BitElementCount JIT ARM64-SVE: Add saturating decrement/increment by element count #102315
- SaturatingDecrementBy32BitElementCount JIT ARM64-SVE: Add saturating decrement/increment by element count #102315
- SaturatingDecrementBy64BitElementCount JIT ARM64-SVE: Add saturating decrement/increment by element count #102315
- SaturatingDecrementBy8BitElementCount JIT ARM64-SVE: Add saturating decrement/increment by element count #102315
- SaturatingDecrementByActiveElementCount ARM64-SVE: Saturating*ByActiveElementCount #102994
- SaturatingIncrementBy16BitElementCount JIT ARM64-SVE: Add saturating decrement/increment by element count #102315
- SaturatingIncrementBy32BitElementCount JIT ARM64-SVE: Add saturating decrement/increment by element count #102315
- SaturatingIncrementBy64BitElementCount JIT ARM64-SVE: Add saturating decrement/increment by element count #102315
- SaturatingIncrementBy8BitElementCount JIT ARM64-SVE: Add saturating decrement/increment by element count #102315
- SaturatingIncrementByActiveElementCount ARM64-SVE: Saturating*ByActiveElementCount #102994
Low Priority SVE APIs
Sve scatterstores (Complete)
Full list
- Scatter Add support for Sve.Scatter() #104555
- Scatter16BitNarrowing Add Sve.ScatterXBitYNarrowing() on Arm64 #104720
- Scatter16BitWithByteOffsetsNarrowing Add Sve.ScatterXBitYNarrowing() on Arm64 #104720
- Scatter32BitNarrowing Add Sve.ScatterXBitYNarrowing() on Arm64 #104720
- Scatter32BitWithByteOffsetsNarrowing Add Sve.ScatterXBitYNarrowing() on Arm64 #104720
- Scatter8BitNarrowing Add Sve.ScatterXBitYNarrowing() on Arm64 #104720
- Scatter8BitWithByteOffsetsNarrowing Add Sve.ScatterXBitYNarrowing() on Arm64 #104720
Sve gatherloads (Complete)
Full list
- GatherPrefetch16Bit ARM64-SVE: GatherPrefetch #103826
- GatherPrefetch32Bit ARM64-SVE: GatherPrefetch #103826
- GatherPrefetch64Bit ARM64-SVE: GatherPrefetch #103826
- GatherPrefetch8Bit ARM64-SVE: GatherPrefetch #103826
- GatherVector ARM64-SVE: gathervector #103159
- GatherVectorByteZeroExtend ARM64-SVE: gathervector extends #103370
- GatherVectorInt16SignExtend ARM64-SVE: gathervector extends #103370
- GatherVectorInt16WithByteOffsetsSignExtend ARM64-SVE: gathervector extends #103370
- GatherVectorInt32SignExtend ARM64-SVE: gathervector extends #103370
- GatherVectorInt32WithByteOffsetsSignExtend ARM64-SVE: gathervector extends #103370
- GatherVectorSByteSignExtend ARM64-SVE: gathervector extends #103370
- GatherVectorUInt16WithByteOffsetsZeroExtend ARM64-SVE: gathervector extends #103370
- GatherVectorUInt16ZeroExtend ARM64-SVE: gathervector extends #103370
- GatherVectorUInt32WithByteOffsetsZeroExtend ARM64-SVE: gathervector extends #103370
- GatherVectorUInt32ZeroExtend ARM64-SVE: gathervector extends #103370
- GatherVectorWithByteOffsets ARM64-SVE: GatherVectorWithByteOffsets #103564
Sve fp (Complete)
Full list
- AddRotateComplex ARM64-SVE: Add AddRotateComplex, MultiplyAddRotateComplex #104926
- AddSequentialAcross ARM64-SVE: Add AddSequentialAcross #104640
- ConvertToDouble ARM64-SVE: Add ConvertToSingle, ConvertToDouble; fix CovertTo* tests #104478
- ConvertToInt32 Arm64/SVE: Implemented ConvertToInt32 and ConvertToUInt32 for float #103098
- ConvertToInt64 Arm64/SVE: Implemented ConvertToint64 and ConvertToUInt64 #104069
- ConvertToSingle ARM64-SVE: Add ConvertToSingle, ConvertToDouble; fix CovertTo* tests #104478
- ConvertToUInt32 Arm64/SVE: Implemented ConvertToInt32 and ConvertToUInt32 for float #103098
- ConvertToUInt64 Arm64/SVE: Implemented ConvertToint64 and ConvertToUInt64 #104069
- FloatingPointExponentialAccelerator ARM64-SVE: Add FloatingPointExponentialAccelerator #104649
- MultiplyAddRotateComplex ARM64-SVE: Add AddRotateComplex, MultiplyAddRotateComplex #104926
- MultiplyAddRotateComplexBySelectedScalar ARM64-SVE: Add MultiplyAddRotateComplexBySelectedScalar #105002
- ReciprocalEstimate Arm64/SVE: Implemented ReciprocalEstimate, ReciprocalExponent, ReciprocalSqrtEstimate, ReciprocalSqrtStep, and ReciprocalStep #103673
- ReciprocalExponent Arm64/SVE: Implemented ReciprocalEstimate, ReciprocalExponent, ReciprocalSqrtEstimate, ReciprocalSqrtStep, and ReciprocalStep #103673
- ReciprocalSqrtEstimate Arm64/SVE: Implemented ReciprocalEstimate, ReciprocalExponent, ReciprocalSqrtEstimate, ReciprocalSqrtStep, and ReciprocalStep #103673
- ReciprocalSqrtStep Arm64/SVE: Implemented ReciprocalEstimate, ReciprocalExponent, ReciprocalSqrtEstimate, ReciprocalSqrtStep, and ReciprocalStep #103673
- ReciprocalStep Arm64/SVE: Implemented ReciprocalEstimate, ReciprocalExponent, ReciprocalSqrtEstimate, ReciprocalSqrtStep, and ReciprocalStep #103673
- RoundAwayFromZero Arm64/SVE: Implemented RoundAwayFromZero, RoundToNearest, RouteToNegativeInfininty, RoundToPositiveInfinity, RoundToZero #103588
- RoundToNearest Arm64/SVE: Implemented RoundAwayFromZero, RoundToNearest, RouteToNegativeInfininty, RoundToPositiveInfinity, RoundToZero #103588
- RoundToNegativeInfinity Arm64/SVE: Implemented RoundAwayFromZero, RoundToNearest, RouteToNegativeInfininty, RoundToPositiveInfinity, RoundToZero #103588
- RoundToPositiveInfinity Arm64/SVE: Implemented RoundAwayFromZero, RoundToNearest, RouteToNegativeInfininty, RoundToPositiveInfinity, RoundToZero #103588
- RoundToZero Arm64/SVE: Implemented RoundAwayFromZero, RoundToNearest, RouteToNegativeInfininty, RoundToPositiveInfinity, RoundToZero #103588
- Scale Arm64/SVE: Implemented Scale and Sqrt #103663
- Sqrt Arm64/SVE: Implemented Scale and Sqrt #103663
- TrigonometricMultiplyAddCoefficient ARM64-SVE: Add TrigonometricMultiplyAddCoefficient #104697
- TrigonometricSelectCoefficient ARM64-SVE: Add TrigonometricSelectCoefficient, TrigonometricStartingValue #104681
- TrigonometricStartingValue ARM64-SVE: Add TrigonometricSelectCoefficient, TrigonometricStartingValue #104681
Sve firstfaulting (Complete)
Full list
- GatherVectorByteZeroExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorFirstFaulting JIT: Added SVE GetFfr, SetFfr, LoadVectorFirstFaulting, GatherVectorFirstFaulting #104502
- GatherVectorInt16SignExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorInt16WithByteOffsetsSignExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorInt32SignExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorInt32WithByteOffsetsSignExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorSByteSignExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorUInt16WithByteOffsetsZeroExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorUInt16ZeroExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorUInt32WithByteOffsetsZeroExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorUInt32ZeroExtendFirstFaulting (Swapnil) Add support for Add Sve.GatherVectorUInt*ZeroExtendFirstFaulting() #105030
- GatherVectorWithByteOffsetFirstFaulting (Aman) ARM64-SVE: Add GatherVectorWithByteOffsetFirstFaulting #106199
- GetFfr JIT: Added SVE GetFfr, SetFfr, LoadVectorFirstFaulting, GatherVectorFirstFaulting #104502
- LoadVectorByteZeroExtendFirstFaulting JIT ARM64-SVE: Add Sve.LoadVector*FirstFaulting APIs #104964
- LoadVectorFirstFaulting JIT: Added SVE GetFfr, SetFfr, LoadVectorFirstFaulting, GatherVectorFirstFaulting #104502
- LoadVectorInt16SignExtendFirstFaulting JIT ARM64-SVE: Add Sve.LoadVector*FirstFaulting APIs #104964
- LoadVectorInt32SignExtendFirstFaulting JIT ARM64-SVE: Add Sve.LoadVector*FirstFaulting APIs #104964
- LoadVectorSByteSignExtendFirstFaulting JIT ARM64-SVE: Add Sve.LoadVector*FirstFaulting APIs #104964
- LoadVectorUInt16ZeroExtendFirstFaulting JIT ARM64-SVE: Add Sve.LoadVector*FirstFaulting APIs #104964
- LoadVectorUInt32ZeroExtendFirstFaulting JIT ARM64-SVE: Add Sve.LoadVector*FirstFaulting APIs #104964
- SetFfr JIT: Added SVE GetFfr, SetFfr, LoadVectorFirstFaulting, GatherVectorFirstFaulting #104502
SVE2 APIs
Full list
Sve2 scatterstores
- Scatter16BitNarrowing
- Scatter16BitWithByteOffsetsNarrowing
- Scatter32BitNarrowing
- Scatter32BitWithByteOffsetsNarrowing
- Scatter8BitNarrowing
- Scatter8BitWithByteOffsetsNarrowing
- ScatterNonTemporal
Sve2 maths
- AbsoluteDifferenceAdd
- AbsoluteDifferenceAddWideningLower
- AbsoluteDifferenceAddWideningUpper
- AbsoluteDifferenceWideningLower
- AbsoluteDifferenceWideningUpper
- AddCarryWideningLower
- AddCarryWideningUpper
- AddHighNarowingLower
- AddHighNarowingUpper
- AddPairwise
- AddPairwiseWidening
- AddSaturate
- AddSaturateWithSignedAddend
- AddSaturateWithUnsignedAddend
- AddWideLower
- AddWideUpper
- AddWideningLower
- AddWideningLowerUpper
- AddWideningUpper
- DotProductComplex
- HalvingAdd
- HalvingSubtract
- HalvingSubtractReversed
- MaxNumberPairwise
- MaxPairwise
- MinNumberPairwise
- MinPairwise
- MultiplyAddBySelectedScalar
- MultiplyAddWideningLower
- MultiplyAddWideningUpper
- MultiplyBySelectedScalar
- MultiplySubtractBySelectedScalar
- MultiplySubtractWideningLower
- MultiplySubtractWideningUpper
- MultiplyWideningLower
- MultiplyWideningUpper
- PolynomialMultiply
- PolynomialMultiplyWideningLower
- PolynomialMultiplyWideningUpper
- RoundingAddHighNarowingLower
- RoundingAddHighNarowingUpper
- RoundingHalvingAdd
- RoundingSubtractHighNarowingLower
- RoundingSubtractHighNarowingUpper
- SaturatingAbs
- SaturatingDoublingMultiplyAddWideningLower
- SaturatingDoublingMultiplyAddWideningLowerUpper
- SaturatingDoublingMultiplyAddWideningUpper
- SaturatingDoublingMultiplyHigh
- SaturatingDoublingMultiplySubtractWideningLower
- SaturatingDoublingMultiplySubtractWideningLowerUpper
- SaturatingDoublingMultiplySubtractWideningUpper
- SaturatingDoublingMultiplyWideningLower
- SaturatingDoublingMultiplyWideningUpper
- SaturatingNegate
- SaturatingRoundingDoublingMultiplyAddHigh
- SaturatingRoundingDoublingMultiplyHigh
- SaturatingRoundingDoublingMultiplySubtractHigh
- SubtractHighNarowingLower
- SubtractHighNarowingUpper
- SubtractSaturate
- SubtractSaturateReversed
- SubtractWideLower
- SubtractWideUpper
- SubtractWideningLower
- SubtractWideningLowerUpper
- SubtractWideningUpper
- SubtractWideningUpperLower
- SubtractWithBorrowWideningLower
- SubtractWithBorrowWideningUpper
Sve2 mask
- CreateWhileGreaterThanMask
- CreateWhileGreaterThanOrEqualMask
- CreateWhileReadAfterWriteMask
- CreateWhileWriteAfterReadMask
- Match
- NoMatch
- SaturatingExtractNarrowingLower
- SaturatingExtractNarrowingUpper
- SaturatingExtractUnsignedNarrowingLower
- SaturatingExtractUnsignedNarrowingUpper
Sve2 gatherloads
- GatherVectorByteZeroExtendNonTemporal
- GatherVectorInt16SignExtendNonTemporal
- GatherVectorInt16WithByteOffsetsSignExtendNonTemporal
- GatherVectorInt32SignExtendNonTemporal
- GatherVectorInt32WithByteOffsetsSignExtendNonTemporal
- GatherVectorNonTemporal
- GatherVectorSByteSignExtendNonTemporal
- GatherVectorUInt16WithByteOffsetsZeroExtendNonTemporal
- GatherVectorUInt16ZeroExtendNonTemporal
- GatherVectorUInt32WithByteOffsetsZeroExtendNonTemporal
- GatherVectorUInt32ZeroExtendNonTemporal
Sve2 fp
- AddRotateComplex
- DownConvertNarrowingUpper
- DownConvertRoundingOdd
- DownConvertRoundingOddUpper
- Log2
- MultiplyAddRotateComplex
- MultiplyAddRotateComplexBySelectedScalar
- ReciprocalEstimate
- ReciprocalSqrtEstimate
- SaturatingComplexAddRotate
- SaturatingRoundingDoublingComplexMultiplyAddHighRotate
- UpConvertWideningUpper
Sve2 counting
- CountMatchingElements
- CountMatchingElementsIn128BitSegments
Sve2 bitwise
- BitwiseClearXor
- BitwiseSelect
- BitwiseSelectLeftInverted
- BitwiseSelectRightInverted
- ShiftArithmeticRounded
- ShiftArithmeticRoundedSaturate
- ShiftArithmeticSaturate
- ShiftLeftAndInsert
- ShiftLeftLogicalSaturate
- ShiftLeftLogicalSaturateUnsigned
- ShiftLeftLogicalWideningEven
- ShiftLeftLogicalWideningOdd
- ShiftLogicalRounded
- ShiftLogicalRoundedSaturate
- ShiftRightAndInsert
- ShiftRightArithmeticAdd
- ShiftRightArithmeticNarrowingSaturateEven
- ShiftRightArithmeticNarrowingSaturateOdd
- ShiftRightArithmeticNarrowingSaturateUnsignedEven
- ShiftRightArithmeticNarrowingSaturateUnsignedOdd
- ShiftRightArithmeticRounded
- ShiftRightArithmeticRoundedAdd
- ShiftRightArithmeticRoundedNarrowingSaturateEven
- ShiftRightArithmeticRoundedNarrowingSaturateOdd
- ShiftRightArithmeticRoundedNarrowingSaturateUnsignedEven
- ShiftRightArithmeticRoundedNarrowingSaturateUnsignedOdd
- ShiftRightLogicalAdd
- ShiftRightLogicalNarrowingEven
- ShiftRightLogicalNarrowingOdd
- ShiftRightLogicalRounded
- ShiftRightLogicalRoundedAdd
- ShiftRightLogicalRoundedNarrowingEven
- ShiftRightLogicalRoundedNarrowingOdd
- ShiftRightLogicalRoundedNarrowingSaturateEven
- ShiftRightLogicalRoundedNarrowingSaturateOdd
- Xor
- XorRotateRight
Sve2 bitmanipulate
- InterleavingXorLowerUpper
- InterleavingXorUpperLower
- MoveWideningLower
- MoveWideningUpper
- VectorTableLookup
- VectorTableLookupExtension
SveBf16
- Bfloat16DotProduct
- Bfloat16MatrixMultiplyAccumulate
- Bfloat16MultiplyAddWideningToSinglePrecisionLower
- Bfloat16MultiplyAddWideningToSinglePrecisionUpper
- ConcatenateEvenInt128FromTwoInputs
- ConcatenateOddInt128FromTwoInputs
- ConditionalExtractAfterLastActiveElement
- ConditionalExtractAfterLastActiveElementAndReplicate
- ConditionalExtractLastActiveElement
- ConditionalExtractLastActiveElementAndReplicate
- ConditionalSelect
- ConvertToBFloat16
- CreateFalseMaskBFloat16
- CreateTrueMaskBFloat16
- CreateWhileReadAfterWriteMask
- CreateWhileWriteAfterReadMask
- DotProductBySelectedScalar
- DownConvertNarrowingUpper
- DuplicateSelectedScalarToVector
- ExtractAfterLastScalar
- ExtractAfterLastVector
- ExtractLastScalar
- ExtractLastVector
- ExtractVector
- GetActiveElementCount
- InsertIntoShiftedVector
- InterleaveEvenInt128FromTwoInputs
- InterleaveInt128FromHighHalvesOfTwoInputs
- InterleaveInt128FromLowHalvesOfTwoInputs
- InterleaveOddInt128FromTwoInputs
- LoadVector
- LoadVector128AndReplicateToVector
- LoadVector256AndReplicateToVector
- LoadVectorFirstFaulting
- LoadVectorNonFaulting
- LoadVectorNonTemporal
- Load2xVector
- Load3xVector
- Load4xVector
- PopCount
- ReverseElement
- Splice
- Store
- StoreNonTemporal
- TransposeEven
- TransposeOdd
- UnzipEven
- UnzipOdd
- VectorTableLookup
- VectorTableLookupExtension
- ZipHigh
- ZipLow
SveF32mm
- MatrixMultiplyAccumulate
SveF64mm
- ConcatenateEvenInt128FromTwoInputs
- ConcatenateOddInt128FromTwoInputs
- InterleaveEvenInt128FromTwoInputs
- InterleaveInt128FromHighHalvesOfTwoInputs
- InterleaveInt128FromLowHalvesOfTwoInputs
- InterleaveOddInt128FromTwoInputs
- LoadVector256AndReplicateToVector
- MatrixMultiplyAccumulate
SveFp16
- Abs
- AbsoluteCompareGreaterThan
- AbsoluteCompareGreaterThanOrEqual
- AbsoluteCompareLessThan
- AbsoluteCompareLessThanOrEqual
- AbsoluteDifference
- Add
- AddAcross
- AddPairwise
- AddRotateComplex
- AddSequentialAcross
- CompareEqual
- CompareGreaterThan
- CompareGreaterThanOrEqual
- CompareLessThan
- CompareLessThanOrEqual
- CompareNotEqualTo
- CompareUnordered
- ConcatenateEvenInt128FromTwoInputs
- ConcatenateOddInt128FromTwoInputs
- ConditionalExtractAfterLastActiveElement
- ConditionalExtractAfterLastActiveElementAndReplicate
- ConditionalExtractLastActiveElement
- ConditionalExtractLastActiveElementAndReplicate
- ConditionalSelect
- ConvertToDouble
- ConvertToHalf
- ConvertToInt16
- ConvertToInt32
- ConvertToInt64
- ConvertToSingle
- ConvertToUInt16
- ConvertToUInt32
- ConvertToUInt64
- CreateFalseMaskHalf
- CreateTrueMaskHalf
- CreateWhileReadAfterWriteMask
- CreateWhileWriteAfterReadMask
- Divide
- DownConvertNarrowingUpper
- DuplicateSelectedScalarToVector
- ExtractAfterLastScalar
- ExtractAfterLastVector
- ExtractLastScalar
- ExtractLastVector
- ExtractVector
- FloatingPointExponentialAccelerator
- FusedMultiplyAdd
- FusedMultiplyAddBySelectedScalar
- FusedMultiplyAddNegated
- FusedMultiplySubtract
- FusedMultiplySubtractBySelectedScalar
- FusedMultiplySubtractNegated
- GetActiveElementCount
- InsertIntoShiftedVector
- InterleaveEvenInt128FromTwoInputs
- InterleaveInt128FromHighHalvesOfTwoInputs
- InterleaveInt128FromLowHalvesOfTwoInputs
- InterleaveOddInt128FromTwoInputs
- LoadVector
- LoadVector128AndReplicateToVector
- LoadVector256AndReplicateToVector
- LoadVectorFirstFaulting
- LoadVectorNonFaulting
- LoadVectorNonTemporal
- LoadVectorx2
- LoadVectorx3
- LoadVectorx4
- Log2
- Max
- MaxAcross
- MaxNumber
- MaxNumberAcross
- MaxNumberPairwise
- MaxPairwise
- Min
- MinAcross
- MinNumber
- MinNumberAcross
- MinNumberPairwise
- MinPairwise
- Multiply
- MultiplyAddRotateComplex
- MultiplyAddRotateComplexBySelectedScalar
- MultiplyAddWideningLower
- MultiplyAddWideningUpper
- MultiplyBySelectedScalar
- MultiplyExtended
- MultiplySubtractWideningLower
- MultiplySubtractWideningUpper
- Negate
- PopCount
- ReciprocalEstimate
- ReciprocalExponent
- ReciprocalSqrtEstimate
- ReciprocalSqrtStep
- ReciprocalStep
- ReverseElement
- RoundAwayFromZero
- RoundToNearest
- RoundToNegativeInfinity
- RoundToPositiveInfinity
- RoundToZero
- Scale
- Splice
- Sqrt
- Store
- StoreNonTemporal
- Subtract
- TransposeEven
- TransposeOdd
- TrigonometricMultiplyAddCoefficient
- TrigonometricSelectCoefficient
- TrigonometricStartingValue
- UnzipEven
- UnzipOdd
- UpConvertWideningUpper
- VectorTableLookup
- VectorTableLookupExtension
- ZipHigh
- ZipLow
SveI8mm
- DotProductSignedUnsigned
- DotProductUnsignedSigned
- MatrixMultiplyAccumulate
- MatrixMultiplyAccumulateUnsignedSigned
Sha3
- BitwiseClearXor
- BitwiseRotateLeftBy1AndXor
- Xor
- XorRotateRight
Sm4
- Sm4EncryptionAndDecryption
- Sm4KeyUpdates
SveAes
- AesInverseMixColumns
- AesMixColumns
- AesSingleRoundDecryption
- AesSingleRoundEncryption
- PolynomialMultiplyWideningLower
- PolynomialMultiplyWideningUpper
SveBitperm
- GatherLowerBitsFromPositionsSelectedByBitmask
- GroupBitsToRightOrLeftAsSelectedByBitmask
- ScatterLowerBitsIntoPositionsSelectedByBitmask
SveSha3
- BitwiseRotateLeftBy1AndXor
SveSm4
- Sm4EncryptionAndDecryption
- Sm4KeyUpdates
Credits to @a74nh for populating the list and for providing files in https://github.com/a74nh/runtime/tree/api_github/sve_api that will help with implementing these APIs.
Contributes to #93095