Skip to content

Conversation

@SwapnilGaikwad
Copy link
Contributor

Contribute towards #99957.

@ghost
Copy link

ghost commented May 15, 2024

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 15, 2024
@SwapnilGaikwad
Copy link
Contributor Author

@a74nh @kunalspathak @dotnet/arm64-contrib @arch-arm64-sve

@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

@kunalspathak kunalspathak self-requested a review May 15, 2024 17:14
Copy link
Contributor

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big PR and thanks for working on it. Overall, looks good except some nit comments. Also, please run the tests, including the stress tests and share the results.

@kunalspathak kunalspathak added the arm-sve Work related to arm64 SVE/SVE2 support label May 15, 2024
break;
}

case NI_Sve_Store:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at #102180 reminded me that we can do the following....

For NI_Sve_Store remove the Special Code Gen flag and this case statement. This will cause the generic code to call emitIns_R_R_R().

You can now add a special case in emitIns_R_R_R() so that for NI_Sve_Store it just calls down to emitIns_R_R_R_I() with 0 for the immediate. Eg:

case INS_sve_ld1b:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious...

NI_AdvSimd_Store uses gtNewSimdStoreNode() which eventually gets generated in genCodeForStoreInd(). This has extra code, for example, isvolatile checks.

Meanwhile NI_AdvSimd_StoreSelectedScalar imports like a normal hwintrinsic and gets generated in hwintrinsiccodegenarm64.cpp. This is the same for the various other stores.

Is there a special reason for using genCodeForStoreInd?

Why do the other stores not use genCodeForStoreInd?

Should NI_Sve_Store use genCodeForStoreInd?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have to double check on this one. Opened #102347

@SwapnilGaikwad SwapnilGaikwad marked this pull request as ready for review May 16, 2024 13:21
@SwapnilGaikwad
Copy link
Contributor Author

SwapnilGaikwad commented May 16, 2024

please run the tests, including the stress tests and share the results

Stress tests didn't flag anything
===================Running default===================
------------------- {} -------------------
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_float() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_double() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_sbyte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_short() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_int() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_long() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_byte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_ushort() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_uint() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStore_ulong() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_float() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_double() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_sbyte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_short() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_int() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_long() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_byte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_ushort() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_uint() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex2_ulong() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_float() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_double() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_sbyte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_short() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_int() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_long() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_byte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_ushort() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_uint() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex3_ulong() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_float() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_double() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_sbyte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_short() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_int() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_long() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_byte() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_ushort() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_uint() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.SveStorex4_ulong() : 7
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
------------------- {'JitStress': '2'} -------------------
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------

@kunalspathak
Copy link
Contributor

As noted in #102180 (comment), these should be StoreAndZip APIs instead of Store APIs because of the operation it is performing.

image

https://dougallj.github.io/asil/doc/st4d_z_p_bi_64.html

Similar API was named as StoreAndZip in AdvSimd.

break;
}

case NI_Sve_Store:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have to double check on this one. Opened #102347

Copy link
Contributor

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@kunalspathak kunalspathak merged commit f818517 into dotnet:main May 22, 2024
@SwapnilGaikwad SwapnilGaikwad deleted the github-sve-store branch May 23, 2024 10:20
steveharter pushed a commit to steveharter/runtime that referenced this pull request May 28, 2024
* Add support for Sve.Store()

* Fix formatting issues

* Remove incorrect instructions from comment

* Rename Sve.Store() -> Sve.StoreAndZip()

* Refactor test templates
Ruihan-Yin pushed a commit to Ruihan-Yin/runtime that referenced this pull request May 30, 2024
* Add support for Sve.Store()

* Fix formatting issues

* Remove incorrect instructions from comment

* Rename Sve.Store() -> Sve.StoreAndZip()

* Refactor test templates
@github-actions github-actions bot locked and limited conversation to collaborators Jun 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-System.Runtime.Intrinsics arm-sve Work related to arm64 SVE/SVE2 support community-contribution Indicates that the PR has been added by a community member new-api-needs-documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants