Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure GetElement are marked as having an immediate operand for Arm64 #104204

Merged
merged 2 commits into from
Jun 30, 2024

Conversation

tannergooding
Copy link
Member

No description provided.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 30, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@tannergooding tannergooding marked this pull request as ready for review June 30, 2024 05:29
@tannergooding
Copy link
Member Author

CC. @EgorBo, this fixes the minor codegen issue we found

@tannergooding
Copy link
Member Author

tannergooding commented Jun 30, 2024

Guess it's not so minor, since we weren't doing containment at all (for the address or for the index):

Overall (-34,888 bytes)
MinOpts (-5,128 bytes)
FullOpts (-29,760 bytes)

Most of these are simple cases such as:

-            mov     w0, #2
             dup     s16, v16.s[2]

-or-

-            mov     w6, #1
             umov    x6, v16.d[1]

But then we also get cases such as:

-            ldr     q17, [fp, #0x30]	// [V02 loc0]
-            mov     w0, #3
-            dup     s17, v17.s[3]
+            ldr     s17, [fp, #0x3C]	// [V02 loc0+0x0c]

There's a lot of code that ends up extracting or moving elements, so this gets some decent payoffs overall.

Edit: The below is part of a blr (method call) which is why its still getting loaded, it is actually used so we're all good 🎉
It does, however, look like there's some other missed containment opportunity as well:

-            mov     w1, #2
             mov     v8.d[1], v9.d[0]
             dup     s0, v8.s[2]
+            mov     w1, #2
             ldr     x0, [x20, #0x08]

Going to see if I can identify where that is, but I wouldn't block this on that.

@@ -155,7 +155,7 @@ HARDWARE_INTRINSIC(Vector128, EqualsAny,
HARDWARE_INTRINSIC(Vector128, ExtractMostSignificantBits, 16, 1, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_Helper, HW_Flag_InvalidNodeId|HW_Flag_BaseTypeFromFirstArg)
HARDWARE_INTRINSIC(Vector128, Floor, 16, 1, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_Helper, HW_Flag_InvalidNodeId)
HARDWARE_INTRINSIC(Vector128, FusedMultiplyAdd, 16, 3, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_Helper, HW_Flag_InvalidNodeId)
HARDWARE_INTRINSIC(Vector128, GetElement, 16, 2, true, {INS_smov, INS_umov, INS_smov, INS_umov, INS_smov, INS_umov, INS_umov, INS_umov, INS_dup, INS_dup}, HW_Category_Helper, HW_Flag_SpecialImport|HW_Flag_SpecialCodeGen|HW_Flag_BaseTypeFromFirstArg)
HARDWARE_INTRINSIC(Vector128, GetElement, 16, 2, true, {INS_smov, INS_umov, INS_smov, INS_umov, INS_smov, INS_umov, INS_umov, INS_umov, INS_dup, INS_dup}, HW_Category_Helper, HW_Flag_SpecialImport|HW_Flag_SpecialCodeGen|HW_Flag_BaseTypeFromFirstArg|HW_Flag_SupportsContainment)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one thing I noticed that GetElement is not marked as HW_Flag_SupportsContainment for x64 but it still works there 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They’re functionally inverses of each other.

On Arm64 containment is a rare consideration and so is opt-in, while on x64 it’s a common thing for almost every operation and is instead opt-out

@tannergooding tannergooding merged commit a7709e5 into dotnet:main Jun 30, 2024
107 checks passed
@tannergooding tannergooding deleted the arm-contain branch June 30, 2024 13:06
@github-actions github-actions bot locked and limited conversation to collaborators Jul 31, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants