Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add X86Serialize hardware intrinsic. #68677

Merged
merged 1 commit into from
May 11, 2022

Conversation

anthonycanino
Copy link
Contributor

This PR implements #66467 for RyuJIT.

Some questions to address:

  1. Do we need an X64 ISA for X86Serialize, since there are no X64 specific API/instructions?

@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI new-api-needs-documentation labels Apr 28, 2022
@dotnet-issue-labeler
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Apr 28, 2022
@ghost
Copy link

ghost commented Apr 28, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

This PR implements #66467 for RyuJIT.

Some questions to address:

  1. Do we need an X64 ISA for X86Serialize, since there are no X64 specific API/instructions?
Author: anthonycanino
Assignees: -
Labels:

area-CodeGen-coreclr, new-api-needs-documentation

Milestone: -

@anthonycanino anthonycanino marked this pull request as draft April 28, 2022 20:43
@tannergooding
Copy link
Member

Do we need an X64 ISA for X86Serialize, since there are no X64 specific API/instructions?

Yes. This tracks the corresponding X86Serialize.X64.IsSupported check which while seemingly unnecessary actually allows versioning and correct API usage.

Effectively X86Serialize.X64.IsSupported implies X86Serialize.IsSupported && Environment.Is64BitProcess. This also implies any that base ISAs (just X86Base and X86Base.X64.IsSupported) are also true.

If we didn't expose this class and corresponding "ISA", then X86Serialize.X64.IsSupported would resolve to X86Base.X64.IsSupported and may report true when X86Serialize.IsSupported reports false, leading to UX issues.

@anthonycanino
Copy link
Contributor Author

It looks like the failures are from a permission denied issue. I think that is related to #68715.

The superpmi failures are separate, but I believe that is expected as there are no mch files for a new JIT version id/interface?

Also, this PR implements the serialize instruction for RyuJIT. Is it possible to implement the intrinsic for mono in a follow up PR? @tannergooding @fanyang-mono

@fanyang-mono
Copy link
Member

It looks like the failures are from a permission denied issue. I think that is related to #68715.

The superpmi failures are separate, but I believe that is expected as there are no mch files for a new JIT version id/interface?

Also, this PR implements the serialize instruction for RyuJIT. Is it possible to implement the intrinsic for mono in a follow up PR? @tannergooding @fanyang-mono

Feel free to create an issue to track the need of the Mono implementation of this, if you are not planing to implement it with this PR.

@@ -16275,6 +16275,13 @@ emitter::insExecutionCharacteristics emitter::getInsExecutionCharacteristics(ins
break;
}

case INS_serialize:
{
result.insLatency = PERFSCORE_LATENCY_140C;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tannergooding do you have a recommendation for a perfscore for serialize? serialize will take as long as it takes to flush the instruction pipeline, i.e., its latency/throughput is dependent on the instructions in the pipeline. I can try to get a minimum latency/throughput, or we could try a worse case upper bound?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've based our previous numbers based on the official timing numbers for Skylake (from Intel® 64 and IA-32 Architectures Software Developer Manuals iirc).

This is really just meant to be a rough heuristic and so an estimate or average case should be sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are looking to get these numbers internally. For the time being, I have set both to be 1 cycle. Can we update these values once we have the numbers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, provided an issue is logged tracking it being updated.

Could we set it to something slightly more meaningful than 1 in the meantime? I'd expect this to at least be as expensive as a fencing operation (lfence, mfence, sfence).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mirrored what was done for mfence, and set it to 50 cycles (more expensive than mfence, less expensive than cpuid which is ~100 cycles per the manual.

This seem ok? I will open up an issue now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems great. Thanks!

@anthonycanino
Copy link
Contributor Author

It looks like the failures are from a permission denied issue. I think that is related to #68715.
The superpmi failures are separate, but I believe that is expected as there are no mch files for a new JIT version id/interface?
Also, this PR implements the serialize instruction for RyuJIT. Is it possible to implement the intrinsic for mono in a follow up PR? @tannergooding @fanyang-mono

Feel free to create an issue to track the need of the Mono implementation of this, if you are not planing to implement it with this PR.

Will do. Thanks!.

@anthonycanino
Copy link
Contributor Author

I've created the two issues as discussed and made the changes above. Can I get a full review now (as it was in draft until yesterday)?

@tannergooding
Copy link
Member

CC. @dotnet/jit-contrib for second review.

@tannergooding
Copy link
Member

Going to close and re-open this to restart CI against the latest HEAD. It's been a couple days but it also looks like the issues were unrelated infra issues that had cropped up.

@tannergooding
Copy link
Member

@BruceForstall or @jakobbotsch, is SuperPMI expected to fail in builds with JIT/EE Version changes?

In particular, these are all reporting:

Using JIT/EE Version from mcs: f2a217c4-2a69-4308-99ce-8292c6763776

No Azure Storage MCH files to download from f2a217c4-2a69-4308-99ce-8292c6763776/linux/arm64/

[21:25:01] No MCH files found to replay

If so, this should be good to merge (just wanting to ensure I'm not missing skipping some subtle other issue here).

@jakobbotsch
Copy link
Member

@BruceForstall or @jakobbotsch, is SuperPMI expected to fail in builds with JIT/EE Version changes?

Yes (but probably something we should improve). We should perhaps see if we can skip superpmi-diffs and superpmi-replay on JIT-EE GUID changes.

@BruceForstall
Copy link
Member

superpmi-replay and superpmi-diffs already are attempting to avoid running on JIT-EE GUID changes, but apparently it's not doing what we hoped?

# Only run on changes to the JIT directory. Don't run if the JIT-EE GUID has changed,
# since there won't be any SuperPMI collections with the new GUID until the collection
# pipeline completes after this PR is merged.
pr:
  branches:
    include:
    - main
  paths:
    include:
    - src/coreclr/jit/*
    - src/coreclr/tools/superpmi/*
    exclude:
    - src/coreclr/inc/jiteeversionguid.h

@tannergooding
Copy link
Member

Merging. Will log a tracking issue for the superpmi diffs running even when the JIT/EE version changes

@tannergooding tannergooding merged commit dc7728f into dotnet:main May 11, 2022
@tannergooding
Copy link
Member

Thanks for the contribution @anthonycanino!

@tannergooding
Copy link
Member

Logged #69197

@anthonycanino
Copy link
Contributor Author

Thanks for the reviews @tannergooding, @BruceForstall and @fanyang-mono !

@ghost ghost locked as resolved and limited conversation to collaborators Jun 11, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member new-api-needs-documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants