Skip to content

Conversation

@tannergooding
Copy link
Member

I was working on a emitValidateIns method (debug-only, run as part of emitDispIns) which verifies that the various instrDesc we encounter are "correct". As part of this, I found a few places where things are currently incorrect and should be updated. -- The emitValidateIns will go up eventually, but as its own PR given size/complexity and because there are still more validation that needs to be done.

In particular:

  • Several instructions were not going down the VEX path and were giving potentially incorrect disassembly
  • Several instructions were not passing down a valid emitAttr (they passed down a scalar size to a vector instruction)
  • Several instructions were using an incorrect insFormat

Most of these were straightfoward fixes. However, we also had but were not using some "scheduling" information that was defined as part of the emitfmt data. I added functions to be able to query that information and utilize it in a few places rather than checking giant switch tables. -- There are more places this could be used as well, such as in the GC liveness update checks and to remove code duplication in places like emitDispIns. However, those can be done in their own PRs.

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 28, 2023
@ghost ghost assigned tannergooding Apr 28, 2023
@ghost
Copy link

ghost commented Apr 28, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

I was working on a emitValidateIns method (debug-only, run as part of emitDispIns) which verifies that the various instrDesc we encounter are "correct". As part of this, I found a few places where things are currently incorrect and should be updated. -- The emitValidateIns will go up eventually, but as its own PR given size/complexity and because there are still more validation that needs to be done.

In particular:

  • Several instructions were not going down the VEX path and were giving potentially incorrect disassembly
  • Several instructions were not passing down a valid emitAttr (they passed down a scalar size to a vector instruction)
  • Several instructions were using an incorrect insFormat

Most of these were straightfoward fixes. However, we also had but were not using some "scheduling" information that was defined as part of the emitfmt data. I added functions to be able to query that information and utilize it in a few places rather than checking giant switch tables. -- There are more places this could be used as well, such as in the GC liveness update checks and to remove code duplication in places like emitDispIns. However, those can be done in their own PRs.

Author: tannergooding
Assignees: tannergooding
Labels:

area-CodeGen-coreclr

Milestone: -

Comment on lines -198 to -199
Copy link
Member Author

@tannergooding tannergooding Apr 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EVEX versions of these instructions produce a K register, not a XMM/YMM/ZMM register.

Since they need entirely different handling (and cannot be freely converted between the two forms), they need to be their own instructions.

Comment on lines -338 to -339
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

b/w instructions are typically FULL_MEM while d/q are typically FULL (with a few exceptions)

In this case, they should've been FULL

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The various test instructions don't mutate either input; the operate in a temp and set flags.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FMA instructions are all RMW

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EVEX only instructions don't need to specify REX_W1_EVEX since they don't have a different encoding for non-EVEX.

They can just specify REX_W1 directly and go down the faster path

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRC32 is an RMW instruction

@tannergooding tannergooding marked this pull request as ready for review April 30, 2023 22:58
@tannergooding
Copy link
Member Author

CC. @dotnet/jit-contrib

Diffs

We see good TP wins of up to -0.20% in minopts and up to -0.03% in fullopts.

We also see some smaller assembly both from consistent use of the VEX aware path and from using the correct insFormat for various instructions as it allows various instruction peepholes to light-up better.

@tannergooding
Copy link
Member Author

/azp run runtime-coreclr jitstress-isas-x86, Fuzzlyn

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@tannergooding
Copy link
Member Author

tannergooding commented May 1, 2023

Fuzzlyn failure is pre-existing and repros on .NET 8 Preview 3: #85602

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes make sense to me assuming CI is green

@tannergooding
Copy link
Member Author

Merged in dotnet/main to resolve the conflicts caused by #85594

@tannergooding
Copy link
Member Author

/azp run runtime-coreclr jitstress-isas-x86, Fuzzlyn

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@tannergooding
Copy link
Member Author

JitStress failures are #85608

@sebastienros
Copy link
Member

Do you think this PR could explain this improvement?

@tannergooding
Copy link
Member Author

It's possible, but hard to say for certain without seeing the codegen.

This fixed up a number of code paths dealing with vectors to be VEX aware and emit more optimal codegen.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants