-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Adding support for the 2-byte VEX encoding to the emitter #21453
Conversation
|
CC. @CarolEidt, @fiigii |
| // | ||
| // Now output VEX prefix leaving the 4-byte opcode | ||
|
|
||
| if ((vexPrefix & 0xFFFF7F80) == 0x00C46100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had initially started doing this by updating AddVexPrefix to support both the 2 and 3-byte encodings and additionally adding a Supports2ByteVexPrefix flag to the instruction metadata in instrsxarch.h. However, that was leading to a cascade of changes and required knowledge of things like: "is the instruction using any extended registers" to be pre-checked and carried around.
I ended up realizing that, at the point we actually emit the encoding, we have all the relevant information and can just make the decision there, so this ended up being significantly simpler to support than initially expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's really great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add several assertions in this 2-byte VEX encoding section? e.g.,
assert(!TakesRexWPrefix(ins, attr));
assert(!IsExtendedReg(reg));There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, we don't have access to the attr or the reg information, so we can't readily do either of these asserts.
I do think it would be useful to have these types of asserts elsewhere, but I think the only place we logically have all the information is in emitDispIns, where we have both the instrDesc and the emitted code bytes (but emitDispIns needs some of its own cleanup as well). I think this would also be a useful place to have asserts like that INS_mov_i2xmm doesn't have an attr of EA_16BYTE (since one register is implied 16-byte and the attr needs to be used to determine if we are 32-bit or 64-bit instead).
- @CarolEidt, thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I think it's a bad idea to have asserts in emitDispIns, at least those that don't directly relate to displaying the instruction. This is because we don't want to have new asserts appearing in a dump or disassembly. Better would be to add an emitCheckIns or something, that can always be invoked on the instruction when DEBUG is set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've logged https://github.com/dotnet/coreclr/issues/21472 to track adding such a function. If no-one has any objections, I think it would be good to handle that in a separate PR (as we would already be in a broken state if the W-bit wasn't set and it should have been at this point, among other checks that emitCheckIns could do).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to handle that in a separate PR
It is okay. I just worried about potential corner case failures, as this is a very fundamental change that impacts all the floating point and SIMD code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is reasonable, and probably preferable, to do it in a separate PR.
|
@dotnet-bot test Windows_NT x64 Checked jitincompletehwintrinsic @dotnet-bot test Windows_NT x86 Checked jitincompletehwintrinsic @dotnet-bot test Ubuntu x64 Checked jitincompletehwintrinsic |
|
|
@dotnet-bot test Windows_NT x64 Checked gcstress0x3 @dotnet-bot test Ubuntu x64 Checked gcstress0x3 |
|
|
|
PR Stress test failures look to be generally the same as appears in the normal stress test runs. A couple of them are HWIntrinsic and should be investigated further |
|
Nice work !! If I understand right this reduces our footprint on customer disks by 200KB? |
CarolEidt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, with a requested comment enhancement.
| // | ||
| // Now output VEX prefix leaving the 4-byte opcode | ||
|
|
||
| if ((vexPrefix & 0xFFFF7F80) == 0x00C46100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's really great!
src/jit/emitxarch.cpp
Outdated
| // (the 2-byte VEX encoding only supports the 0x0F leading byte). When these | ||
| // conditions are met, we can change byte-0 from 0xC4 to 0xC5 and then | ||
| // byte-1 is the logical-or of bit 7 from byte-1 and bits 0-6 from byte 2 | ||
| // from the 3-byte VEX encoding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might move this comment up above the check for the condition, and break down the "magic" constants that it uses so it's clearer that the condition matches the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can do. I'm also going to wait to merge this until after @fiigii can review as well, as he may be able to double-check that I didn't miss some obscurity here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved above and modified to be:
// The 2-byte VEX encoding, requires that the X and B-bits are set (these
// bits are inverted from the REX values so set means off), the W-bit is
// not set (this bit is not inverted), and that the m-mmmm bits are 0-0001
// (the 2-byte VEX encoding only supports the 0x0F leading byte). When these
// conditions are met, we can change byte-0 from 0xC4 to 0xC5 and then
// byte-1 is the logical-or of bit 7 from byte-1 and bits 0-6 from byte 2
// from the 3-byte VEX encoding.
//
// Given the above, the check can be reduced to a simple mask and comparison.
// * 0xFFFF7F80 is a mask that ignores any bits whose value we don't care about:
// * R can be set or unset (0x7F ignores bit 7)
// * vvvv can be any value (0x80 ignores bits 3-6)
// * L and be set or unset (0x80 ignores bit 2)
// * pp can be any value (0x80 ignores bits 0-1)
// * 0x00C46100 is a value that signifies the requirements listed above were met:
// * We must be a three-byte VEX opcode (0x00C4)
// * X and B must be set (0x61 validates bits 5-6)
// * m-mmmm must be 0-00001 (0x61 validates bits 0-4)
// * W must be unset (0x00 validates bit 7)
@danmosemsft Probably not. Crossgen only uses SSE encoding for hardware compatibility, so this work reduces our memory footprint on JITed managed code. |
| // | ||
| // Now output VEX prefix leaving the 4-byte opcode | ||
|
|
||
| if ((vexPrefix & 0xFFFF7F80) == 0x00C46100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I think it's a bad idea to have asserts in emitDispIns, at least those that don't directly relate to displaying the instruction. This is because we don't want to have new asserts appearing in a dump or disassembly. Better would be to add an emitCheckIns or something, that can always be invoked on the instruction when DEBUG is set.
briansull
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks Good
* Adding support for the 2-byte VEX encoding to the emitter * Relocating and expanding the comment which explains the two-byte VEX encoding check
…eclr#21453) * Adding support for the 2-byte VEX encoding to the emitter * Relocating and expanding the comment which explains the two-byte VEX encoding check Commit migrated from dotnet/coreclr@9fe1570
This updates the emitter to support the 2-byte VEX encoding. It does this by one simple change, which is updating
emitOutputRexOrVexPrefixIfNeededto check if all the requirements are met when it tries to emit the prefix.Resolves https://github.com/dotnet/coreclr/issues/19746