Description
Today, https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/jit/lsra-heuristic-tuning.md documents that LSRA applies the FREE heuristic before CONST_AVAILABLE. As a result, the allocator cannot reuse an already enregistered constant when the earlier reference was not its "last use".
A simple example is the following where C81 and C82 both represent the CNS_DBL of 0.0:
68.#46 C81 Def ORDER(A) mm0 | |V13 a| |V2 a|V3 a|V0 a|V1 a| | |V10 a|V4 a|V7 a|C81 a| | |V101a| |
70.#47 C82 Def ORDER(A) mm1 | |V13 a| |V2 a|V3 a|V0 a|V1 a| | |V10 a|V4 a|V7 a|C81 a|C82 a| |V101a| |
While these values aren't last use, the last use does happen before the constant is overwritten. In some cases we can end up with 4-5 registers all being initialized to the same constant value, as is the case for Matrix4x4.Decompose. The constant is often zero, so xorps is emitted many times in a row.
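To make the ordering effect concrete, here is a minimal toy model, not the JIT's actual LSRA code. It assumes each heuristic narrows the candidate set only when at least one candidate survives; the function `select_register` and its data structures are hypothetical, and only the heuristic names FREE and CONST_AVAILABLE come from the text above.

```python
# Toy model only: this is NOT the JIT's LSRA implementation.
# Heuristic names follow the issue text; everything else is hypothetical.

def select_register(all_regs, busy, const_in_reg, wanted_const, order):
    """Narrow the candidate set with each heuristic in turn.

    A heuristic's result is kept only if it leaves at least one
    candidate; otherwise the previous candidate set is retained.
    """
    heuristics = {
        # Registers not currently assigned to a live interval.
        "FREE": lambda regs: [r for r in regs if r not in busy],
        # Registers that already hold the constant being defined.
        "CONST_AVAILABLE": lambda regs: [
            r for r in regs if const_in_reg.get(r) == wanted_const
        ],
    }
    candidates = list(all_regs)
    for name in order:
        narrowed = heuristics[name](candidates)
        if narrowed:
            candidates = narrowed
    return candidates[0]


regs = ["mm0", "mm1", "mm2"]
busy = {"mm0"}               # C81 still occupies mm0 (not yet at its last use)
const_in_reg = {"mm0": 0.0}  # mm0 holds the CNS_DBL 0.0

# Documented order: FREE removes mm0 before CONST_AVAILABLE can see it.
documented = select_register(
    regs, busy, const_in_reg, 0.0, ["FREE", "CONST_AVAILABLE"])

# Hypothetical swapped order, for illustration only.
hypothetical = select_register(
    regs, busy, const_in_reg, 0.0, ["CONST_AVAILABLE", "FREE"])

print(documented, hypothetical)
```

With the documented order the model picks a fresh register for C82, matching the dump above; with the swapped order it would pick the register that already holds the constant. The swap is only an illustration of why the ordering matters: a real allocator would also have to make sure the two intervals can safely share that register.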
It would be beneficial if the register allocator could handle this scenario better and keep the constant in a single register. This appears to particularly impact Zero and AllBitsSet, since they are "cheap to compute" and so typically don't undergo CSE.
category:cq
theme:register-allocator