-
Notifications
You must be signed in to change notification settings - Fork 5.1k
Reduce TP for targets with more than 64 Registers Part 2 : Make operations on fixedRegs less expensive #113713
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
df9d60f
to
a91ef0b
Compare
56dc88c
to
134eea5
Compare
030b684
to
d2b16f9
Compare
d2b16f9
to
6db2ad5
Compare
97ea8dd
to
477aa53
Compare
477aa53
to
dc1d20a
Compare
@BruceForstall @kunalspathak This PR is ready for review. |
odd superpmi-replay failure. trying again. /azp run runtime-coreclr superpmi-replay |
/azp run runtime-coreclr superpmi-replay |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Nice TP wins!
/ba-g failures are timeout |
Rationale for this change
The reason why
updateNextFixedRef
is expensive is due to us repeatedly working onfixedRegs
which is aregMaskTP
. This change splitsfixedRegs
into 2SingleTypeRegSet
variablesfixedRegsLow
andfixedRegsHigh
. This necessitates some changes to how we call the impacted methods. ImplementingupdateNextFixedRef
as a templated method allows us to have 2 separate versions of this method without adding duplicate methodsChanges included here
fixedRegs
is split into 2 separateSingleTypeRegSet
on targets with more than 64 registersupdateNextFixedRef
implementation that works on the newly createdSingleTypeRegSet
updateNextFixedRefDispatch
which calls the requiredupdateNextFixedRef
based on register number.Impact of this change
This change should have minimal to no impact on targets with less than 64 registers. Local testing done with additional eGPR enabled(to simulate more than 64 registers) show the following results
Overall (-0.35% to -0.12%)
MinOpts (-1.09% to -0.31%)
FullOpts (-0.32% to -0.12%)
Will help with reducing regression in 'procesKills()` - #112962