POWER10 adds the ability to perform PC-relative loads and stores. This renders the pTOC (where we previously stored floating-point and 64-bit constants) redundant, since all pTOC loads could theoretically be replaced with PC-relative loads. The current high-level plan for the short term is to change the sequence used to load a value from a ConstantDataSnippet so that it uses a prefixed PC-relative load, which is patched with the correct displacement during the EmitSnippets phase. Anything that already uses ConstantDataSnippet then gets improved performance for free, and any location that would previously have used a pTOC load can use ConstantDataSnippet instead when running on POWER10.
We can further improve the situation by using the 34-bit displacement field of a prefixed load/store instruction (with a constant base of 0) to encode an address directly into the load/store when the address is statically known to be small enough to fit. This eliminates the extra layer of indirection entirely, which should benefit performance.
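As a rough illustration of the patching step, the sketch below computes the displacement from a prefixed instruction to its constant and writes it into the two instruction words. It assumes the prefixed D-form split of the 34-bit displacement into an 18-bit high part in the prefix word and a 16-bit low part in the suffix word, and it assumes the displacement is relative to the address of the prefix word. The function name and the host-order word representation are hypothetical and are not OMR APIs; real code would also have to handle target endianness when writing to the code buffer.

```cpp
#include <cstdint>

// Signed 34-bit displacement range of a prefixed load/store.
constexpr int64_t kMinDisp34 = -(int64_t(1) << 33);
constexpr int64_t kMaxDisp34 = (int64_t(1) << 33) - 1;

// Hypothetical helper (not an OMR API): patches the displacement of an
// already-emitted prefixed D-form load so that it addresses dataAddr.
// prefixWord and suffixWord hold the two instruction words in host order.
// Returns false, leaving both words untouched, if the displacement does
// not fit in a signed 34-bit field.
bool patchPrefixedDisplacement(uint32_t &prefixWord, uint32_t &suffixWord,
                               uint64_t instrAddr, uint64_t dataAddr)
{
    // Assumption: the displacement is measured from the prefix word.
    const int64_t disp = static_cast<int64_t>(dataAddr - instrAddr);
    if (disp < kMinDisp34 || disp > kMaxDisp34)
        return false;

    // Keep the low 34 bits of the two's-complement displacement.
    const uint64_t bits = static_cast<uint64_t>(disp) & ((uint64_t(1) << 34) - 1);

    // High 18 bits go into the low 18 bits of the prefix word,
    // low 16 bits go into the low 16 bits of the suffix word.
    prefixWord = (prefixWord & ~0x3FFFFu) | static_cast<uint32_t>(bits >> 16);
    suffixWord = (suffixWord & ~0xFFFFu) | static_cast<uint32_t>(bits & 0xFFFFu);
    return true;
}
```

A snippet emitter could call something like this once the final address of the constant is known, falling back to another sequence if the function returns false.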
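The "small enough to fit" condition can be sketched as a range check. With a base of 0, the effective address is the sign-extended 34-bit displacement, so the address must be representable as a signed 34-bit value. The enum and function names below are hypothetical and only illustrate the decision; they are not part of MemoryReference.

```cpp
#include <cstdint>

// True if value is representable as a signed 34-bit integer.
constexpr bool fitsInSigned34(int64_t value)
{
    return value >= -(int64_t(1) << 33) && value < (int64_t(1) << 33);
}

// Hypothetical strategy selector, for illustration only.
enum class StaticAccess { DirectAddress, ConstantDataSnippet };

// Chooses how a statically known address could be accessed on POWER10:
// encode it directly in the displacement field when it fits, otherwise
// load it through a ConstantDataSnippet.
constexpr StaticAccess chooseStaticAccess(uint64_t address)
{
    return fitsInSigned34(static_cast<int64_t>(address))
               ? StaticAccess::DirectAddress
               : StaticAccess::ConstantDataSnippet;
}
```

Under these assumptions, any address below 2^33 (8 GiB) qualifies for the direct form, as does any 64-bit address in the top 8 GiB, since it sign-extends back to itself.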
In order to accomplish this, the following changes will need to be made:
- Implement binary encoding and relocation for PC-relative loads/stores
- Implement exploitation of prefixed loads/stores in MemoryReference::expandInstruction
- Disable out-of-range displacement handling code in MemoryReference::addToOffset on P10
- Improve ConstantDataSnippet load sequence
- Switch to emitting PC-relative loads when using ConstantDataSnippet on P10
- Implement patching of PC-relative load displacements in ConstantDataSnippet
- Modify MemoryReference::accessStaticItem to use ConstantDataSnippet instead of pTOC loads on P10
- Add a special case to MemoryReference::accessStaticItem to use the 34-bit displacement field for statically-known small addresses
- Find and improve misc. locations that access the pTOC directly via TR_PPCTableOfConstants, e.g. [1]
- Completely disable the pTOC on P10 (free register, revamp trampolines, etc.)
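For the binary encoding item, a minimal sketch of encoding a PC-relative `pld` might look like the following. The field layout in the comments is an assumption based on the Power ISA 3.1 description of prefixed 8LS:D-form instructions and should be checked against the ISA before use. The function name is hypothetical, and the caller is assumed to have already verified that the displacement fits in 34 bits.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper (not an OMR API): encodes `pld RT, disp(0), 1`,
// i.e. a PC-relative 64-bit load, as a {prefix, suffix} word pair.
//
// Assumed layout (IBM bit numbering, bit 0 is the most significant bit):
//   prefix: primary opcode 1 in bits 0-5, type 0 (8LS) in bits 6-7,
//           R in bit 11, d0 (high 18 displacement bits) in bits 14-31
//   suffix: opcode 57 in bits 0-5, RT in bits 6-10, RA in bits 11-15,
//           d1 (low 16 displacement bits) in bits 16-31
// RA is left as 0 because the PC-relative form (R = 1) requires it.
std::pair<uint32_t, uint32_t> encodePCRelativePLD(unsigned rt, int64_t disp)
{
    // Low 34 bits of the two's-complement displacement.
    const uint64_t bits = static_cast<uint64_t>(disp) & ((uint64_t(1) << 34) - 1);

    const uint32_t prefix = (1u << 26)                          // primary opcode 1
                          | (1u << 20)                          // R = 1 (PC-relative)
                          | static_cast<uint32_t>(bits >> 16);  // d0
    const uint32_t suffix = (57u << 26)                         // pld suffix opcode
                          | ((rt & 0x1Fu) << 21)                // RT
                          | static_cast<uint32_t>(bits & 0xFFFFu); // d1
    return {prefix, suffix};
}
```

In the plan above, an instruction like this would first be emitted with a placeholder displacement and then patched once the snippet address is known.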
[1] https://github.com/eclipse/omr/blob/31fae2884bc015929925dcaeb6f7018e851fdf4a/compiler/p/codegen/ControlFlowEvaluator.cpp#L2996