[NativeAOT] Move RequiresAlign8 flag from RareFlags into ExtendedFlags#106010
[NativeAOT] Move RequiresAlign8 flag from RareFlags into ExtendedFlags#106010MichalStrehovsky merged 5 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas |
|
Closing for now, the benchmark is flawed. |
|
Updated the OP with a fixed benchmark and more meaningful numbers. I'll will tinker a bit more with it. |
|
I rewrote the Benchmark results (on Raspberry Pi):
|
|
How do we store |
We don’t. We just check the element type of the array for |
|
If the getter is not supposed to be called on arrays, we can omit the HasComponentSizeFlag tricks from it and just test the single bit. |
|
The |
|
I would expect that string allocations to be always special cased. Where is the |
Presumably you are right. I checked all the code paths and couldn’t find one that would be hit with |
|
This should definitely run through the NativeAOT outerloop tests if the PR looks reasonable in other aspects. |
|
/azp run runtime-nativeaot-outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
A lot of tests on linux-arm crashing with stackoverflow: |
It was a very stupid mistake in the last change. |
|
/azp run runtime-nativeaot-outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Implementation of idea from https://github.com/dotnet/runtime/pull/105931/files#r1704319780.
The benchmark code and results are below. I tested it on Raspberry Pi. The tests were run in succession, CPU temperature was around 48 degrees, so no thermal throttling should be involved.
Predictably, the performance improves for `RequiresAlign8` cases where we don't need to decode the optional fields block. There's about 5% perf hit for non-align8 array allocations and 2% perf hit for non-align8 value type allocations. Some perf hit for arrays was expected, the other code path needs more investigation though as no hit is expected there.Updated benchmark results here: #106010 (comment)
This would make it easier to set the
RequiresAlign8flag for truncated method tables used inGCStaticEETypeNode.Benchmark code:
Benchmark results (on Raspberry Pi):