JIT: Add BinaryTrees benchmark variant #8867
Add a variant that uses a class and out parameter instead of returning a struct by value. This variant is similar to version 3 from the benchmarks game site, but with validation added and parallelism removed. See related analysis in #8837. According to xunit-perf runs, enabling the model inlining policy improves this version's performance by about 10%. When the model policy is enabled, the inliner will inline the two outermost calls to `ChildTreeNodes` in the innermost loop. Also, make sure the new and the original versions build the same way in release and debug.
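For readers unfamiliar with the two shapes being compared, here is a minimal sketch of the difference between returning child nodes in a struct by value and writing them through out parameters. All names (`Node`, `NodePair`, `ChildrenByValue`, `ChildrenByOut`) are hypothetical and chosen for illustration; this is not the benchmark source added by this PR, and it says nothing about what the inliner does with either form.

```csharp
using System;

// Hypothetical helper struct: carries both children back by value.
struct NodePair
{
    public Node Left, Right;
}

// Hypothetical tree node; fields are public so they can be passed as out arguments.
sealed class Node
{
    public Node Left, Right;

    // Style A: children are returned in a struct by value.
    public static NodePair ChildrenByValue(int depth)
    {
        NodePair pair;
        pair.Left = BuildByValue(depth);
        pair.Right = BuildByValue(depth);
        return pair;
    }

    public static Node BuildByValue(int depth)
    {
        var node = new Node();
        if (depth > 0)
        {
            NodePair pair = ChildrenByValue(depth - 1);
            node.Left = pair.Left;
            node.Right = pair.Right;
        }
        return node;
    }

    // Style B: children are written directly through out parameters.
    public static void ChildrenByOut(int depth, out Node left, out Node right)
    {
        left = BuildByOut(depth);
        right = BuildByOut(depth);
    }

    public static Node BuildByOut(int depth)
    {
        var node = new Node();
        if (depth > 0)
        {
            ChildrenByOut(depth - 1, out node.Left, out node.Right);
        }
        return node;
    }

    // Simple validation: count the nodes in the tree.
    public int Count()
    {
        return Left == null ? 1 : 1 + Left.Count() + Right.Count();
    }
}

static class Program
{
    static void Main()
    {
        const int depth = 4;
        int expected = (1 << (depth + 1)) - 1; // node count of a full binary tree

        int byValue = Node.BuildByValue(depth).Count();
        int byOut = Node.BuildByOut(depth).Count();

        Console.WriteLine($"by value: {byValue}, by out: {byOut}, expected: {expected}");
    }
}
```

Both styles build the same tree, so the node count is a cheap check that a rewrite from one form to the other did not change behavior.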
@sivarv @dotnet/jit-contrib PTAL. Siva, over in #8837 you didn't mention the relative perf impact from the model policy. Does 10% seem about right? I didn't see any difference in GC counts, but I haven't looked at GC times yet.
I have measured benchmark numbers with the model policy on desktop. Here are the numbers on my machine: Jit64: 12880. In percentage terms it is 8.85%.
A 10% improvement with the model policy seems about right.
Looks good.
Another point: on desktop, the cqPerf script exercises this benchmark with a depth argument of 20.
Correction: the model policy benefit is more like 6% at depth 16. If I increase the depth to 20, the benefit looks similar to what you measured on desktop.
Retrying the ARM legs, which failed more or less immediately... @dotnet-bot retest Linux ARM Emulator Cross Debug Build
@dotnet-bot retest Linux ARM Emulator Cross Debug Build
Am going to close/reopen and see if that cleans up the ARM CI...
Trying to use this to validate #8875; evidently there's a time lag between when netci.groovy changes and when the jobs get updated. Will retry in a few minutes.
@dotnet-bot retest Linux ARM Emulator Cross Debug Build
…riant JIT: Add BinaryTrees benchmark variant Commit migrated from dotnet/coreclr@ec54d18
Add a variant that uses a class and out parameter instead of returning
a struct by value. This variant is similar to version 3 from the benchmarks
game site, but with validation added and parallelism removed.
See related analysis in #8837. According to xunit-perf runs, this version's
performance is improved (~10%) by enabling the model inlining policy. When
the model policy is enabled the inliner will inline the two outermost calls
to `ChildTreeNodes` in the innermost loop.

Also, make sure the new and the original versions build the same way in
release and debug.