Tool for exploring performance by varying JIT behavior #381
Initial version of a tool that can run BenchmarkDotNet (BDN) over a set of benchmarks in a feedback loop. The tool can vary JIT behavior, observe the impact of that variation on the jitted code or on benchmark perf, and then plan and try further variations in pursuit of some goal (say, higher perf or smaller code).
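At a high level the loop looks roughly like the sketch below. The types and helper names here (`Variation`, `Result`, `RunBenchmark`, `ProposeNext`) are placeholders for illustration, not the tool's actual API.

```csharp
// A minimal sketch of the feedback loop, assuming hypothetical types and
// helpers; the tool's actual classes and method names differ.
using System.Collections.Generic;

record Variation(IReadOnlySet<int> AllowedCses);
record Result(double PerfScore, int CodeSize);

static class ExplorerSketch
{
    // Stand-in for running BDN with the variation applied and extracting
    // code size / perf data from the collected ETW trace.
    static Result RunBenchmark(string benchmark, Variation v) =>
        new Result(PerfScore: 100.0, CodeSize: 512);

    // Stand-in for the planning step: decide which further variations look
    // worth trying given the results observed so far.
    static IEnumerable<Variation> ProposeNext(Result baseline, Result latest, Variation tried)
    {
        yield break;
    }

    public static void Explore(string benchmark, Variation initial)
    {
        Result baseline = RunBenchmark(benchmark, initial);
        var pending = new Queue<Variation>();
        pending.Enqueue(initial);

        while (pending.Count > 0)
        {
            Variation v = pending.Dequeue();
            Result r = RunBenchmark(benchmark, v);
            foreach (Variation next in ProposeNext(baseline, r, v))
                pending.Enqueue(next);
        }
    }
}
```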
Requires access to InstructionsRetiredExplorer as a helper tool for parsing the ETW traces that BDN produces, as well as a local enlistment of the performance repo. You will need to modify file paths within the source to adapt all this to your local setup. The tool must be run with admin privileges so that BDN can collect ETW.
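The paths to edit are along these lines (the field names and values below are purely illustrative; look for the hard-coded paths in the source):

```csharp
// Illustrative only: the actual field names, locations, and defaults in the
// source differ; point these at your local InstructionsRetiredExplorer build
// and performance repo enlistment.
static class LocalPaths
{
    public static string InstructionsRetiredExplorer =
        @"C:\repos\InstructionsRetiredExplorer\artifacts\InstructionsRetiredExplorer.exe";
    public static string PerformanceRepo = @"C:\repos\performance";
}
```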
The only variation supported right now is modifying which CSEs the JIT is allowed to perform in the hottest Tier-1 method of each benchmark. If a benchmark does not have a sufficiently hot Tier-1 method, it is effectively left out of the experiment.
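One way such a variation can be communicated to the JIT is via config environment variables set on the child BDN process. The knob name and value format below are assumptions for illustration; check the tool source for the variables it actually sets.

```csharp
// Hedged illustration: build the environment for the benchmark run.
// "DOTNET_JitReplayCSE" and the comma-separated index format are assumptions
// about how a CSE subset might be passed to the JIT, not confirmed by this PR.
using System.Collections.Generic;

static class CseVariation
{
    public static Dictionary<string, string> BuildEnvironment(IEnumerable<int> allowedCses)
    {
        // e.g. "1,3,4" -- allow only these CSE candidates in the target method
        string mask = string.Join(",", allowedCses);
        return new Dictionary<string, string>
        {
            ["DOTNET_JitReplayCSE"] = mask,
        };
    }
}
```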
The experiments on each benchmark are prioritized to explore how performance varies across subsets of the currently performed CSEs. For methods with many CSEs we can realistically afford to explore only a small fraction of all possible subsets, so we try to bias the exploration toward CSEs that have higher performance impact; see the sketch below.
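The following is a sketch of the prioritization idea, not the tool's actual heuristic: rank individual CSEs by the perf delta observed when each is toggled on its own, then build candidate subsets around the highest-impact ones first.

```csharp
// Sketch only: order candidate CSE subsets so that high-impact CSEs are
// revisited in many of the (few) subsets we can afford to measure.
using System;
using System.Collections.Generic;
using System.Linq;

static class Prioritization
{
    public static IEnumerable<int[]> OrderedSubsets(
        IReadOnlyDictionary<int, double> perfImpactByCse)
    {
        int[] ranked = perfImpactByCse
            .OrderByDescending(kvp => Math.Abs(kvp.Value))
            .Select(kvp => kvp.Key)
            .ToArray();

        // Emit progressively larger prefixes of the ranked list, so the most
        // impactful CSEs appear in many of the subsets we actually try.
        for (int size = 1; size <= ranked.Length; size++)
            yield return ranked.Take(size).ToArray();
    }
}
```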
Results are locally cached so that rerunning the tool will not rerun experiments.
Experiments are summarized in a CSV file whose schema lists benchmark name, number of CSEs, code size, perf score, and perf.
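A row per experiment might look like the following; the column names and values here are illustrative, and the exact header the tool writes may differ.

```
Benchmark,NumCse,CodeSize,PerfScore,Perf
<benchmark name>,3,412,1500.25,1.84e-04
```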