Summary
Add performance benchmarks as a type of test that measures changes in performance.
Background and Motivation
Pass/fail is an inadequate way to measure the quality of a system. Some quality requirements, such as API response time, must be met and maintained for every software release. Even where there are no strict quality requirements, it is useful to know whether a change has significantly impacted the performance of a feature (positively or negatively).
- It's common for application performance to degrade over time because we lack the tools to adequately measure the performance of software.
- Proper benchmarking requires specialized tools and knowledge.
- Performance problems may not be identified immediately, requiring forensic analysis of source control to determine the cause. We lack ways to "fail fast" with purely qualitative (pass/fail) tests.
Proposed Feature
- A new test type focused on measuring performance (a hypothetical sketch of what this could look like follows this list)
- A tool to create a "baseline" against which to compare changes.
- Baselines should be specific to an environment & version (e.g. a tag or commit)
- Support for different environments (local, dev, staging, QA)
- Historical tracking - important for catching performance creep that doesn't trigger alarms release-to-release.
- Reporting
- Measurements (benchmark results) which can be tracked in source control
- Configurable benchmark-focused optimizations, such as warmup iterations
- Resource tracking (memory, CPU usage, I/O)
- Configurable thresholds for warnings & failures (a static error percentage, or based on standard deviation)
- Micro-environment (VM) configuration (Mono, server GC, workstation GC)
- Macro-environment (OS host) configuration (Linux, Windows, remote deployments)
- Configurable concurrency
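To make the shape of the proposal concrete, here is a minimal sketch of how such a test might be written. Every attribute and parameter shown ([PerformanceTest], WarmupIterations, MeasuredIterations, MaxRegressionPercent, BaselineFile) is hypothetical and does not exist in MSTest today; the sketch only illustrates how warmup, iteration counts, baselines, and failure thresholds from the list above might surface to a test author.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class SearchPerformanceTests
{
    // Hypothetical attribute: runs the body repeatedly, discards warmup
    // iterations, records timings and resource usage, and compares the
    // result against a checked-in baseline, failing the test if the
    // regression threshold is exceeded.
    [PerformanceTest(
        WarmupIterations = 5,
        MeasuredIterations = 50,
        MaxRegressionPercent = 10,
        BaselineFile = "baselines/search.dev.json")] // baseline per environment & version
    public void Search_ByKeyword_StaysWithinBaseline()
    {
        var index = SearchIndex.Load("fixtures/small-index"); // hypothetical system under test
        var results = index.Query("keyword");                 // only the measured body is timed

        Assert.IsTrue(results.Count > 0); // functional assertions still apply
    }
}
```

The checked-in baseline file would hold the measurements mentioned above (benchmark results tracked in source control), keyed by environment and version, so release-to-release comparison and long-term historical tracking both fall out of ordinary source history.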
Alternative Designs
- BenchmarkDotNet has implemented a lot of the heavy lifting for benchmarking (see the sketch after this list); however, it lacks a proper framework for automation. MSTest seems like the best way to bring this into the standard workflow.
- Most profiling tools exist exclusively in the CI/CD pipeline. As with unit tests, performance tests in the pipeline should function as a backstop guarantee, but validation should already have been performed by a developer first.
- Most CI/CD tools measure performance under load or stress; these are complex tests. Simple checks (essentially unit tests, but for performance) would be a more pragmatic way of tracking performance changes over time.
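For comparison, this is roughly what the equivalent measurement looks like with BenchmarkDotNet today. The attributes and runner shown are real BenchmarkDotNet APIs, but the benchmark itself is a made-up example. Warmup, baselines, and memory diagnostics are all supported; what is missing is a first-class way to run this inside an ordinary test run and fail a build on regression against a stored baseline, which is the gap this proposal targets.

```csharp
using System.Security.Cryptography;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]            // adds allocation columns to the report
[SimpleJob(warmupCount: 3)]  // benchmark-focused warmup, as proposed above
public class HashingBenchmarks
{
    private readonly byte[] data = new byte[64 * 1024];

    [Benchmark(Baseline = true)]  // other results are reported relative to this one
    public byte[] Sha256() => SHA256.HashData(data);

    [Benchmark]
    public byte[] Md5() => MD5.HashData(data);
}

public static class Program
{
    // Typically invoked from a dedicated console project rather than a
    // test runner, which is exactly the workflow gap described above.
    public static void Main() => BenchmarkRunner.Run<HashingBenchmarks>();
}
```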