# burn-bench

`burn-bench` is a benchmarking repository for Burn. It helps track performance across different hardware and software configurations, making it easier to identify regressions, improvements, and the best backend for a given workload.
- `crates/backend-comparison/`: Benchmarks for backend performance, ranging from individual tensor operations to full forward and backward passes for a given model.
- `crates/burnbench/`: The core benchmarking crate and CLI. Can be used as a standalone tool or integrated as a library to define and run custom benchmark suites.
- (Future) `crates/integration-tests/`: TBD. We'd like to add more tests to capture more complex workloads, including evaluation of model convergence, metrics, and overall training performance.
To run backend performance benchmarks, use the `burnbench` CLI:

```sh
cargo run --release --bin burnbench -- run --benches unary --backends wgpu-fusion
```
Or use the shorthand alias:

```sh
cargo bb run -b unary -B wgpu-fusion
```
This will use the `main` branch of Burn by default. To benchmark performance across one or more versions:

```sh
cargo bb run -b unary -B wgpu-fusion -V 0.18.0 main local
```

You can specify one or more versions and provide custom `burnbench` arguments to benchmark them.
Each version can be one of:

- A published version (e.g., `0.18.0`)
- A Git branch (e.g., `main`)
- A Git commit hash
- `local`
By default, the `local` version points to a relative path for the Burn repo directory (`../../burn` relative to `backend-comparison/`). This can be modified via the `BURN_BENCH_BURN_DIR` environment variable.
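For example, to point the `local` version at a different Burn checkout (the path below is just a placeholder):

```sh
# Use a custom Burn checkout for the `local` version
BURN_BENCH_BURN_DIR=/path/to/burn cargo bb run -b unary -B wgpu-fusion -V local
```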
For detailed instructions, see `crates/burnbench/README.md` and `crates/backend-comparison/README.md`.
Burn supports sharing benchmark results to help users compare hardware and backend performance. Results are published at [burn.dev/benchmarks](https://burn.dev/benchmarks).

To contribute benchmarks, authenticate using:

```sh
cargo run --release --bin burnbench -- auth
```

Then share results with:

```sh
cargo bb run --share --benches unary --backends wgpu-fusion
```
To develop `burn-bench` using your local development stack (including the benchmark server and website), use the alias `cargo bbd` instead of `cargo bb`. This alias builds `burn-bench` in debug mode and automatically points it to local endpoints.
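The alias is used the same way as `cargo bb`, so a local debug run might look like this (assuming the local server and website are already running):

```sh
cargo bbd run -b unary -B wgpu-fusion
```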
You can trigger benchmark execution on demand in a pull request by adding the label `ci:benchmarks`.
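For example, the label can be added from the command line with the GitHub CLI (the PR number below is a placeholder):

```sh
# Add the benchmark label to a pull request
gh pr edit 1234 --add-label "ci:benchmarks"
```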
The parameters passed to `burn-bench` are defined in a `benchmarks.toml` file located at the root of the pull request's repository.
Below is an example of such a file. Most fields are self-explanatory:
```toml
[environment]
gcp_gpu_attached = true
gcp_image_family = "tracel-ci-ubuntu-2404-amd64-nvidia"
gcp_machine_type = "g2-standard-4"
gcp_zone = "us-east1-c"
repo_full = "tracel-ai/burn"
rust_toolchain = "stable"
rust_version = "stable"

[burn-bench]
backends = ["wgpu"]
benches = ["matmul"]
dtypes = ["f32"]
```
The following diagram outlines the sequence of steps involved in executing benchmarks:
```mermaid
sequenceDiagram
    actor Developer
    participant PR as GitHub Pull Request
    participant CI as Tracel CI Server
    participant W as burn-bench Workflow
    participant GCP as Google Cloud Platform
    participant BB as burn-bench Runner
    participant ORG as GitHub Organization

    Developer->>PR: Add label "ci:benchmarks"
    PR-->>CI: 🪝 Webhook "labeled"
    CI->>PR: 💬 "Benchmarks Status (enabled)" 🟢
    CI->>PR: Read file "benchmarks.toml"
    CI->>PR: 💬 Read file error if any (end of sequence) ❌
    CI->>W: Dispatch "burn-bench" workflow
    W-->>CI: 🪝 Webhook "job queued"
    CI->>GCP: 🖥️ Provision GitHub runners
    GCP->>BB: Spawn instances
    BB->>ORG: Register runners
    ORG->>W: Start workflow matrix job (one per machine type)
    W->>W: Write temporary `inputs.json`
    W->>BB: 🔥 Execute benches with `inputs.json`
    BB-->>CI: 🪝 Webhook "started" (first machine only)
    CI->>PR: 💬 "Benchmarks Started"
    BB->>BB: Run benchmarks
    BB-->>CI: 🪝 Webhook "completed" (with data from `inputs.json`)
    CI->>PR: 💬 "Benchmarks Completed" ✅
    Note right of PR: End of sequence

    Developer->>PR: Remove label "ci:benchmarks"
    PR-->>CI: 🪝 Webhook "unlabeled"
    CI->>PR: 💬 "Benchmarks Status (disabled)" 🔴
    Note right of PR: End of sequence

    Developer->>PR: Open pull request with "ci:benchmarks"
    PR-->>CI: 🪝 Webhook "opened"
    CI->>PR: Start sequence at [Read file "benchmarks.toml"]
    Note right of PR: End of sequence

    Developer->>PR: Update code with 🟢
    PR-->>CI: 🪝 Webhook "synchronized"
    CI->>PR: Restart sequence at [Read file "benchmarks.toml"]
    Note right of PR: End of sequence

    Developer->>PR: Merge pull request into main with 🟢
    PR-->>CI: 🪝 Webhook "closed"
    CI->>PR: Start sequence at [Read file "benchmarks.toml"] without the 💬 tasks
    Note right of PR: End of sequence
```
You can also manually execute the [benchmarks.yml workflow][] via the GitHub Actions UI. When triggering it manually, you'll need to fill in the required input fields; each field includes a default value and is self-explanatory.
We welcome contributions to improve benchmarking coverage and add new performance tests.