Precision historical replay for Solana program upgrades.
Caliper turns historical Penrose/Delorean transaction fixtures into a locked, repeatable release gate. It loads the exact pre-state into LiteSVM, optionally patches a program ELF, executes the historical transaction, and compares status, post-accounts, return data, and compute units.
The important invariant is two-phase replay: every run first proves the locked historical fixture still replays exactly. Candidate execution only starts after that harness parity check passes.
cargo install --path .
caliper --help
Or run directly from the checkout:
cargo run -- --help
The intended operator flow is:
export CALIPER_DELOREAN_RPC_URL='<org-delorean-rpc-url>'
cargo run -- check caliper.example.toml
cargo run -- lock caliper.example.toml --output .caliper/sample.lock.json
cargo run -- check caliper.example.toml --lock .caliper/sample.lock.json
cargo run -- run caliper.example.toml \
--lock .caliper/sample.lock.json \
--output .caliper/sample.run.json
If the service gives the RPC origin and key separately, use:
export CALIPER_DELOREAN_RPC_BASE_URL='<org-delorean-rpc-base-url>'
export CALIPER_DELOREAN_RPC_KEY='<org-delorean-rpc-key>'
The older CANARY_DELOREAN_RPC_* names are still accepted as compatibility
aliases, but new setups should use CALIPER_DELOREAN_RPC_*.
Do not commit the credential-bearing RPC URL. Caliper redacts endpoint paths in
new lockfiles, reports, and terminal output. Keep .caliper/ local; it contains
fixtures, exported programs, run reports, and other generated artifacts.
caliper lock resolves the corpus and writes a lockfile with:
- the corpus hash, including the Penrose endpoint and corpus selector;
- each transaction signature and slot;
- each raw Penrose fixture hash;
- discriminator coverage for discovered corpora.
caliper run refuses to use a lockfile built from a different corpus. The CU
policy and candidate artifact can change without rebuilding the corpus, which
lets the same locked fixtures test strict, budgeted, or explicit A/B policies.
The corpus hash uses the redacted endpoint identity, so rotating a path token
does not force a new lock for the same service origin.
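The redaction idea above can be illustrated with a small sketch. It is not Caliper's implementation: the function name `endpoint_identity` and the exact rule (keep scheme, host, and port; drop userinfo, path, query, and fragment) are assumptions made for illustration.

```rust
/// Hypothetical helper: reduce an RPC URL to a credential-free identity.
/// Keeps scheme + host[:port]; drops userinfo, path, query, and fragment.
pub fn endpoint_identity(url: &str) -> String {
    let (scheme, rest) = match url.split_once("://") {
        Some((scheme, rest)) => (scheme, rest),
        None => ("", url),
    };
    // The authority ends at the first '/', '?' or '#'.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Drop any "user:password@" prefix.
    let host = authority.rsplit('@').next().unwrap_or(authority);
    if scheme.is_empty() {
        host.to_string()
    } else {
        format!("{}://{}", scheme, host)
    }
}
```

Under this rule, two URLs that differ only in a path token map to the same identity, which is the property that lets a token rotation reuse an existing lock.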
Every run starts with a historical fixture replay. If the fixture does not replay exactly, Caliper refuses candidate execution because the harness has not proved parity for that corpus.
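A minimal sketch of that ordering, assuming a simplified `Outcome` type and caller-supplied replay closures. All names here are hypothetical and do not reflect Caliper's internal API.

```rust
/// Illustrative replay outcome; the real comparison also covers post-accounts.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Outcome {
    pub success: bool,
    pub compute_units: u64,
    pub return_data: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GateError {
    /// The unmodified fixture did not reproduce its recorded outcome.
    HarnessParity,
    /// The candidate diverged from the recorded outcome.
    CandidateMismatch,
}

/// Phase 1 replays the untouched fixture. Phase 2 (the candidate) is only
/// invoked when phase 1 reproduced the expected outcome exactly.
pub fn two_phase_gate<H, C>(
    expected: &Outcome,
    historical: H,
    candidate: C,
) -> Result<(), GateError>
where
    H: FnOnce() -> Outcome,
    C: FnOnce() -> Outcome,
{
    if historical() != *expected {
        return Err(GateError::HarnessParity);
    }
    if candidate() != *expected {
        return Err(GateError::CandidateMismatch);
    }
    Ok(())
}
```

Passing the candidate as a closure makes the guarantee structural: when parity fails, the candidate code path is never entered.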
Candidate modes:
[program]
id = "pAMMBay6oceH9fJKBRHGP5D4bD4sWpmSwMn52FMfXEA"
current = true
dump_dir = ".caliper/programs/current"
current = true extracts the fixture-local historical ELF, dumps it, patches it
back into the same fixture, and proves the replacement path is lossless. This is
the right first self-test for a new program corpus.
[program]
id = "pAMMBay6oceH9fJKBRHGP5D4bD4sWpmSwMn52FMfXEA"
candidate_elf = "target/deploy/candidate.so"
candidate_elf swaps the locked fixture's deployed program for a candidate
artifact and compares against the historical fixture expectation. Candidate and
baseline paths must point at real ELF files; check fails before a run if an
artifact is missing or does not have ELF magic bytes.
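The magic-byte test is simple to picture. The sketch below shows one way to write it with the standard library; the function names are hypothetical and the real check may validate more of the ELF header.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Every ELF file starts with 0x7f followed by the ASCII bytes "ELF".
pub const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];

/// True when the buffer begins with the ELF magic bytes.
pub fn has_elf_magic(bytes: &[u8]) -> bool {
    bytes.starts_with(&ELF_MAGIC)
}

/// Reads only the first four bytes of the artifact.
/// A missing file surfaces as an I/O error; a short file is simply not ELF.
pub fn check_artifact(path: &Path) -> io::Result<bool> {
    let mut header = [0u8; 4];
    let mut file = File::open(path)?;
    match file.read_exact(&mut header) {
        Ok(()) => Ok(has_elf_magic(&header)),
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}
```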
[program]
id = "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"
baseline_elf = ".caliper/programs/tokenkeg.spl-token-3.5.0.so"
candidate_elf = ".caliper/programs/tokenkeg.p-token.so"
baseline_elf + candidate_elf runs a three-phase A/B gate: historical fixture
parity, explicit baseline replay, then candidate replay. Candidate compute units
are compared against the explicit baseline program, while state, status, and
return data still have to match the fixture.
For A/B runs, CU policies apply to the replaced program's summed log-reported
CU, not whole-transaction CU. This keeps program performance claims from being
diluted by unrelated instructions in the same historical transaction.
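As a rough illustration of per-program accounting, the sketch below sums the `Program <id> consumed <n> of <m> compute units` lines that the Solana runtime writes to transaction logs. The function name is hypothetical, and whether Caliper parses logs in exactly this way is an assumption.

```rust
/// Sum the compute units a single program reported in transaction logs.
/// Matches lines of the form:
///   "Program <id> consumed <n> of <m> compute units"
pub fn program_consumed_units(logs: &[&str], program_id: &str) -> u64 {
    logs.iter()
        .filter_map(|line| {
            let rest = line.strip_prefix("Program ")?;
            let (id, tail) = rest.split_once(' ')?;
            if id != program_id {
                return None;
            }
            let tail = tail.strip_prefix("consumed ")?;
            let (units, _) = tail.split_once(' ')?;
            units.parse::<u64>().ok()
        })
        .sum()
}
```

One caveat of this naive sum: a program's "consumed" line includes the cost of any CPI calls it makes, so a program that invokes itself through CPI would be counted more than once.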
explicit is the simplest and safest source:
[corpus]
source = "explicit"
[[corpus.transactions]]
signature = "WRM4dvtB32k261TTsn2nc4ini9VvWrB1NBLCQdtZk1FRycp54VTuaDikufLk1E2MANthZjQQS6skzeis2W3bvNA"
label = "reference-buy"
program-top-level discovers transactions with a normal RPC pass before it
fetches Penrose fixtures:
- page getSignaturesForAddress(program_id);
- fetch normal RPC getTransaction in base64 form;
- decode top-level message instructions, including v0 loaded addresses;
- lock a Penrose fixture only when the signature fills a requested bucket.
This avoids fixture fetches for CPI-only or unrelated transactions. It is still
an RPC history scan, so high-volume programs should specify discriminators and
use before/until bounds for the target upgrade era.
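The bucket-filling step of discovery can be sketched as follows. This is an illustration under assumptions, not Caliper's code: the `Buckets` type is hypothetical, discriminators are treated as byte prefixes of instruction data, and the caller is assumed to pass only top-level instructions that target the program under test.

```rust
use std::collections::HashMap;

/// Tracks how many more transactions each discriminator bucket still needs.
pub struct Buckets {
    remaining: HashMap<Vec<u8>, usize>,
}

impl Buckets {
    /// `requested` pairs a discriminator prefix with the number of samples wanted.
    pub fn new(requested: &[(&[u8], usize)]) -> Self {
        let remaining = requested.iter().map(|(d, n)| (d.to_vec(), *n)).collect();
        Self { remaining }
    }

    /// Offer one transaction's top-level instruction data for the target program.
    /// Returns true when it fills at least one open bucket, i.e. when fetching
    /// and locking its fixture is worthwhile.
    pub fn offer(&mut self, instruction_data: &[&[u8]]) -> bool {
        let mut filled = false;
        for data in instruction_data {
            for (disc, left) in self.remaining.iter_mut() {
                if *left > 0 && data.starts_with(disc) {
                    *left -= 1;
                    filled = true;
                }
            }
        }
        filled
    }

    /// True once every requested bucket has been filled.
    pub fn is_complete(&self) -> bool {
        self.remaining.values().all(|n| *n == 0)
    }
}
```

Because `offer` runs on data from the ordinary RPC pass, transactions that fill no bucket never trigger a fixture fetch.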
By default, discovered corpora must fill every requested discriminator bucket. For smoke tests or exploratory scans, set:
[corpus]
allow_incomplete = true
Compute units are exact by default. Candidate specs can choose an explicit budget:
[oracle.compute_units]
mode = "max-increase"
max_increase = 10
or an explicit performance target:
[oracle.compute_units]
mode = "max-ratio"
numerator = 1
denominator = 10
For A/B runs, max-ratio = 1/10 means the candidate program must consume at
most 10% of the explicit baseline program's log-reported compute units for every
locked transaction.
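The three policies can be modelled as a small enum. The sketch below is illustrative only: the type and method names are hypothetical, and it assumes max_increase is an absolute number of compute units, which the text above does not state.

```rust
/// Hypothetical model of the compute-unit policies described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CuPolicy {
    /// Candidate must match the reference exactly (the default).
    Exact,
    /// Candidate may exceed the reference by at most `max_increase` units
    /// (assumed here to be an absolute count, not a percentage).
    MaxIncrease { max_increase: u64 },
    /// Candidate must satisfy candidate / reference <= numerator / denominator.
    MaxRatio { numerator: u64, denominator: u64 },
}

impl CuPolicy {
    pub fn allows(self, reference: u64, candidate: u64) -> bool {
        match self {
            CuPolicy::Exact => candidate == reference,
            CuPolicy::MaxIncrease { max_increase } => {
                candidate <= reference.saturating_add(max_increase)
            }
            CuPolicy::MaxRatio { numerator, denominator } => {
                // Cross-multiply in u128 to avoid division and overflow.
                denominator != 0
                    && (candidate as u128) * (denominator as u128)
                        <= (reference as u128) * (numerator as u128)
            }
        }
    }
}
```

With a 1/10 ratio and a baseline of 1000 units, a candidate at 100 units passes and one at 101 units fails.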
cargo run -- check <SPEC> --lock <LOCK>
cargo run -- inspect <LOCK> --json .caliper/inspect.json
cargo run -- replay <SIGNATURE> --jsonl .caliper/replay.jsonl
cargo run -- export-program <SIGNATURE> <PROGRAM_ID> --output .caliper/program.so
check is the preflight command. It validates the spec, candidate artifacts,
corpus hash, fixture cache, and coverage completeness before a run.
Run the local gate before opening changes:
cargo fmt --check
cargo test --locked
cargo clippy --all-targets --all-features --locked -- -D warnings
Networked fixture discovery and replay require a Delorean/Penrose endpoint, but unit tests and local CLI validation should stay credential-free.
The fixture schema boundary is deliberately local and raw-byte oriented. Penrose and LiteSVM currently sit on different Solana dependency stacks, so Caliper decodes fixture data into stable bytes and converts into LiteSVM types only at the replay edge.
LiteSVM resolves versioned transactions through ALT accounts. Penrose fixtures provide the already-loaded writable and readonly ALT addresses. Caliper bridges that mismatch by synthesizing lookup table accounts from the transaction lookup indexes and the fixture's resolved ALT lists.
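One way to picture the synthesis step is sketched below. It assumes the resolved writable and readonly lists are concatenated across lookups in message order, which is how v0 loaded addresses are laid out. The types and the function name are hypothetical. Serializing each table into an account, and merging two lookups that reference the same table, are left out.

```rust
pub type Address = [u8; 32];

/// The index lists carried by one address table lookup in a v0 message.
pub struct Lookup {
    pub writable_indexes: Vec<u8>,
    pub readonly_indexes: Vec<u8>,
}

/// Rebuild the address list of each lookup table so that table[index] yields
/// the address the fixture already resolved. Unreferenced slots are zeroed.
/// Returns None when the resolved lists do not match the index counts.
pub fn synthesize_tables(
    lookups: &[Lookup],
    writable: &[Address],
    readonly: &[Address],
) -> Option<Vec<Vec<Address>>> {
    let mut w = writable.iter();
    let mut r = readonly.iter();
    let mut tables = Vec::with_capacity(lookups.len());
    for lookup in lookups {
        // The table must be long enough to hold the largest referenced index.
        let len = lookup
            .writable_indexes
            .iter()
            .chain(&lookup.readonly_indexes)
            .map(|i| *i as usize + 1)
            .max()
            .unwrap_or(0);
        let mut table = vec![[0u8; 32]; len];
        for &i in &lookup.writable_indexes {
            table[i as usize] = *w.next()?;
        }
        for &i in &lookup.readonly_indexes {
            table[i as usize] = *r.next()?;
        }
        tables.push(table);
    }
    // Leftover resolved addresses indicate an inconsistent fixture.
    if w.next().is_some() || r.next().is_some() {
        return None;
    }
    Some(tables)
}
```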
The default oracle normalizes only harness-level equivalences:
- empty return data is empty even if the carrying program id differs;
- missing zero-lamport post-accounts are equivalent to purged accounts.
Everything else is exact unless the spec opts into a compute-unit budget.
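The two normalizations can be expressed as comparison helpers. This sketch uses hypothetical types and makes two assumptions beyond the text: absent return data is treated the same as empty return data, and a zero lamport balance alone is enough to treat an account as purged.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ReturnData {
    pub program_id: [u8; 32],
    pub data: Vec<u8>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Account {
    pub lamports: u64,
    pub owner: [u8; 32],
    pub data: Vec<u8>,
}

/// Empty return data matches empty return data regardless of program id;
/// anything non-empty must match exactly, program id included.
pub fn return_data_matches(expected: Option<&ReturnData>, actual: Option<&ReturnData>) -> bool {
    fn is_empty(r: Option<&ReturnData>) -> bool {
        r.map_or(true, |r| r.data.is_empty())
    }
    if is_empty(expected) && is_empty(actual) {
        return true;
    }
    expected == actual
}

/// An account missing on one side matches a zero-lamport account on the other;
/// every other pairing must be byte-for-byte equal.
pub fn account_matches(expected: Option<&Account>, actual: Option<&Account>) -> bool {
    match (expected, actual) {
        (None, Some(a)) | (Some(a), None) => a.lamports == 0,
        (e, a) => e == a,
    }
}
```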
Current limitations:
- fixtures with hash-only account blobs are rejected because external blob resolution is not implemented locally;
- program extraction currently targets loader-v3 programdata fixtures;
- discovery classifies top-level instructions only, not CPI-only coverage.