blueshift-gg/caliper

Caliper

Precision historical replay for Solana program upgrades.

Caliper turns historical Penrose/Delorean transaction fixtures into a locked, repeatable release gate. It loads the exact pre-state into LiteSVM, optionally patches a program ELF, executes the historical transaction, and compares status, post-accounts, return data, and compute units.

The important invariant is two-phase replay: every run first proves the locked historical fixture still replays exactly. Candidate execution only starts after that harness parity check passes.
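The ordering above can be sketched as a small gate function. This is a minimal illustration, not Caliper's code: `GateOutcome`, `run_gate`, and the two closures are hypothetical names standing in for the real replay phases.

```rust
/// Outcome of a gated run. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum GateOutcome {
    /// The locked fixture did not replay exactly; the candidate never ran.
    HarnessParityFailed,
    /// Parity held, but the candidate diverged from the fixture expectation.
    CandidateDiverged,
    /// Parity held and the candidate matched.
    Passed,
}

/// Runs the candidate phase only after the parity phase succeeds.
/// Each closure returns `true` when its replay matched the expectation.
pub fn run_gate<P, C>(replay_fixture: P, replay_candidate: C) -> GateOutcome
where
    P: FnOnce() -> bool,
    C: FnOnce() -> bool,
{
    // Phase 1: prove the harness still reproduces history.
    if !replay_fixture() {
        return GateOutcome::HarnessParityFailed;
    }
    // Phase 2: only now is the candidate executed.
    if replay_candidate() {
        GateOutcome::Passed
    } else {
        GateOutcome::CandidateDiverged
    }
}
```

Keeping the parity failure as its own outcome separates "the harness is wrong" from "the candidate is wrong", which is the distinction the two-phase invariant exists to protect.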

Install

cargo install --path .
caliper --help

Or run directly from the checkout:

cargo run -- --help

Quickstart

The intended operator flow is:

export CALIPER_DELOREAN_RPC_URL='<org-delorean-rpc-url>'
cargo run -- check caliper.example.toml
cargo run -- lock caliper.example.toml --output .caliper/sample.lock.json
cargo run -- check caliper.example.toml --lock .caliper/sample.lock.json
cargo run -- run caliper.example.toml \
  --lock .caliper/sample.lock.json \
  --output .caliper/sample.run.json

If the service gives the RPC origin and key separately, use:

export CALIPER_DELOREAN_RPC_BASE_URL='<org-delorean-rpc-base-url>'
export CALIPER_DELOREAN_RPC_KEY='<org-delorean-rpc-key>'

The older CANARY_DELOREAN_RPC_* names are still accepted as compatibility aliases, but new setups should use CALIPER_DELOREAN_RPC_*.

Do not commit the credential-bearing RPC URL. Caliper redacts endpoint paths in new lockfiles, reports, and terminal output. Keep .caliper/ local; it contains fixtures, exported programs, run reports, and other generated artifacts.

What Is Locked

caliper lock resolves the corpus and writes a lockfile with:

  • the corpus hash, including the Penrose endpoint and corpus selector;
  • each transaction signature and slot;
  • each raw Penrose fixture hash;
  • discriminator coverage for discovered corpora.

caliper run refuses to use a lockfile built from a different corpus. The CU policy and candidate artifact can change without rebuilding the corpus, which lets the same locked fixtures test strict, budgeted, or explicit A/B policies. The corpus hash uses the redacted endpoint identity, so rotating a path token does not force a new lock for the same service origin.
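To illustrate why a rotated path token leaves the corpus identity unchanged, here is a rough sketch of endpoint redaction. It assumes the redacted identity is just the scheme and authority of the URL; `redact_endpoint` is a hypothetical helper, and Caliper's actual redaction and hashing may differ.

```rust
/// Reduces an RPC URL to scheme and host[:port], dropping userinfo, path,
/// query, and fragment. Hypothetical helper, not Caliper's implementation.
pub fn redact_endpoint(url: &str) -> String {
    // Split off the scheme if present.
    let (scheme, rest) = match url.split_once("://") {
        Some((scheme, rest)) => (Some(scheme), rest),
        None => (None, url),
    };
    // The authority ends at the first '/', '?', or '#'.
    let end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let authority = &rest[..end];
    // Drop "user:pass@" userinfo, which can also carry credentials.
    let host = authority.rsplit_once('@').map_or(authority, |(_, host)| host);
    match scheme {
        Some(scheme) => format!("{scheme}://{host}"),
        None => host.to_string(),
    }
}
```

Under this assumption, two URLs that differ only in their path token redact to the same string, so any hash computed over the redacted form is stable across token rotation.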

Replay Gate

Every run starts with a historical fixture replay. If the fixture does not replay exactly, Caliper refuses candidate execution because the harness has not proved parity for that corpus.

Candidate modes:

[program]
id = "pAMMBay6oceH9fJKBRHGP5D4bD4sWpmSwMn52FMfXEA"
current = true
dump_dir = ".caliper/programs/current"

current = true extracts the fixture-local historical ELF, dumps it, patches it back into the same fixture, and proves the replacement path is lossless. This is the right first self-test for a new program corpus.

[program]
id = "pAMMBay6oceH9fJKBRHGP5D4bD4sWpmSwMn52FMfXEA"
candidate_elf = "target/deploy/candidate.so"

candidate_elf swaps the locked fixture's deployed program for a candidate artifact and compares against the historical fixture expectation. Candidate and baseline paths must point at real ELF files; check fails before a run if an artifact is missing or does not have ELF magic bytes.
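The magic-byte check mentioned above is simple to reproduce when validating artifacts by hand. Every ELF file begins with the four bytes 0x7F, 'E', 'L', 'F'. The function names below are illustrative and are not Caliper's internals.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// The four bytes every ELF file starts with: 0x7F followed by "ELF".
pub const ELF_MAGIC: [u8; 4] = [0x7F, b'E', b'L', b'F'];

/// Returns true when the buffer begins with the ELF magic bytes.
pub fn starts_with_elf_magic(bytes: &[u8]) -> bool {
    bytes.starts_with(&ELF_MAGIC)
}

/// Reads only the first four bytes of `path` and checks them.
/// A missing file is an error; a file shorter than four bytes is `Ok(false)`.
pub fn file_has_elf_magic(path: &Path) -> io::Result<bool> {
    let mut header = [0u8; 4];
    let mut file = File::open(path)?;
    match file.read_exact(&mut header) {
        Ok(()) => Ok(header == ELF_MAGIC),
        Err(err) if err.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(err) => Err(err),
    }
}
```

A magic-byte check only rules out obviously wrong files, such as a truncated download or a text file at the artifact path. It does not prove the ELF is a valid Solana program.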

[program]
id = "TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA"
baseline_elf = ".caliper/programs/tokenkeg.spl-token-3.5.0.so"
candidate_elf = ".caliper/programs/tokenkeg.p-token.so"

baseline_elf + candidate_elf runs a three-phase A/B gate: historical fixture parity, explicit baseline replay, then candidate replay. Candidate compute units are compared against the explicit baseline program, while state, status, and return data still have to match the fixture. For A/B runs, CU policies apply to the replaced program's summed log-reported CU, not whole-transaction CU. This keeps program performance claims from being diluted by unrelated instructions in the same historical transaction.

Corpus Sources

explicit is the simplest and safest source:

[corpus]
source = "explicit"

[[corpus.transactions]]
signature = "WRM4dvtB32k261TTsn2nc4ini9VvWrB1NBLCQdtZk1FRycp54VTuaDikufLk1E2MANthZjQQS6skzeis2W3bvNA"
label = "reference-buy"

program-top-level discovers transactions with a normal RPC pass before it fetches Penrose fixtures:

  1. page getSignaturesForAddress(program_id);
  2. fetch normal RPC getTransaction in base64 form;
  3. decode top-level message instructions, including v0 loaded addresses;
  4. lock a Penrose fixture only when the signature fills a requested bucket.

This avoids fixture fetches for CPI-only or unrelated transactions. It is still an RPC history scan, so high-volume programs should specify discriminators and use before/until bounds for the target upgrade era.
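The bucket-filling step can be pictured with the following sketch. It assumes a discriminator is a byte prefix of the top-level instruction data and that each bucket wants a fixed number of signatures; `Buckets`, `offer`, and `per_bucket` are hypothetical names, and Caliper's real selection logic is not shown in this README.

```rust
use std::collections::BTreeMap;

/// Tracks which signatures were chosen for each requested discriminator.
pub struct Buckets {
    per_bucket: usize,
    filled: BTreeMap<Vec<u8>, Vec<String>>,
}

impl Buckets {
    pub fn new(discriminators: &[Vec<u8>], per_bucket: usize) -> Self {
        let filled = discriminators
            .iter()
            .map(|d| (d.clone(), Vec::new()))
            .collect();
        Buckets { per_bucket, filled }
    }

    /// Offers one transaction's top-level instruction data. Returns true when
    /// the signature filled at least one bucket, which is the only case where
    /// a fixture fetch would be worthwhile.
    pub fn offer(&mut self, signature: &str, instruction_data: &[&[u8]]) -> bool {
        let per_bucket = self.per_bucket;
        let mut accepted = false;
        for data in instruction_data {
            for (discriminator, signatures) in self.filled.iter_mut() {
                let has_room = signatures.len() < per_bucket;
                let already = signatures.iter().any(|s| s.as_str() == signature);
                if has_room && !already && data.starts_with(discriminator) {
                    signatures.push(signature.to_string());
                    accepted = true;
                }
            }
        }
        accepted
    }

    /// True when every requested bucket has reached its target size.
    pub fn is_complete(&self) -> bool {
        self.filled.values().all(|s| s.len() >= self.per_bucket)
    }
}
```

In this model, a scan that ends with `is_complete()` returning false corresponds to the incomplete-coverage case that the default configuration rejects.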

By default, discovered corpora must fill every requested discriminator bucket. For smoke tests or exploratory scans, set:

[corpus]
allow_incomplete = true

Compute Policies

Compute-unit comparison is exact by default: any difference in consumed compute units fails the run. Candidate specs can opt into an explicit budget:

[oracle.compute_units]
mode = "max-increase"
max_increase = 10

or an explicit performance target:

[oracle.compute_units]
mode = "max-ratio"
numerator = 1
denominator = 10

For A/B runs, a max-ratio of 1/10 (numerator = 1, denominator = 10) means the candidate program must consume at most 10% of the explicit baseline program's log-reported compute units for every locked transaction.
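The three policies reduce to simple integer comparisons, sketched below. Two assumptions are baked in and may not match Caliper: `max_increase` is treated as an absolute number of compute units, and the ratio is checked by cross-multiplication to avoid floating point. `CuPolicy` and `allows` are hypothetical names.

```rust
/// Compute-unit policy, mirroring the `mode` values shown above.
#[derive(Debug, Clone, Copy)]
pub enum CuPolicy {
    Exact,
    /// Assumption: an absolute CU allowance, not a percentage.
    MaxIncrease { max_increase: u64 },
    MaxRatio { numerator: u64, denominator: u64 },
}

impl CuPolicy {
    /// Returns true when `candidate` CU is acceptable relative to `reference` CU.
    pub fn allows(self, reference: u64, candidate: u64) -> bool {
        match self {
            CuPolicy::Exact => candidate == reference,
            CuPolicy::MaxIncrease { max_increase } => {
                candidate <= reference.saturating_add(max_increase)
            }
            CuPolicy::MaxRatio { numerator, denominator } => {
                // A zero denominator is meaningless; reject rather than pass.
                if denominator == 0 {
                    return false;
                }
                // candidate / reference <= numerator / denominator,
                // cross-multiplied in u128 so the products cannot overflow.
                (candidate as u128) * (denominator as u128)
                    <= (reference as u128) * (numerator as u128)
            }
        }
    }
}
```

With a baseline of 5,000 CU and a ratio of 1/10, a candidate at 500 CU passes and one at 501 CU fails, since 501 × 10 exceeds 5,000 × 1.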

Useful Commands

cargo run -- check <SPEC> --lock <LOCK>
cargo run -- inspect <LOCK> --json .caliper/inspect.json
cargo run -- replay <SIGNATURE> --jsonl .caliper/replay.jsonl
cargo run -- export-program <SIGNATURE> <PROGRAM_ID> --output .caliper/program.so

check is the preflight command. It validates the spec, candidate artifacts, corpus hash, fixture cache, and coverage completeness before a run.

Development

Run the local gate before opening changes:

cargo fmt --check
cargo test --locked
cargo clippy --all-targets --all-features --locked -- -D warnings

Networked fixture discovery and replay require a Delorean/Penrose endpoint, but unit tests and local CLI validation should stay credential-free.

Design Notes

The fixture schema boundary is deliberately local and raw-byte oriented. Penrose and LiteSVM currently sit on different Solana dependency stacks, so Caliper decodes fixture data into stable bytes and converts into LiteSVM types only at the replay edge.

LiteSVM resolves versioned transactions through ALT accounts. Penrose fixtures provide the already-loaded writable and readonly ALT addresses. Caliper bridges that mismatch by synthesizing lookup table accounts from the transaction lookup indexes and the fixture's resolved ALT lists.

The default oracle normalizes only harness-level equivalences:

  • empty return data is empty even if the carrying program id differs;
  • missing zero-lamport post-accounts are equivalent to purged accounts.

Everything else is exact unless the spec opts into a compute-unit budget.
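The two normalizations above could be expressed roughly as follows. The types are simplified stand-ins rather than Caliper's or LiteSVM's, and treating absent return data as empty is an assumption of this sketch.

```rust
/// Return data as (program id, payload). `None` is treated as empty here,
/// which is an assumption of this sketch.
pub type ReturnData = Option<([u8; 32], Vec<u8>)>;

fn is_empty(return_data: &ReturnData) -> bool {
    return_data.as_ref().map_or(true, |(_, data)| data.is_empty())
}

/// Empty payloads match regardless of program id; anything else must be identical.
pub fn return_data_equivalent(expected: &ReturnData, actual: &ReturnData) -> bool {
    if is_empty(expected) && is_empty(actual) {
        return true;
    }
    expected == actual
}

/// Simplified account record, for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Account {
    pub lamports: u64,
    pub owner: [u8; 32],
    pub data: Vec<u8>,
}

/// An account missing on one side matches a zero-lamport account on the
/// other; every other pairing must be byte-for-byte equal.
pub fn post_account_equivalent(expected: Option<&Account>, actual: Option<&Account>) -> bool {
    match (expected, actual) {
        (Some(a), Some(b)) => a == b,
        (None, None) => true,
        (Some(present), None) | (None, Some(present)) => present.lamports == 0,
    }
}
```

Neither rule relaxes comparison of non-empty payloads or funded accounts, so a candidate that changes real state still fails.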

Current limitations:

  • fixtures with hash-only account blobs are rejected because external blob resolution is not implemented locally;
  • program extraction currently targets loader-v3 programdata fixtures;
  • discovery classifies top-level instructions only, so instructions reached only through CPI do not count toward coverage.
