Production-validation-related scripts currently under scripts/:
prod_test.sh: correctness and crash-recovery test orchestration (fast/stress/chaos/all)perf_gate.sh: performance snapshot and regression gate (snapshot/compare)perf_thresholds.env: default template for performance gate thresholds and quick benchmark knobs
Unified layered execution for production-style integration tests:
fast: quick regression (default)stress: long-run/high-pressure cases (#[ignore])chaos: failpoint crash/fault-injection cases (#[ignore])all:fast + stress + chaos
./scripts/prod_test.sh [mode] [threads]mode:fast|stress|chaos|all, defaultfastthreads: passed tocargo test -- --test-threads=..., default8
Examples:
./scripts/prod_test.sh fast 8
./scripts/prod_test.sh all 8FEATURES: defaultfailpointsFAST_TIMEOUT/STRESS_TIMEOUT/CHAOS_TIMEOUTCHAOS_RETRY: retry count for chaos casesMACE_PROD_BUCKET_STRESS_ROUNDSMACE_PROD_BUCKET_CHURN_ROUNDSMACE_PROD_BUCKET_CHURN_WORKERSMACE_PROD_EVICTOR_STRESS_ROUNDS
- Exit code
0: full pass - Non-zero: failed items are listed in
failuresat the end
Note: in chaos mode, child-process failure logs can be expected; rely on final script exit code and
failuressummary as ground truth.
Provides baseline sampling and regression gating:
snapshot: capture performance snapshot for current branch onlycompare: compare current branch against a baseline (defaultorigin/master) and enforce threshold rules
./scripts/perf_gate.sh snapshot
./scripts/perf_gate.sh compareMACE_PERF_THRESHOLD_FILE: threshold file, defaultscripts/perf_thresholds.envMACE_PERF_BASE_REF: baseline ref, defaultorigin/masterMACE_PERF_BENCH_ARGS: arguments forwarded to CriterionMACE_PERF_COMPARE_FILTERS: case filters for compare mode (comma-separated)MACE_PERF_REGRESS_PCT: per-case regression threshold (%)MACE_PERF_MAX_REGRESS_CASES: max allowed regressed case countMACE_PERF_MIN_COMPARE_CASES: minimum valid comparable case countMACE_PERF_MIN_BASE_NS: ignores ultra-small baseline noise
Default output directory: target/perf_gate
Common files:
base_summary.json/head_summary.jsonhead_snapshot.jsonbase.bench.log/head.bench.log/snapshot.bench.log
Core metric fields:
median_ns: Criterion median point estimate in nanoseconds (lower is better)mean_ns: Criterion mean point estimate in nanoseconds (lower is better)delta_pct(from compare output):- formula:
(head - base) / base * 100% - positive means slower, negative means faster
- formula:
Gate rule:
- compare fails when count of cases with
delta_pct > MACE_PERF_REGRESS_PCTis greater thanMACE_PERF_MAX_REGRESS_CASES.
Central place for default perf gate and quick benchmark parameters, reused locally and in CI.
- First run 3–5 rounds on target production-equivalent hardware to establish a stable baseline
- Then tighten:
MACE_PERF_REGRESS_PCTMACE_PERF_MAX_REGRESS_CASESMACE_PERF_COMPARE_FILTERS
- If CI noise is high, start with looser thresholds and narrower case filters, then tighten gradually
.github/workflows/ci.ymltest-prod-all:./scripts/prod_test.sh all 8perf-regression:./scripts/perf_gate.sh compare