SP1 Hypercube baseline: prover instrumentation, load test fixes, and tooling#6155
avilagaston9 wants to merge 10 commits into main
Adds a comprehensive roadmap for ethrex L2 development covering:
- zkVM integration (ZisK, SP1, RISC0, OpenVM)
- Native and based rollups
- Security and audit work
- Aligned Layer integration
- Prover infrastructure
- Sequencer improvements
- Testing infrastructure
- Shared bridge (cross-L2 communication)
- Monitor and developer UX
- Documentation
- Contracts and standards compliance
- Tech debt cleanup
- L2Beat stage compliance tracking
Items are organized by category with status tracking (Done, In Progress, Pending, Blocked, Research) and links to relevant issues/PRs.
… integration guide Reorganized the L2 roadmap into 8 main sections: zkVM Integration (ZisK, SP1 Hypercube, OpenVM), Native Rollups, Aligned Layer, Prover Infrastructure, Sequencer, Contracts, Security & Audit, and Based Rollups. Grouped smaller issues under general items like "Refactors & Cleanup" to keep the roadmap focused on high-level goals. Moved the SP1 Hypercube integration guide from docs/l2/ to docs/prover/ where it fits better with other prover documentation. Added entries to SUMMARY.md for both the L2 Roadmap and the SP1 Hypercube guide.
Add comprehensive native_rollups.md covering EIP-8079 EXECUTE precompile, benefits (governance-free upgrades, L1 security, reduced complexity), implementation phases, technical challenges, and ethrex integration roadmap. Expand L2 roadmap native rollups section with additional items (gap analysis, precompile PoC, ultra sound design) and accurate timeline noting EIP-8079 is still in Draft status, not scheduled for Glamsterdam or Hegota forks.
…ty Checks entry to reduce table verbosity while preserving all issue references
Switch the prover loop from prove() to prove_timed() so each batch logs a structured line with batch number and elapsed proving time (seconds and milliseconds). Add scripts/sp1_bench_metrics.sh that tails the prover log, collects results into a CSV, and prints a summary table on exit.
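A minimal sketch of the timing pattern this commit describes, using only the Rust standard library. The `prove` closure stands in for the real prover call, and the log field names (`proving_time_s`, `proving_time_ms`) are assumptions for illustration, not the format the branch actually emits.

```rust
use std::time::{Duration, Instant};

/// Runs `prove` and returns its output together with the elapsed wall-clock time.
/// `prove` is a hypothetical stand-in for the real prover call.
fn time_proving<T, E>(prove: impl FnOnce() -> Result<T, E>) -> Result<(T, Duration), E> {
    let start = Instant::now();
    let output = prove()?;
    Ok((output, start.elapsed()))
}

/// Formats one structured log line with the batch number and the elapsed time
/// in whole seconds and in milliseconds. The field names are assumed.
fn format_proving_line(batch_number: u64, elapsed: Duration) -> String {
    format!(
        "batch={batch_number} proving_time_s={} proving_time_ms={}",
        elapsed.as_secs(),
        elapsed.as_millis()
    )
}
```

Returning the duration alongside the proof, rather than logging inside the prover, keeps the measurement in the caller's hands so the loop can attach the batch number itself.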
… of prover The guest program was moved from crates/l2/prover/src/ethrex_guest_program/ to crates/guest-program/, but the fallback VK paths in the deployer were not updated. This caused deploy-l1-sp1 to fail with "No such file or directory" when running outside Docker. CI was unaffected because it passes explicit VK paths via ETHREX_SP1_VERIFICATION_KEY_PATH in docker-compose.yaml.
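The override-then-fallback behaviour described above could look roughly like the following. This is a sketch under assumptions: `resolve_vk_path` is a hypothetical helper, not the deployer's actual function, and the fallback string is passed in by the caller because the exact file path is not stated here.

```rust
use std::path::PathBuf;

/// Picks the verification key path: a non-empty explicit override wins,
/// otherwise the caller-supplied fallback is used.
/// Hypothetical helper, not the deployer's real API.
fn resolve_vk_path(explicit: Option<String>, fallback: &str) -> PathBuf {
    match explicit {
        Some(path) if !path.is_empty() => PathBuf::from(path),
        _ => PathBuf::from(fallback),
    }
}
```

A caller could pass `std::env::var("ETHREX_SP1_VERIFICATION_KEY_PATH").ok()` as the override, which is why CI (where the variable is set) never touched the stale fallback.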
…nce conflicts, and add --endless flag for continuous load generation. The load_test function now fetches the pending nonce (instead of latest) so re-runs pick up where the previous round left off. wait_until_all_included now tracks per-account target nonces instead of a flat tx_amount, which was incorrect for any run where the starting nonce was not zero.
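To illustrate why a flat tx_amount is wrong when the starting nonce is not zero, here is a small sketch of per-account targets. The function names and the use of `String` account identifiers are assumptions for illustration; the real load test works with its own account and client types.

```rust
use std::collections::HashMap;

/// Target nonce per account: the pending nonce at the start of the round
/// plus the number of transactions sent in this round.
fn target_nonces(starting: &HashMap<String, u64>, tx_amount: u64) -> HashMap<String, u64> {
    starting
        .iter()
        .map(|(account, nonce)| (account.clone(), nonce + tx_amount))
        .collect()
}

/// True once every account's current nonce has reached its own target.
/// Accounts missing from `current` are treated as nonce 0.
fn all_included(current: &HashMap<String, u64>, targets: &HashMap<String, u64>) -> bool {
    targets
        .iter()
        .all(|(account, target)| current.get(account).copied().unwrap_or(0) >= *target)
}
```

With an account that starts at nonce 100 and sends 50 transactions, the target is 150. Comparing against the flat value 50 would report the round as finished before any transaction was included.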
🤖 Kimi Code Review
Security Review
No critical or high-severity security issues identified.
Bug Review
Medium-severity bug found:
File: …
The hard-coded relative paths in …
Recommendation: Verify the actual directory structure matches these new paths or revert to the original paths.
Performance Review
No significant performance issues identified. The changes appear to be infrastructure improvements (SP1 Hypercube integration, native rollups documentation) rather than performance regressions.
Automated review by Kimi (Moonshot AI)
🤖 Claude Code Review
Now let me verify that percentage calculation in load test is safe from division by zero. Based on my thorough review of this PR, here's my analysis:
PR #6155 Review: SP1 Hypercube Baseline
Summary
This PR adds prover instrumentation for benchmarking, fixes deployer VK file paths, and improves the load test tool. The changes are mostly low-risk with one bug finding.
Security Vulnerabilities
None found.
Bugs
Medium - Potential Division by Zero in Load Test Percentage Calculation
File: …
(nonce as f64 / target_nonce as f64) * 100.0
Issue: If …
Impact: While unlikely in normal operation (since users typically specify …
Recommendation: Add a check:
let percentage = if target_nonce > 0 {
(nonce as f64 / target_nonce as f64) * 100.0
} else {
0.0
};
println!(
"Waiting for transactions to be included from {encoded_src}. Nonce: {nonce}. Target: {target_nonce}. Percentage: {:.2}%.",
percentage
);
Low - Bash Script Arithmetic Edge Case
File: …
((ms < min)) && min=$ms
((ms > max)) && max=$ms
Issue: If …
Impact: Minimal - edge case that shouldn't occur in practice.
Recommendation: No action required, but could add explicit handling if defensive coding is desired.
Significant Performance Issues
None found. The changes actually improve performance observability by adding timing instrumentation. The load test improvements (using …
Additional Observations (Non-blocking)
Positive Changes
Documentation
The PR adds extensive documentation about SP1 Hypercube and Native Rollups. These are markdown files with no security implications.
Code Quality
Conclusion
Overall Assessment: The PR is in good shape with one medium-severity bug that should be fixed before merging.
Required Changes:
Recommendation: Approve after fixing the division by zero issue.
Automated review by Claude (Anthropic)
🤖 Codex Code Review
No security vulnerabilities or significant performance issues found in the diff.
Bugs
Automated review by OpenAI Codex
…ndpoint. The script now fetches batch_gas_used, batch_tx_count, and batch_size from the L2 metrics endpoint (localhost:3702/metrics) for each proved batch, joining them with proving time into a single CSV and summary table.
…ing. It now parses all proving_time lines from the file, fetches batch metadata from Prometheus, prints the table, and exits immediately.
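The parse-then-summarize step can be sketched as follows. The actual script is a shell script; this Rust version only illustrates the logic, and the `batch=` / `proving_time_ms=` line format is an assumption, not the real log layout. Fetching batch metadata from the metrics endpoint is left out.

```rust
/// Extracts (batch, proving_time_ms) from one log line if both fields are present.
/// The key=value token format is assumed for illustration.
fn parse_line(line: &str) -> Option<(u64, u64)> {
    let mut batch: Option<u64> = None;
    let mut ms: Option<u64> = None;
    for token in line.split_whitespace() {
        if let Some(value) = token.strip_prefix("batch=") {
            batch = value.parse().ok();
        } else if let Some(value) = token.strip_prefix("proving_time_ms=") {
            ms = value.parse().ok();
        }
    }
    Some((batch?, ms?))
}

/// Returns (count, min, max, mean) of proving times in milliseconds,
/// or None when no line in the log matched.
fn summarize(log: &str) -> Option<(usize, u64, u64, f64)> {
    let times: Vec<u64> = log.lines().filter_map(parse_line).map(|(_, ms)| ms).collect();
    let min = *times.iter().min()?;
    let max = *times.iter().max()?;
    let mean = times.iter().sum::<u64>() as f64 / times.len() as f64;
    Some((times.len(), min, max, mean))
}
```

Returning `None` for an empty log avoids computing a mean over zero samples, the same kind of edge case the review flagged for the percentage calculation.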
…last] contains last - first + 1 blocks, not last - first. A single-block batch was reported as size 0.
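The off-by-one comes from counting an inclusive range. A tiny sketch of the corrected arithmetic, written in Rust for illustration (the fix itself lives in the shell script, and `batch_size` here is a hypothetical name):

```rust
/// Number of blocks in the inclusive range [first_block, last_block].
/// Returns None if last_block < first_block or the count would overflow.
fn batch_size(first_block: u64, last_block: u64) -> Option<u64> {
    last_block.checked_sub(first_block)?.checked_add(1)
}
```

For a single-block batch, `batch_size(5, 5)` yields `Some(1)`, where `last - first` alone would give 0.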
Superseded by #6157
Motivation
Preparation for benchmarking SP1 proving performance before upgrading to SP1 Hypercube. This branch includes the tooling changes needed to run reproducible proving benchmarks on the L2.
Description
Prover instrumentation:
- Use prove_timed() instead of prove(), logging batch number and elapsed time with structured fields
- Add scripts/sp1_bench_metrics.sh to parse prover logs and build a CSV + summary table
Deployer fix:
- Fix the fallback VK paths in cmd/ethrex/l2/deployer.rs after the guest program was moved from crates/l2/prover/src/ethrex_guest_program/ to crates/guest-program/
Load test improvements:
- Use BlockTag::Pending instead of Latest when fetching the starting nonce, so consecutive runs don't restart from nonce 0
- wait_until_all_included now tracks per-account target nonces instead of a flat tx_amount
- Add an --endless flag for continuous load generation (infinite send+wait loop)
Test plan
- With --mempool.maxsize 200000, run load test, verify txs are included
- Run with the --endless flag, verify it loops automatically