@WiseMrMusa

This PR adds support for distributed proof generation using SP1 Cluster infrastructure, enabling multi-GPU proving through a gRPC API with Redis-based artifact storage.

Changes

zkvm-interface

  • Add ClusterProverConfig struct with configurable endpoint and Redis URL
  • Add Cluster variant to ProverResourceType enum
  • Implement to_args() for CLI argument generation
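
As a rough sketch of the zkvm-interface changes listed above, the config and its CLI serialization might look like the following. The field names (`endpoint`, `redis_url`), the flag names emitted by `to_args()`, and the reduced set of enum variants are assumptions for illustration; the description only confirms the type names, the `to_args()` method, and the default values.

```rust
/// Hypothetical shape of the cluster configuration; field names are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ClusterProverConfig {
    /// gRPC endpoint of the cluster API.
    pub endpoint: String,
    /// Redis URL used for artifact storage.
    pub redis_url: String,
}

impl Default for ClusterProverConfig {
    fn default() -> Self {
        // Defaults taken from the Configuration table in this PR.
        Self {
            endpoint: "http://172.17.0.1:50051/".to_string(),
            redis_url: "redis://:redispassword@172.17.0.1:6379/0".to_string(),
        }
    }
}

impl ClusterProverConfig {
    /// Render the config as CLI arguments. The flag names are illustrative only.
    pub fn to_args(&self) -> Vec<String> {
        vec![
            "--endpoint".to_string(),
            self.endpoint.clone(),
            "--redis-url".to_string(),
            self.redis_url.clone(),
        ]
    }
}

/// Existing variants such as Network are omitted here for brevity.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProverResourceType {
    Cpu,
    Gpu,
    Cluster(ClusterProverConfig),
}
```

Carrying the config inside the `Cluster` variant keeps the endpoint and Redis URL next to the resource selection, so callers cannot pick cluster proving without also supplying (or defaulting) its connection details.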

ere-sp1

  • Add cluster.proto for gRPC service definitions
  • Implement SP1ClusterClient with:
    • gRPC communication with cluster API
    • Redis artifact upload/download with zstd compression
    • Support for chunked artifact storage (large proofs)
    • Exponential backoff polling for proof completion
    • Automatic cleanup of uploaded artifacts
  • Extend Prover enum with Cluster variant
  • Add cluster-specific error types
  • Integrate cluster proving into EreSP1::prove() flow
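
The exponential backoff polling mentioned above could be sketched as follows, using only the standard library. This is not the code in SP1ClusterClient: `PollStatus`, `backoff_delay`, and `poll_with_backoff` are hypothetical names, and the real client would issue a gRPC status request inside the `check` closure.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Outcome of a single status check (hypothetical type).
pub enum PollStatus<T> {
    Pending,
    Done(T),
}

/// Delay after attempt `attempt` (0-based): `base * 2^attempt`, capped at `max`.
pub fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    // On overflow fall back to the cap; otherwise clamp to the cap.
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Call `check` until it reports `Done` or `max_attempts` is exhausted,
/// sleeping with exponentially growing delays between attempts.
/// Returns `Ok(None)` when the work is still pending after the last attempt.
pub fn poll_with_backoff<T, E>(
    mut check: impl FnMut() -> Result<PollStatus<T>, E>,
    base: Duration,
    max: Duration,
    max_attempts: u32,
) -> Result<Option<T>, E> {
    for attempt in 0..max_attempts {
        if let PollStatus::Done(value) = check()? {
            return Ok(Some(value));
        }
        // No need to wait after the final attempt.
        if attempt + 1 < max_attempts {
            sleep(backoff_delay(base, max, attempt));
        }
    }
    Ok(None)
}
```

Capping the delay keeps long-running proofs from being polled too rarely, while the doubling keeps short proofs from flooding the cluster API with status requests.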

Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| SP1_CLUSTER_ENDPOINT | http://172.17.0.1:50051/ | gRPC endpoint of the cluster API |
| SP1_CLUSTER_REDIS_URL | redis://:redispassword@172.17.0.1:6379/0 | Redis URL for artifact storage |

Because the cluster runs as an independent Docker deployment, the defaults point at the default Docker gateway IP. When that is not the case, both environment variables need to be set; they are picked up when the configuration is loaded.
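
A minimal sketch of that fallback behaviour, assuming the variables are read with `std::env::var`. The helper names (`env_or`, `cluster_endpoint`, `cluster_redis_url`) are hypothetical and not taken from the PR; only the variable names and defaults come from the configuration above.

```rust
use std::env;

const DEFAULT_ENDPOINT: &str = "http://172.17.0.1:50051/";
const DEFAULT_REDIS_URL: &str = "redis://:redispassword@172.17.0.1:6379/0";

/// Return the value of `var`, or `default` when it is unset or not valid Unicode.
pub fn env_or(var: &str, default: &str) -> String {
    env::var(var).unwrap_or_else(|_| default.to_string())
}

/// gRPC endpoint of the cluster API, overridable via SP1_CLUSTER_ENDPOINT.
pub fn cluster_endpoint() -> String {
    env_or("SP1_CLUSTER_ENDPOINT", DEFAULT_ENDPOINT)
}

/// Redis URL for artifact storage, overridable via SP1_CLUSTER_REDIS_URL.
pub fn cluster_redis_url() -> String {
    env_or("SP1_CLUSTER_REDIS_URL", DEFAULT_REDIS_URL)
}
```

With this shape, a deployment where the cluster is not reachable through the Docker gateway only needs to export the two variables before the prover starts.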

Add support for distributed proof generation using SP1 Cluster infrastructure.

Changes:
- Add ClusterProverConfig and Cluster variant to ProverResourceType
- Implement SP1ClusterClient with gRPC API and Redis artifact storage
- Support compressed artifacts with zstd and chunked downloads
- Add exponential backoff polling for proof completion
- Integrate cluster proving flow into EreSP1

Configuration:
- SP1_CLUSTER_ENDPOINT: gRPC endpoint (default: http://172.17.0.1:50051/)
- SP1_CLUSTER_REDIS_URL: Redis URL for artifacts
@WiseMrMusa WiseMrMusa force-pushed the nm/feat/sp1-cluster branch from 51df9cc to 3a969af Compare January 3, 2026 06:23
@WiseMrMusa WiseMrMusa force-pushed the nm/feat/sp1-cluster branch from 72903fe to ce6dfc5 Compare January 3, 2026 06:47
…mplementations

Changes:
- Updated panic messages to include Cluster resource type for EreOpenVM, EreRisc0, EreZiren, and EreZisk.
- Adjusted match patterns to handle both Network and Cluster resource types consistently across implementations.
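
For illustration, grouping the two remote modes in one match arm might look like the sketch below. The placeholder config structs, the helper name `unsupported_message`, and the message wording are assumptions; the actual backends panic with their own messages.

```rust
/// Placeholder config payloads; the real types carry fields.
pub struct NetworkProverConfig;
pub struct ClusterProverConfig;

pub enum ProverResourceType {
    Cpu,
    Gpu,
    Network(NetworkProverConfig),
    Cluster(ClusterProverConfig),
}

/// Hypothetical helper: returns a message when `resource` is a remote proving
/// mode that the backend named `zkvm` does not implement, and `None` otherwise.
pub fn unsupported_message(zkvm: &str, resource: &ProverResourceType) -> Option<String> {
    match resource {
        ProverResourceType::Cpu | ProverResourceType::Gpu => None,
        // Network and Cluster share one arm so both are reported consistently.
        ProverResourceType::Network(_) | ProverResourceType::Cluster(_) => Some(format!(
            "Network or Cluster proving is not supported by {zkvm}"
        )),
    }
}
```

A backend would then call something like `if let Some(msg) = unsupported_message("EreRisc0", &resource) { panic!("{msg}"); }` before proving. Because the match is exhaustive, adding a new variant to the enum forces every such backend to be revisited at compile time.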
@WiseMrMusa WiseMrMusa force-pushed the nm/feat/sp1-cluster branch from ce6dfc5 to 4bac87e Compare January 3, 2026 07:03
@WiseMrMusa WiseMrMusa force-pushed the nm/feat/sp1-cluster branch from 680cadb to e2bbae8 Compare January 7, 2026 22:14
- Source environment file after zkmup installation to ensure PATH is set
- Add explicit error handling for zkmup list-available command
- Add fallback logic to find toolchain directory
- Add debug output when toolchain path resolution fails

This fixes the CI failure where rustup toolchain link was called with
missing PATH argument due to zkmup not being in PATH after installation.

The Dockerfile.base for ziren already sets nightly as the default
toolchain via 'rustup default nightly', so using 'cargo +nightly'
is unnecessary and can fail with 'no such command' error in some
Docker environments where the +toolchain syntax isn't recognized.