Fully revamp Firehose docs structure #38
base: master
- Restructure SUMMARY.md to clearly separate chain-agnostic (90%) from chain-specific (10%) content
- Create new Core Firehose section with architecture, CLI reference, and deployment guides
- Add Chain-Specific Implementations section with standardized structure for each blockchain
- Create comprehensive CLI reference documentation for firecore binary
- Add detailed deployment guide targeting network operators
- Create supported chains overview with binary usage patterns and requirements
- Add quick start guide for getting Firehose running in under 30 minutes
- Create integration template for adding new blockchain support
- Add system requirements documentation for production deployments
- Reorganize existing content to fit new structure while preserving valuable information
- Focus on CLI flags over configuration files as requested
- Target network operators and deployment-focused users

This addresses the major revamp requested in BLO-537 to better organize Firehose documentation.
This is a GitBook documentation project, so the PR needs to follow GitBook's documentation structure. Please rework your PR accordingly.
Make sure the SUMMARY.md file is updated correctly so I can preview the documentation changes.
- Update SUMMARY.md to follow proper GitBook format and navigation
- Restructure sections with correct GitBook hierarchy:
  * Getting Started (with Quick Start Guide)
  * Core Firehose (Chain-Agnostic)
  * Chain-Specific Implementations
  * Community Integrations
  * Integrate New Chains
- Create architecture/README.md for proper GitBook navigation
- Maintain existing file references while improving structure
- Ensure SUMMARY.md enables proper GitBook preview functionality

This addresses the review feedback to follow GitBook documentation structure.
| ### Distributed Deployment | ||
| Components spread across multiple machines for production scale: | ||
|
|
||
| ``` | ||
| ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ | ||
| │ Blockchain │ │ Firehose │ │ Storage & │ | ||
| │ Nodes │ │ Processing │ │ Serving │ | ||
| ├─────────────────┤ ├─────────────────┤ ├─────────────────┤ | ||
| │ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │ | ||
| │ │ Node 1 │ │ │ │ Reader 1 │ │ │ │ Storage │ │ | ||
| │ │ Node 2 │─┼────┼─│ Reader 2 │─┼────┼─│ (Cloud) │ │ | ||
| │ │ Node 3 │ │ │ │ Merger │ │ │ │ │ │ | ||
| │ └─────────────┘ │ │ │ Relayer │ │ │ └─────────────┘ │ | ||
| │ │ │ └─────────────┘ │ │ ┌─────────────┐ │ | ||
| │ │ │ │ │ │ gRPC Server │ │ | ||
| │ │ │ │ │ │ (Load Bal) │ │ | ||
| │ │ │ │ │ └─────────────┘ │ | ||
| └─────────────────┘ └─────────────────┘ └─────────────────┘ | ||
| ``` |
The blockchain nodes run as a subprocess of the Reader node. Can you make that more apparent, either in the diagram or in the accompanying text?
Also, replace `gRPC Server` with `Firehose & Substreams` and `Load Bal` with `via gRPC`.
| ### Streaming API | ||
| - gRPC-based streaming interface | ||
| - Real-time and historical data access | ||
| - Filtering and transformation capabilities |
Fork awareness and cursoring are also important elements.
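To illustrate why these two properties matter to a consumer, here is a minimal sketch in plain bash. It does not talk to Firehose: the event lines (`new`/`undo`, block number, block id, cursor) and the names `apply_event`, `CHAIN` and `LAST_CURSOR` are hypothetical stand-ins for the steps and cursors a fork-aware stream hands to a client.

```bash
#!/usr/bin/env bash
# Hypothetical event format: "<step> <block_num> <block_id> <cursor>".
# This simulates a fork-aware stream; it is not Firehose's wire format.

CHAIN=()        # block ids currently considered part of the chain
LAST_CURSOR=""  # value to persist so a restart resumes exactly here

apply_event() {
  local step=$1 id=$3 cursor=$4
  case "$step" in
    new)
      CHAIN+=("$id")                      # extend the chain with the new block
      ;;
    undo)
      if [ "${#CHAIN[@]}" -gt 0 ]; then   # drop the block that was forked out
        CHAIN=("${CHAIN[@]:0:${#CHAIN[@]}-1}")
      fi
      ;;
  esac
  LAST_CURSOR=$cursor                     # every event carries a fresh cursor
}

# Block 2 (id b2) is emitted, undone by a fork, then replaced by b2x.
while read -r step num id cursor; do
  apply_event "$step" "$num" "$id" "$cursor"
done <<'EOF'
new 1 a1 c1
new 2 b2 c2
undo 2 b2 c3
new 2 b2x c4
new 3 c3x c5
EOF

echo "chain: ${CHAIN[*]}"
echo "resume from cursor: $LAST_CURSOR"
```

The consumer ends up on the `a1 b2x c3x` branch, and the last cursor is what it would store to resume after a restart without replaying or skipping a step.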
chains/supported-chains.md
Outdated
| Firehose supports a wide range of blockchain networks through a combination of universal components and chain-specific reader implementations. This page provides an overview of all supported chains and their specific characteristics. | ||
|
|
Instead of saying "a wide range", say that Firehose supports any blockchain for which a Firehose-enabled node client exists.
core/cli-reference.md
Outdated
| - `--config-file, -c` (string): Configuration file to use (default: `./firehose.yaml`) | ||
|
|
||
| ### Logging | ||
| - `--log-format` (string): Format for logging to stdout (`text` or `stackdriver`, default: `text`) |
Also document that when running in a Docker or Kubernetes execution environment, the default value switches to `stackdriver` (JSON format).
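The behaviour requested here can be pictured with a small shell sketch. The detection heuristics (`/.dockerenv` and `KUBERNETES_SERVICE_HOST`) and the function name `default_log_format` are assumptions made for illustration; the review does not say how the binary detects its environment.

```bash
#!/usr/bin/env bash
# Sketch only: mimics the documented default, not the binary's real detection logic.
default_log_format() {
  # Assumed heuristics: Kubernetes injects KUBERNETES_SERVICE_HOST into pods,
  # and Docker creates /.dockerenv inside containers.
  if [ -n "${KUBERNETES_SERVICE_HOST:-}" ] || [ -f /.dockerenv ]; then
    echo "stackdriver"   # JSON format
  else
    echo "text"
  fi
}

echo "--log-format=$(default_log_format)"
```

Passing `--log-format` explicitly overrides whichever default applies, which is the safer choice when the same command line is reused on a workstation and in a container.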
core/cli-reference.md
Outdated
| - **`firehose`** - Serves gRPC API for block streaming | ||
| - **`substreams-tier1`** - Substreams execution tier 1 | ||
| - **`substreams-tier2`** - Substreams execution tier 2 | ||
| - **`index-builder`** - Builds block indexes (if supported by chain) |
Only Ethereum & NEAR
chains/ethereum/README.md
Outdated
| ## Storage Requirements | ||
|
|
||
| ### Mainnet | ||
| - **One-block files**: ~2GB/day | ||
| - **Merged blocks**: ~50GB/month | ||
| - **Full archive**: ~2TB/year | ||
|
|
||
| ### Testnets | ||
| - **Goerli**: ~10GB/month | ||
| - **Sepolia**: ~5GB/month | ||
|
|
||
| ## Performance Characteristics | ||
|
|
||
| ### Block Processing | ||
| - **Average block time**: 12 seconds | ||
| - **Processing latency**: <1 second | ||
| - **Throughput**: ~7,000 transactions/block | ||
|
|
||
| ### Resource Usage | ||
| - **CPU**: 2-4 cores recommended | ||
| - **Memory**: 8GB minimum, 16GB recommended | ||
| - **Storage**: SSD required for optimal performance | ||
| - **Network**: 100Mbps+ for real-time sync |
Remove the requirements and simply tell operators to refer to the official documentation of the chain they target. Firehose is a dummy reader on top of the node's client, so operators should always follow the node's official documentation for how to properly operate the node client software.
chains/ethereum/README.md
Outdated
| ## Troubleshooting | ||
|
|
||
| ### Common Issues | ||
|
|
||
| #### Node Sync Problems | ||
| ```bash | ||
| # Check node sync status | ||
| fireeth tools check-node-sync --node-url=http://localhost:8545 | ||
| ``` | ||
|
|
||
| #### Block Processing Delays | ||
| ```bash | ||
| # Monitor processing pipeline | ||
| fireeth tools monitor-pipeline --data-dir=/var/firehose-data | ||
| ``` | ||
|
|
||
| #### Storage Issues | ||
| ```bash | ||
| # Verify block file integrity | ||
| fireeth tools verify-blocks --start-block=1000000 --stop-block=1001000 | ||
| ``` | ||
|
|
||
| ## Migration from Other Systems | ||
|
|
||
| ### From Graph Node | ||
| - Export existing subgraph mappings | ||
| - Convert to Substreams modules | ||
| - Test with historical data | ||
| - Deploy to production | ||
|
|
||
| ### From Custom Indexers | ||
| - Identify data extraction patterns | ||
| - Map to Firehose block structure | ||
| - Implement using Substreams | ||
| - Validate data consistency |
Remove all this.
chains/supported-chains.md
Outdated
| #### Cosmos Ecosystem | ||
| - **[Injective](injective/README.md)** - Decentralized exchange protocol | ||
| - **Osmosis** - AMM protocol in Cosmos | ||
| - **Juno** - Smart contract platform | ||
|
|
||
| ### Community Supported | ||
|
|
||
| These chains are maintained by the community with StreamingFast guidance: | ||
|
|
||
| - **[Starknet](../community-integrations/starknet/README.md)** - Layer 2 scaling solution | ||
| - **Aptos** - Move-based blockchain | ||
| - **Sui** - Move-based blockchain | ||
|
|
Remove all of this; it is outdated and no longer true.
SUMMARY.md
Outdated
| * [Injective](firehose-setup/injective/README.md) | ||
| * [Single-Machine Deployment](firehose-setup/injective/single-machine-deployment.md) | ||
|
|
||
| ## Community Integrations |
Remove this section as well as all documentation under community-integration.
SUMMARY.md
Outdated
| * [CLI Reference](core/cli-reference.md) | ||
| * [Deployment Guide](core/deployment-guide.md) |
Put Deployment guide before CLI reference.
- Remove Getting Started section completely
- Remove Community Integrations section and all community-integration docs
- Remove Integration Template file
- Put Deployment Guide before CLI Reference in SUMMARY.md
- Update architecture diagrams to show nodes as subprocess of Reader
- Replace 'gRPC Server' with 'Firehose & Substreams' in diagrams
- Add GitBook hint about node subprocess relationship
- Add fork-aware and cursor-based streaming features
- Remove CLI Reference temporarily (will recreate with correct info)

This addresses the major structural feedback from the review.
- Clean up Ethereum documentation: only support Geth and Geth forks
- Add proper GitBook hints throughout documentation
- Create new CLI reference with correct environment variable patterns
- Fix deployment guide to use flags instead of config files
- Add firecore vs fireeth explanation with info hints
- Update system requirements with correct default ports
- Remove log-to-file recommendations
- Add proper port information for Firehose & Substreams gRPC endpoint

This addresses the specific technical feedback from the review.
- Rewrite deployment guide root page with chain-agnostic focus
- Add dummy-blockchain as example implementation
- Create comprehensive Single Machine Deployment guide:
  * All components in single process with shared local storage
  * Step-by-step verification with inspection commands
  * Proper backlinks to architecture documentation
- Create detailed Distributed Deployment guide:
  * Each component as separate process
  * Shared object storage configuration
  * Production considerations and scaling guidance
  * Health checks and monitoring examples
- Remove System Requirements from Core Firehose section
- Update SUMMARY.md with new deployment structure

Both guides use dummy-blockchain as chain-agnostic example that can be applied to any Firehose-enabled blockchain.
Single Machine Deployment fixes:
- Add --config-file="" flag to prevent config file conflicts
- Remove incorrect --grpc-listen-addr flag
- Update all port references to correct defaults:
  * Firehose: :10015 (was :9000)
  * Substreams Tier1: :10016 (was :9000)
  * Substreams Tier2: :10017 (was :9000)
- Fix print command syntax:
  * one-blocks → one-block (singular)
  * Remove trailing slashes from paths
  * Use block number 1 instead of 0000000001
- Add comprehensive prerequisites section with port conflict troubleshooting
- Add default ports reference table
- Add protobuf output explanation

Distributed Deployment fixes:
- Add --config-file="" flag to all components
- Fix all print commands (one-blocks → one-block, remove trailing slashes)
- Update all port references to correct defaults:
  * Firehose: :10015 (was :9000)
  * Substreams Tier1: :10016 (was :9001)
  * Substreams Tier2: :10017 (was :9002)
- Fix health check commands with correct ports
- Fix nginx configuration example with correct ports
- Update all gRPC test commands to use correct ports

Both guides now work out-of-the-box with accurate technical details.
@codegen-sh Here is another review pass.
| # Create working directory | ||
| mkdir firehose-deployment | ||
| cd firehose-deployment | ||
|
|
||
| # Create data directory | ||
| mkdir -p firehose-data |
No need to create `firehose-data`; firecore creates it automatically. Also find another name for `firehose-deployment`: it will contain the `firehose-data` folder, so the current name is a bit redundant.
| --advertise-chain-name="acme-dummy-blockchain" \ | ||
| --reader-node-path="dummy-blockchain" \ | ||
| --reader-node-data-dir="./firehose-data/reader-node" \ | ||
| --reader-node-arguments="start --tracer=firehose --store-dir=./firehose-data/reader-node --block-rate=120 --genesis-height=0 --genesis-block-burst=100" |
Suggested change:
- --reader-node-arguments="start --tracer=firehose --store-dir=./firehose-data/reader-node --block-rate=120 --genesis-height=0 --genesis-block-burst=100"
+ --reader-node-arguments="start --tracer=firehose --store-dir={data-dir}/reader --block-rate=120"
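Combined with the other flags from the snippet, the suggested invocation might look like the sketch below. It is a dry run that only prints the arguments. The `firecore start` entry point is inferred from the `start --help` reference elsewhere in this review, and `ARGS` is a hypothetical variable name.

```bash
#!/usr/bin/env bash
# Dry run: build the argument list and print it instead of executing it.
ARGS=(
  start
  --config-file=""
  --advertise-chain-name="acme-dummy-blockchain"
  --reader-node-path="dummy-blockchain"
  # {data-dir} is left literal so the binary can substitute it.
  --reader-node-arguments="start --tracer=firehose --store-dir={data-dir}/reader --block-rate=120"
)

echo "firecore would be invoked with:"
printf '  %s\n' "${ARGS[@]}"

# To run for real (requires firecore and dummy-blockchain on PATH):
# firecore "${ARGS[@]}"
```

Keeping the arguments in an array preserves the quoting of `--reader-node-arguments`, whose value contains spaces and must reach the binary as a single argument.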
| **Default Ports Used:** | ||
| - **Firehose**: `:10015` (main gRPC API) | ||
| - **Reader**: `:10010` | ||
| - **Relayer**: `:10014` | ||
| - **Merger**: `:10012` | ||
| - **Substreams Tier1**: `:10016` | ||
| - **Substreams Tier2**: `:10017` |
Also briefly describe the protocol served on each of those ports, and ideally link to the Protobuf service definition.
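Since the prerequisites mention port conflicts, a quick pre-flight check of these default ports could look like the following sketch. It relies on bash's `/dev/tcp` pseudo-device; `PORTS`, `port_in_use` and `check_ports` are hypothetical names, and the port numbers are copied from the list under review.

```bash
#!/usr/bin/env bash
# Default ports as listed in the documentation snippet (name:port pairs).
PORTS="reader:10010 merger:10012 relayer:10014 firehose:10015 substreams-tier1:10016 substreams-tier2:10017"

# Succeeds if something already accepts TCP connections on the given local port.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

check_ports() {
  local entry name port
  for entry in $PORTS; do
    name=${entry%%:*}
    port=${entry##*:}
    if port_in_use "$port"; then
      echo "conflict: $name port $port is already in use"
    else
      echo "free:     $name port $port"
    fi
  done
}

check_ports
```

Run it before starting the components; any `conflict` line points at a port that has to be freed or overridden.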
| - **Substreams Tier1**: `:10016` | ||
| - **Substreams Tier2**: `:10017` | ||
|
|
||
| The `--config-file=""` flag disables automatic config file loading to prevent conflicts. |
Suggested change:
- The `--config-file=""` flag disables automatic config file loading to prevent conflicts.
+ The `--config-file=""` flag disables automatic config file loading switching into a flags only mode.
| {% endhint %} | ||
|
|
||
| {% hint style="info" %} | ||
| The `dummy-blockchain` runs as a subprocess of the Reader component. The Reader manages its lifecycle and extracts block data from it. See [Reader Component](../architecture/components/reader.md) for more details. |
Briefly explain that the extracted data is sent to the Reader component through a stdout pipe and contains the chain-specific Protobuf block and its metadata.
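The exchange described here can be sketched with two shell functions connected by a pipe. Everything below is illustrative: `fake_node`, `reader`, the `FIRE BLOCK` line layout as used here, and the base64 placeholder payloads standing in for chain-specific Protobuf blocks are assumptions, not the actual output of `dummy-blockchain`.

```bash
#!/usr/bin/env bash
# Stand-in for the instrumented node: regular logs and block lines share stdout.
fake_node() {
  local num payload
  echo "INFO node starting"
  for num in 1 2 3; do
    # A real payload would be a serialized Protobuf block; this is placeholder text.
    payload=$(printf 'block-%s' "$num" | base64)
    echo "FIRE BLOCK $num $payload"
  done
}

# Stand-in for the Reader: keep only block lines, then decode their payload.
reader() {
  local prefix kind num payload
  while read -r prefix kind num payload; do
    [ "$prefix $kind" = "FIRE BLOCK" ] || continue   # skip ordinary log lines
    echo "block #$num: $(printf '%s' "$payload" | base64 --decode)"
  done
}

fake_node | reader
```

The Reader owns the read end of the pipe, which is why it also manages the node's lifecycle: it starts the node as a child process and consumes its stdout line by line.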
| ```bash | ||
| # List Substreams tier1 services | ||
| grpcurl -plaintext localhost:10016 list | ||
|
|
||
| # List Substreams tier2 services | ||
| grpcurl -plaintext localhost:10017 list | ||
|
|
||
| # Test a simple Substreams request (if you have a .spkg file) | ||
| # substreams run -e localhost:10016 your-substream.spkg map_blocks -s 1 -t 10 | ||
| ``` |
Replace with the working command `substreams run -e localhost:10016 -p common@v0.1.0 -s 1 -t +5`.
| {% hint style="info" %} | ||
| Substreams runs on separate ports from Firehose: | ||
| - **Substreams Tier1**: `:10016` (processing tier) | ||
| - **Substreams Tier2**: `:10017` (caching tier) | ||
| - **Firehose**: `:10015` (block streaming) | ||
| {% endhint %} |
Seems mostly useless, remove.
| By default, all data is stored under `./firehose-data/storage/`: | ||
|
|
||
| - **One-blocks**: `./firehose-data/storage/one-blocks/` | ||
| - **Merged blocks**: `./firehose-data/storage/merged-blocks/` | ||
| - **Indexes**: `./firehose-data/storage/indexes/` |
Remove trailing slashes.
Also document which flag controls which path, and how these paths are common to and shared among the apps. You can use `docker run --rm -it ghcr.io/streamingfast/firehose-core:v1.10.1 start --help` to learn about firecore flags.
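As a rough sketch of how the store locations relate to the data directory, the snippet below derives the two block store paths from one `DATA_DIR` value and prints the flags that point components at them. The flag names are the ones used in this PR; passing plain local paths as their values, and the `store_flags` helper itself, are assumptions to verify against `start --help`.

```bash
#!/usr/bin/env bash
DATA_DIR="./firehose-data"

# Prints the store flags; components given the same values share the same stores.
store_flags() {
  printf '%s\n' \
    "--data-dir=${DATA_DIR}" \
    "--common-one-block-store-url=${DATA_DIR}/storage/one-blocks" \
    "--common-merged-blocks-store-url=${DATA_DIR}/storage/merged-blocks"
}

store_flags
```

Because the flags carry the `common-` prefix, one value is read by every component started in the same process, which is what lets the Reader write one-block files where the Merger expects to find them.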
| ### Performance Tuning | ||
|
|
||
| For better performance, consider: | ||
|
|
||
| ```bash | ||
| # Increase block rate for faster testing | ||
| --reader-node-arguments="start --tracer=firehose --store-dir=./firehose-data/reader-node --block-rate=300 --genesis-height=0 --genesis-block-burst=100" | ||
|
|
||
| # Use different data directory on faster storage | ||
| --data-dir="/fast-ssd/firehose-data" | ||
| ``` |
Remove
| ## Monitoring | ||
|
|
||
| Monitor your deployment by watching the logs and checking component health: | ||
|
|
||
| ```bash | ||
| # Watch for errors in logs | ||
| tail -f firecore.log | grep ERROR | ||
|
|
||
| # Check disk usage | ||
| du -sh ./firehose-data/ | ||
|
|
||
| # Monitor block processing rate | ||
| watch 'ls ./firehose-data/storage/one-blocks/ | wc -l' | ||
| ``` |
Remove
- Change working directory from firehose-deployment to firehose-workspace
- Remove manual data directory creation (automatically handled by firecore)
- Update reader-node-arguments to use {data-dir} placeholder
- Fix config-file flag explanation (flags only mode)
- Add stdout pipe and Protobuf data exchange explanation
- Fix relayer stream description (stops after 3 blocks)
- Replace gRPC commands with firecore tools:
* firecore tools firehose-client
* firecore tools firehose-single-block-client
* Add JSON output options (-o protojson, -o json)
- Update Substreams test to use working command: common@v0.1.0
- Remove redundant Substreams port info section
- Update storage paths documentation:
* Remove trailing slashes
* Add flag documentation for path control
* Document shared storage among components
- Remove Performance Tuning section
- Remove Monitoring section
- Add protocol documentation for all ports with Protobuf service links
| ### Performance Tuning | ||
| - **One-blocks**: `./firehose-data/storage/one-blocks` (controlled by `--common-one-block-store-url`) | ||
| - **Merged blocks**: `./firehose-data/storage/merged-blocks` (controlled by `--common-merged-blocks-store-url`) | ||
| - **Indexes**: `./firehose-data/storage/indexes` (controlled by `--common-index-store-url`) |
Only relevant to fireeth and firenear, let's not document this here.
| - **Relayer**: `:10014` (gRPC - live block streaming) | ||
| - **Merger**: `:10012` (gRPC - internal merger protocol) | ||
| - **Substreams Tier1**: `:10016` (gRPC - [sf.substreams.rpc.v2.Stream](https://buf.build/streamingfast/substreams/docs/main:sf.substreams.rpc.v2)) | ||
| - **Substreams Tier2**: `:10017` (gRPC - [sf.substreams.rpc.v2.Stream](https://buf.build/streamingfast/substreams/docs/main:sf.substreams.rpc.v2)) |
It's gRPC but it's an internal tier1 <=> tier2 protocol
…ption
- Remove indexes storage documentation (only relevant to fireeth and firenear)
- Fix Substreams Tier2 description: internal tier1 <=> tier2 protocol (not public API)
Fully Revamp Firehose Documentation Structure
This PR addresses BLO-537 by completely restructuring the Firehose documentation to clearly separate chain-agnostic content (90%) from chain-specific implementations (10%).
🎯 Key Changes
📋 New Documentation Structure
🔧 CLI-First Approach
- `firecore` CLI reference with all flags and commands
🚀 Operator-Focused Content
📚 Improved Organization
📁 New File Structure
🎯 Target Audience Alignment
This restructure specifically targets:
🔍 Content Highlights
- CLI Reference (`core/cli-reference.md`): `firecore` command documentation
- Deployment Guide (`core/deployment-guide.md`)
- Supported Chains (`chains/supported-chains.md`): binary usage patterns (`firecore` vs `fireeth`)
- Quick Start (`getting-started/quick-start.md`)
🔄 Migration Strategy
🚀 Next Steps
This PR establishes the new structure with:
Follow-up work needed:
📋 Addresses BLO-537 Requirements
This provides a solid foundation for the fully revamped Firehose documentation that better serves network operators and maintains the clear architectural separation requested.
Initiated by Matthieu Vachon