| 1 | +--- |
| 2 | +sidebarTitle: Sync Troubleshooting |
| 3 | +title: Detailed Sync Troubleshooting |
| 4 | +--- |
| 5 | + |
| 6 | +This guide provides detailed solutions for common Base node synchronization issues based on community reports (GitHub issues #127, #251, #369, #413, #419, #433). |
| 7 | + |
| 8 | +## Quick Diagnostic Commands |
| 9 | + |
| 10 | +```bash |
| 11 | +# Check sync status |
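| 12 | +curl -s -X POST -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"optimism_syncStatus","params":[],"id":1}' http://localhost:7545 | jq '.result' |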
| 13 | + |
| 14 | +# Check current block |
| 15 | +curl -X POST -H "Content-Type: application/json" \ |
| 16 | + --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \ |
| 17 | + http://localhost:8545 |
| 18 | + |
| 19 | +# Check peer count |
| 20 | +curl -X POST -H "Content-Type: application/json" \ |
| 21 | + --data '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}' \ |
| 22 | + http://localhost:8545 |
| 23 | +``` |
| 24 | + |
| 25 | +--- |
| 26 | + |
| 27 | +## Detailed Sync Scenarios |
| 28 | + |
| 29 | +### Node Consistently Behind (12+ Hours) |
| 30 | + |
| 31 | +- **Issue**: Node falls further behind over time; the gap keeps growing. |
| 32 | + - **Check**: L1 RPC rate limiting: |
| 33 | + ```bash |
| 34 | + docker compose logs node | grep -i "rate limit\|429" |
| 35 | + ``` |
| 36 | + - **Check**: Measure lag: |
| 37 | + ```bash |
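| 38 | + curl -s -X POST -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"optimism_syncStatus","params":[],"id":1}' http://localhost:7545 | jq '{lag_hours: ((.result.head_l1.timestamp - .result.current_l1.timestamp) / 3600)}' |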
| 39 | + ``` |
| 40 | + - **Root Cause**: L1 RPC endpoint has insufficient throughput or rate limiting. |
| 41 | + - **Action**: Upgrade L1 RPC provider: |
| 42 | + - Free tiers (Infura/Alchemy) are typically insufficient for Base nodes |
| 43 | + - Recommended: Alchemy Growth (~$199/mo), QuickNode (~$49/mo), or self-hosted L1 node |
| 44 | + - Update `OP_NODE_L1_ETH_RPC` and `OP_NODE_L1_BEACON` in `.env.mainnet` |
| 45 | + - Restart: `docker compose down && docker compose up -d` |
| 46 | + - **Verify**: Monitor improvement: |
| 47 | + ```bash |
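| 48 | + watch -n 10 'curl -s -X POST -H "Content-Type: application/json" --data "{\"jsonrpc\":\"2.0\",\"method\":\"optimism_syncStatus\",\"params\":[],\"id\":1}" http://localhost:7545 | jq ".result.current_l1.number, .result.head_l1.number"' |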
| 49 | + ``` |
| 50 | + |
| 51 | +### Node Completely Stuck (No Progress) |
| 52 | + |
| 53 | +- **Issue**: Block height not increasing for 1+ hours, `eth_syncing` returns `false` but node is behind. |
| 54 | + - **Check**: Block progression: |
| 55 | + ```bash |
| 56 | + # Record current block, wait 60 seconds, check again |
| 57 | + curl -s -X POST -H "Content-Type: application/json" \ |
| 58 | + --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \ |
| 59 | + http://localhost:8545 |
| 60 | + ``` |
| 61 | + - **Check**: P2P connectivity (should be 10+ peers; the result is hex-encoded, so `0xa` = 10): |
| 62 | + ```bash |
| 63 | + curl -X POST -H "Content-Type: application/json" \ |
| 64 | + --data '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}' \ |
| 65 | + http://localhost:8545 |
| 66 | + ``` |
| 67 | + - **Check**: Port 30303 accessibility: |
| 68 | + ```bash |
| 69 | + sudo netstat -tulpn | grep 30303 |
| 70 | + # If nothing is listening, check the container's port mapping; if it is listening but peers stay at 0, check the firewall |
| 71 | + ``` |
| 72 | + - **Root Cause**: Corrupted database, P2P issues, or lost L1/L2 connection. |
| 73 | + - **Action** (try in order): |
| 74 | + 1. Simple restart: `docker compose restart` |
| 75 | + 2. Open P2P port if peer count is 0: |
| 76 | + ```bash |
| 77 | + sudo ufw allow 30303/tcp |
| 78 | + sudo ufw allow 30303/udp |
| 79 | + ``` |
| 80 | + 3. If still stuck, consider snapshot restoration (see [Snapshots](/base-chain/node-operators/snapshots)). |
| 81 | + |
| 82 | +### Extremely Slow Initial Sync |
| 83 | + |
| 84 | +- **Issue**: Syncing at < 100 blocks/second, taking weeks instead of days. |
| 85 | + - **Check**: Storage type: |
| 86 | + ```bash |
| 87 | + lsblk -d -o NAME,ROTA,TYPE,SIZE,MODEL |
| 88 | + # ROTA: 0 = SSD/NVMe (good), 1 = HDD (too slow) |
| 89 | + ``` |
| 90 | + - **Check**: Disk performance: |
| 91 | + ```bash |
| 92 | + sudo hdparm -t /dev/nvme0n1 # should show > 1000 MB/s |
| 93 | + ``` |
| 94 | + - **Check**: RAID configuration (RAID-5/6 causes 10x slowdown): |
| 95 | + ```bash |
| 96 | + cat /proc/mdstat |
| 97 | + ``` |
| 98 | + - **Check**: Disk I/O during sync: |
| 99 | + ```bash |
| 100 | + iostat -x 1 5 |
| 101 | + # %util > 90% and await > 50ms = disk bottleneck |
| 102 | + ``` |
| 103 | + - **Root Cause**: Hardware bottleneck: SATA SSD (3-5x slower than NVMe), RAID-5/6 (10x write penalty), or network-attached storage. |
| 104 | + - **Action**: |
| 105 | + - **Critical**: If using RAID-5/6, migrate to RAID-0, RAID-10, or single NVMe |
| 106 | + - **Critical**: If using network storage (NAS/iSCSI), migrate to local NVMe |
| 107 | + - Consider using snapshot to skip initial sync (see [Snapshots](/base-chain/node-operators/snapshots)) |
| 108 | + - Upgrade to NVMe SSD if using SATA |
| 109 | + |
| 110 | +### Reth-Specific Slow Sync |
| 111 | + |
| 112 | +- **Issue**: Using Reth but sync slower than expected, low resource utilization. |
| 113 | + - **Check**: Current peer count: |
| 114 | + ```bash |
| 115 | + curl -X POST -H "Content-Type: application/json" \ |
| 116 | + --data '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}' \ |
| 117 | + http://localhost:8545 |
| 118 | + # Result is hex-encoded (0x1e = 30); should be 30-100 for fast sync |
| 119 | + ``` |
| 120 | + - **Root Cause**: Reth not configured with performance flags. |
| 121 | + - **Action**: Add performance flags to `.env.mainnet`: |
| 122 | + ```bash |
| 123 | + # Edit .env.mainnet |
| 124 | + ADDITIONAL_ARGS=--full --max-outbound-peers=100 --max-inbound-peers=30 |
| 125 | + |
| 126 | + # For systems with 32GB+ RAM, use this line instead: |
| 127 | + # ADDITIONAL_ARGS=--full --max-outbound-peers=100 --max-inbound-peers=30 --max-cache-size=16384 |
| 128 | + ``` |
| 129 | + - **Action**: Restart to apply changes: |
| 130 | + ```bash |
| 131 | + docker compose down |
| 132 | + docker compose up -d |
| 133 | + ``` |
| 134 | + - **Verify**: Check flags were applied: |
| 135 | + ```bash |
| 136 | + docker compose logs execution | grep "Starting reth" |
| 137 | + ``` |
| 138 | + |
| 139 | +--- |
| 140 | + |
| 141 | +## Hardware Anti-Patterns |
| 142 | + |
| 143 | +### Storage Configurations to Avoid |
| 144 | + |
| 145 | +- **RAID-5 / RAID-6**: Causes 10x write penalty due to parity calculations. Migrate to RAID-0, RAID-10, or single NVMe. |
| 146 | + - Check: `cat /proc/mdstat` |
| 147 | + |
| 148 | +- **Network-Attached Storage (NAS/iSCSI)**: Network latency kills sync performance. Use local NVMe only. |
| 149 | + - Check: `df -h | grep reth-data` |
| 150 | + |
| 151 | +- **SATA SSD**: 3-5x slower than NVMe. Acceptable for testing, not for production. |
| 152 | + - Test speed: `sudo hdparm -t /dev/sda` (should be > 500 MB/s for SATA, > 2000 MB/s for NVMe) |
| 153 | + |
| 154 | +### Recommended Configuration |
| 155 | + |
| 156 | +- **Storage**: Local NVMe SSD (PCIe Gen3/4) |
| 157 | +- **RAM**: 32GB+ for Reth with large cache |
| 158 | +- **CPU**: 8+ cores recommended |
| 159 | +- **L1 RPC**: Paid tier or self-hosted (free tiers insufficient) |
| 160 | + |
| 161 | +--- |
| 162 | + |
| 163 | +## Monitoring Commands |
| 164 | + |
| 165 | +```bash |
| 166 | +# Calculate blocks synced per minute |
| 167 | +BLOCK1=$(curl -s -X POST -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' http://localhost:8545 | jq -r '.result' | xargs printf "%d"); sleep 60; BLOCK2=$(curl -s -X POST -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' http://localhost:8545 | jq -r '.result' | xargs printf "%d"); echo "Blocks/min: $(($BLOCK2 - $BLOCK1))" |
| 168 | + |
| 169 | +# Check hours behind |
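| 170 | +curl -s -X POST -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"optimism_syncStatus","params":[],"id":1}' http://localhost:7545 | jq '((.result.head_l1.timestamp - .result.current_l1.timestamp) / 3600)' |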
| 171 | + |
| 172 | +# Watch sync progress |
| 173 | +watch -n 5 'curl -s -X POST -H "Content-Type: application/json" --data "{\"jsonrpc\":\"2.0\",\"method\":\"eth_blockNumber\",\"params\":[],\"id\":1}" http://localhost:8545 | jq -r ".result" | xargs printf "%d\n"' |
| 174 | + |
| 175 | +# Container resources |
| 176 | +docker stats --no-stream |
| 177 | + |
| 178 | +# Recent errors |
| 179 | +docker compose logs --since 1h | grep -i error | tail -20 |
| 180 | +``` |
| 181 | + |
| 182 | +--- |
| 183 | + |
| 184 | +## Quick Reference |
| 185 | + |
| 186 | +| Task | Command | |
| 187 | +|------|---------| |
| 188 | +| Check sync status | `curl -X POST ... optimism_syncStatus ...` | |
| 189 | +| Current block | `curl -X POST ... eth_blockNumber ...` | |
| 190 | +| Peer count | `curl -X POST ... net_peerCount ...` | |
| 191 | +| Exec logs | `docker compose logs -f execution` | |
| 192 | +| Node logs | `docker compose logs -f node` | |
| 193 | +| Restart | `docker compose restart` | |
| 194 | +| Disk I/O | `iostat -x 1 5` | |
| 195 | +| RAID config | `cat /proc/mdstat` | |
| 196 | +| Disk speed | `sudo hdparm -t /dev/nvme0n1` | |
| 197 | + |
| 198 | +--- |
| 199 | + |
| 200 | +## Related Issues |
| 201 | + |
| 202 | +This guide addresses issues reported in: |
| 203 | +- [#127](https://github.com/base-org/node/issues/127) - Node 12+ hours behind |
| 204 | +- [#251](https://github.com/base-org/node/issues/251) - Intermittent slow sync |
| 205 | +- [#369](https://github.com/base-org/node/issues/369) - RAID-5 performance issues |
| 206 | +- [#413](https://github.com/base-org/node/issues/413) - op-reth slow sync |
| 207 | +- [#419](https://github.com/base-org/node/issues/419) - Node stuck/unsynced |
| 208 | +- [#433](https://github.com/base-org/node/issues/433) - Snapshot issues |
| 209 | + |
| 210 | +For general troubleshooting, see [Node Troubleshooting](/base-chain/node-operators/troubleshooting). |
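
The monitoring one-liners in this diff pack hex decoding and rate arithmetic into a single line, which is hard to audit. As a rough sketch only, the same calculations can be split into small shell functions. The function names `hex_to_dec`, `blocks_per_min`, and `lag_hours` are hypothetical and not part of the Base node repository; they assume the hex-encoded results that `eth_blockNumber` and `net_peerCount` return, and Unix timestamps like those in the sync-status output.

```bash
#!/usr/bin/env bash
# Illustrative helpers; the names are hypothetical, not part of the Base node repo.

# Convert a 0x-prefixed hex quantity (as returned by eth_blockNumber or
# net_peerCount) to decimal.
hex_to_dec() {
  printf '%d\n' "$1"
}

# Blocks synced per minute, given two hex block numbers and the number of
# seconds elapsed between the two samples (integer arithmetic, rounds down).
blocks_per_min() {
  local start end
  start=$(hex_to_dec "$1")
  end=$(hex_to_dec "$2")
  echo $(( (end - start) * 60 / $3 ))
}

# Whole hours of lag between two Unix timestamps: head first, current second.
lag_hours() {
  echo $(( ($1 - $2) / 3600 ))
}
```

For example, `hex_to_dec 0x1e` prints `30`, `blocks_per_min 0x64 0xdc 60` prints `120`, and `lag_hours 1700043200 1700000000` prints `12`. Keeping the arithmetic separate from the `curl` calls makes it possible to check the math without a running node.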