ci: add daily Nitro-Nethermind compatibility verification workflow #660
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #660   +/-   ##
=======================================
  Coverage   76.71%   76.71%
=======================================
  Files         178      178
  Lines       11722    11722
  Branches     1559     1559
=======================================
  Hits         8992     8992
  Misses       2162     2162
  Partials      568      568
Pull request overview
This PR adds a new GitHub Actions workflow for daily automated compatibility verification between Nitro (consensus layer) and Nethermind (execution layer) implementations. The workflow syncs Arbitrum Sepolia and Mainnet blocks from genesis to detect block hash mismatches between the two implementations, providing early detection of compatibility issues.
Changes:
- Added daily scheduled workflow (3 AM UTC) with manual trigger support for Nitro-Nethermind compatibility testing
- Parallel build and verification architecture using reusable workflows and matrix strategy for Sepolia and Mainnet
- Block synchronization monitoring with automated mismatch detection and comprehensive reporting via GitHub Summary
Comments suppressed due to low confidence (8)
.github/workflows/block-verification.yml:66
- The matrix expression chains && and || as nested ternary operators, which can be error-prone. If inputs.network is 'sepolia' it returns '["sepolia"]'; if 'mainnet', it returns '["mainnet"]'; otherwise it returns '["sepolia", "mainnet"]'. In scheduled runs inputs.network is null/undefined, so the expression correctly defaults to both networks. The logic appears correct, but consider simplifying it for readability.
network: ${{ fromJson(inputs.network == 'sepolia' && '["sepolia"]' || inputs.network == 'mainnet' && '["mainnet"]' || '["sepolia", "mainnet"]') }}
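One possible simplification is to resolve the network list in a small shell step and feed the result to the matrix through a job output. The following is only a sketch under that assumption; `resolve_networks` and the `INPUT_NETWORK` variable are hypothetical names, not part of the workflow in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the optional "network" input to a JSON array
# that fromJson() could consume in a matrix definition.
resolve_networks() {
  case "${1:-}" in
    sepolia) echo '["sepolia"]' ;;
    mainnet) echo '["mainnet"]' ;;
    # Empty or unknown input (e.g. a scheduled run) falls back to both networks.
    *)       echo '["sepolia", "mainnet"]' ;;
  esac
}

# In a workflow step the result would typically be written to a job output
# (INPUT_NETWORK is an assumed environment variable holding inputs.network):
#   echo "networks=$(resolve_networks "$INPUT_NETWORK")" >> "$GITHUB_OUTPUT"
```

The trade-off is an extra job or step in exchange for branching that reads top to bottom instead of a single chained expression.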
.github/workflows/block-verification.yml:272
- The artifact retention for logs is 14 days while results are kept for 90 days. Consider documenting why there's this difference in retention periods. If logs are needed for debugging failures, they might need to be retained for longer. Alternatively, if results are the primary concern, the retention periods could be aligned.
- name: Upload logs
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: verification-logs-${{ matrix.network }}-${{ github.run_id }}
    path: /tmp/verification-${{ matrix.network }}/logs/
    retention-days: 14
- name: Upload results
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: verification-results-${{ matrix.network }}-${{ github.run_id }}
    path: /tmp/verification-${{ matrix.network }}/results.json
    retention-days: 90
.github/workflows/block-verification.yml:225
- The block comparison uses -ge (greater than or equal), which is correct, but the initial value of CURRENT_BLOCK is 0 and the loop starts with a 10-second sleep before the first check. If the node starts quickly and reaches the target within 10 seconds, this won't be detected until the first iteration completes. Consider doing an initial check before entering the loop, or moving the sleep to the end of the loop.
while true; do
  sleep 10
  # Check if Nitro is still running
  if ! kill -0 $NITRO_PID 2>/dev/null; then
    echo "ERROR: Nitro process exited unexpectedly"
    tail -100 "/tmp/verification-$NETWORK/logs/nitro.log"
    exit 1
  fi
  # Get current block from Nethermind
  CURRENT_BLOCK=$(curl -s -X POST -H "Content-Type: application/json" \
    --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    "http://localhost:$RPC_PORT" 2>/dev/null | grep -oP '(?<="result":"0x)[^"]+' | xargs -I{} printf "%d" 0x{} 2>/dev/null || echo "0")
  echo "Current block: $CURRENT_BLOCK / $TARGET_BLOCK"
  # Check for verification errors in Nethermind logs (exact pattern from ArbitrumSyncMonitor.cs)
  if grep -q "Block hash mismatch for" "/tmp/verification-$NETWORK/logs/nethermind.log" 2>/dev/null; then
    echo "ERROR: Block verification mismatch detected!"
    grep "Block hash mismatch for" "/tmp/verification-$NETWORK/logs/nethermind.log"
    exit 1
  fi
  # Check if we've reached target
  if [ "$CURRENT_BLOCK" -ge "$TARGET_BLOCK" ]; then
    echo "SUCCESS: Reached target block $TARGET_BLOCK!"
    break
  fi
  # Check for stall
  if [ "$CURRENT_BLOCK" -eq "$LAST_BLOCK" ]; then
    STALL_COUNT=$((STALL_COUNT + 1))
    if [ "$STALL_COUNT" -ge "$MAX_STALL" ]; then
      echo "ERROR: Sync stalled at block $CURRENT_BLOCK for too long"
      exit 1
    fi
  else
    STALL_COUNT=0
    LAST_BLOCK=$CURRENT_BLOCK
  fi
done
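The "sleep at the end" variant suggested in this comment could look roughly like the sketch below. It is a reduced illustration, not the workflow's code: `wait_for_block` and its arguments are hypothetical, the block number comes from any command passed in by the caller, and the Nitro liveness and hash-mismatch checks from the loop above are omitted for brevity.

```shell
#!/usr/bin/env bash
# wait_for_block TARGET INTERVAL MAX_STALL GET_BLOCK_CMD
# GET_BLOCK_CMD is any command or function that prints the current block number.
wait_for_block() {
  local target=$1 interval=$2 max_stall=$3 get_block=$4
  local current last=-1 stall=0
  while true; do
    # Check first, so a node that is already at the target is detected at once.
    current=$("$get_block") || current=0
    current=${current:-0}
    echo "Current block: $current / $target"
    if [ "$current" -ge "$target" ]; then
      echo "SUCCESS: Reached target block $target!"
      return 0
    fi
    if [ "$current" -eq "$last" ]; then
      stall=$((stall + 1))
      if [ "$stall" -ge "$max_stall" ]; then
        echo "ERROR: Sync stalled at block $current for too long"
        return 1
      fi
    else
      stall=0
      last=$current
    fi
    # Sleep only after all checks have run.
    sleep "$interval"
  done
}
```

Passing the block source as a command also makes the loop easy to exercise with a stub instead of a live RPC endpoint.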
.github/workflows/block-verification.yml:256
- When processes are killed in the cleanup step, using kill without any signal sends SIGTERM, which is correct. However, for processes that might not respond to SIGTERM immediately, consider adding a timeout and forceful kill. For example: timeout 10 kill PID || kill -9 PID to ensure processes are terminated even if they hang.
- name: Stop processes
  if: always()
  run: |
    if [ -f "/tmp/verification-$NETWORK/nitro.pid" ]; then
      kill "$(cat "/tmp/verification-$NETWORK/nitro.pid")" 2>/dev/null || true
    fi
    if [ -f "/tmp/verification-$NETWORK/nethermind.pid" ]; then
      kill "$(cat "/tmp/verification-$NETWORK/nethermind.pid")" 2>/dev/null || true
    fi
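A graceful-then-forceful stop can be sketched as follows. Note that `kill` returns as soon as the signal is sent, so wrapping it in `timeout` would not by itself wait for the process to exit; the sketch polls with `kill -0` instead. `stop_process` and the 10-second default grace period are assumptions for illustration.

```shell
#!/usr/bin/env bash
# stop_process PID [GRACE_SECONDS]
# Sends SIGTERM, waits up to GRACE_SECONDS for the process to exit,
# then falls back to SIGKILL.
stop_process() {
  local pid=$1 grace=${2:-10} waited=0
  # If the signal cannot be delivered, the process is already gone.
  kill "$pid" 2>/dev/null || return 0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$grace" ]; then
      kill -9 "$pid" 2>/dev/null || true
      break
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

In the cleanup step it could be called as `stop_process "$(cat "/tmp/verification-$NETWORK/nitro.pid")" 10`, once per PID file.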
.github/workflows/block-verification.yml:197
- The monitoring loop uses 2>/dev/null to suppress error output, but this makes debugging difficult when RPC calls fail. Consider allowing stderr to be captured so that issues with curl or jq can be diagnosed from the logs. At minimum, log when the RPC call returns an empty or malformed response.
CURRENT_BLOCK=$(curl -s -X POST -H "Content-Type: application/json" \
  --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
  "http://localhost:$RPC_PORT" 2>/dev/null | grep -oP '(?<="result":"0x)[^"]+' | xargs -I{} printf "%d" 0x{} 2>/dev/null || echo "0")
.github/workflows/block-verification.yml:6
- The workflow title uses 'feature(ci):' prefix which typically indicates a feature commit, but this is a workflow file, not a commit message. The name field should describe the workflow's purpose. Consider changing the name from 'Compatibility Verification' to something more descriptive like 'Daily Nitro-Nethermind Block Verification' to match the PR title, or keep it short and clear.
name: Compatibility Verification
.github/workflows/block-verification.yml:310
- The status emoji logic only handles "success" vs everything else as "fail". However, the job.status can also be "cancelled" or "skipped". Consider handling these cases with appropriate emojis (e.g., "⏸️ cancelled", "⏭️ skipped") to provide clearer feedback in the summary.
if [ "$STATUS" = "success" ]; then
  STATUS_EMOJI="pass"
else
  STATUS_EMOJI="fail"
fi
.github/workflows/block-verification.yml:310
- The STATUS_EMOJI variable contains text like "pass" or "fail", not actual emoji characters. Consider using actual emoji (✅ for pass, ❌ for fail) for better visual clarity in the GitHub summary, or rename the variable to STATUS_TEXT to better reflect its content.
if [ "$STATUS" = "success" ]; then
  STATUS_EMOJI="pass"
else
  STATUS_EMOJI="fail"
fi
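Both comments on this snippet could be addressed with a single `case` statement. A minimal sketch, assuming the status string is passed in as an argument; `status_label` is a hypothetical name, and the `skipped` branch is included only because the earlier comment mentions that value.

```shell
#!/usr/bin/env bash
# status_label STATUS
# Maps a job status string to an emoji plus a short text label.
status_label() {
  case "$1" in
    success)   echo "✅ pass" ;;
    failure)   echo "❌ fail" ;;
    cancelled) echo "⏸️ cancelled" ;;
    skipped)   echo "⏭️ skipped" ;;
    # Anything unexpected is shown verbatim rather than reported as a failure.
    *)         echo "❓ $1" ;;
  esac
}
```

Usage would be `STATUS_EMOJI=$(status_label "$STATUS")`; keeping the text next to the emoji means the summary still reads correctly where emoji do not render.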
b857c60 to 1dae7e3
Summary
Add automated daily Nitro-Nethermind compatibility verification workflow that syncs Arbitrum Sepolia and Mainnet blocks from genesis to detect any block hash mismatches between the two implementations.
Architecture
flowchart TB
    subgraph trigger["Trigger"]
        schedule["⏰ Daily 3 AM UTC"]
        manual["🔘 Manual Dispatch"]
    end
    subgraph build["Parallel Build Phase"]
        direction LR
        nitro["build-nitro<br/>(_build-nitro.yml)"]
        nethermind["build-nethermind<br/>(_build-nethermind.yml)"]
    end
    subgraph verify["Matrix Verification Phase"]
        direction TB
        subgraph sepolia["Sepolia (ports 20545/20551)"]
            nm_sep["Nethermind EL<br/>--VerifyBlockHash.Enabled=true"]
            nitro_sep["Nitro CL<br/>chain.id=421614"]
            nm_sep -->|"Engine API"| nitro_sep
        end
        subgraph mainnet["Mainnet (ports 21545/21551)"]
            nm_main["Nethermind EL<br/>--VerifyBlockHash.Enabled=true"]
            nitro_main["Nitro CL<br/>chain.id=42161"]
            nm_main -->|"Engine API"| nitro_main
        end
    end
    subgraph report["Report Phase"]
        summary["📊 GitHub Summary"]
        artifacts["📦 Logs & Results"]
    end
    trigger --> build
    nitro --> verify
    nethermind --> verify
    sepolia --> report
    mainnet --> report
Flow