feat: follower node sync from DA rebased to syncUpstream/active #1013

Merged
merged 12 commits into from
Oct 16, 2024

@jonastheis jonastheis commented Aug 29, 2024

1. Purpose or design rationale of this PR

This PR (originally implemented in #631, but moved here to the syncUpstream/active branch) implements a "follower node from DA/L1" mode, which reproduces the L2 state solely from L1 events and data loaded from DA (calldata is retrieved directly from the L1 RPC; historical blobs are loaded via a beacon node or blob APIs and verified via versionedHash).
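The versionedHash check mentioned above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function names are hypothetical, and only the derivation rule itself (SHA-256 of the KZG commitment with the first byte replaced by the version byte 0x01, as defined in EIP-4844) is taken as given. A complete check would additionally verify that the blob bytes match the commitment using a KZG library, which is omitted here.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// blobCommitmentVersionKZG is the version byte EIP-4844 assigns to KZG commitments.
const blobCommitmentVersionKZG = 0x01

// kzgToVersionedHash (hypothetical name) derives the versioned hash from a
// 48-byte KZG commitment: SHA-256 of the commitment, first byte set to 0x01.
func kzgToVersionedHash(commitment [48]byte) [32]byte {
	h := sha256.Sum256(commitment[:])
	h[0] = blobCommitmentVersionKZG
	return h
}

// matchesVersionedHash (hypothetical name) reports whether a commitment
// returned by a blob provider corresponds to the versioned hash taken from L1.
func matchesVersionedHash(commitment [48]byte, want [32]byte) bool {
	return kzgToVersionedHash(commitment) == want
}

func main() {
	var commitment [48]byte // placeholder value, not a real commitment
	vh := kzgToVersionedHash(commitment)
	fmt.Printf("versioned hash: %x\n", vh)
	fmt.Println("matches:", matchesVersionedHash(commitment, vh))
}
```

Because the versioned hash comes from L1, a blob fetched from an untrusted third-party API can be rejected whenever its commitment does not hash to the expected value.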

At a high level, it works as follows: the L2 functionality of the node is disabled. Instead, the node connects only to the configured L1 RPC, beacon node or blob APIs and retrieves all rollup events (commit batch, revert, finalize), L1 messages and batch data (i.e. calldata, or blobs since Bernoulli). Once an event is finalized on L1, the resulting state (meaning L2 state and blocks) is derived and verified from this data.

The derivation process is implemented as a pipeline with the following steps:
- DAQueue: uses DataSource to retrieve events and corresponding data (calldata or blob).
- BatchQueue: sorts different DATypes and returns committed, finalized batches in order.
- BlockQueue: converts batches to PartialBlocks that can be used to create the L2 state
- DASyncer: executes PartialBlock and inserts into chain
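The four steps above can be sketched as a toy pipeline. Everything in this block is a hypothetical stand-in (type names, fields, the string-typed event kind) rather than the types used in this PR, and it assumes that a finalize event releases every still-committed batch up to its index. Block execution is reduced to appending block numbers to a slice.

```go
package main

import "fmt"

// daEntry is a hypothetical stand-in for an event plus its data from the DAQueue.
type daEntry struct {
	kind       string   // "commit", "revert" or "finalize"
	batchIndex uint64
	blocks     []uint64 // block numbers contained in a committed batch
}

// partialBlock is a hypothetical stand-in for a PartialBlock.
type partialBlock struct{ number uint64 }

// batchQueue buffers committed batches and releases them once finalized.
type batchQueue struct {
	committed map[uint64]daEntry
	next      uint64 // lowest batch index that has not been released yet
}

func newBatchQueue() *batchQueue {
	return &batchQueue{committed: map[uint64]daEntry{}}
}

// push consumes one entry and returns the batches that became final, in order.
func (q *batchQueue) push(e daEntry) []daEntry {
	var ready []daEntry
	switch e.kind {
	case "commit":
		q.committed[e.batchIndex] = e
	case "revert":
		delete(q.committed, e.batchIndex)
	case "finalize":
		// Assumption: finalizing index N releases all committed batches <= N.
		for ; q.next <= e.batchIndex; q.next++ {
			if b, ok := q.committed[q.next]; ok {
				ready = append(ready, b)
				delete(q.committed, q.next)
			}
		}
	}
	return ready
}

// toPartialBlocks plays the role of the BlockQueue step.
func toPartialBlocks(batch daEntry) []partialBlock {
	blocks := make([]partialBlock, 0, len(batch.blocks))
	for _, n := range batch.blocks {
		blocks = append(blocks, partialBlock{number: n})
	}
	return blocks
}

// runPipeline plays the role of the DASyncer: it "inserts" blocks by
// recording their numbers in the order they would be applied.
func runPipeline(events []daEntry) []uint64 {
	queue := newBatchQueue()
	var chain []uint64
	for _, e := range events {
		for _, batch := range queue.push(e) {
			for _, pb := range toPartialBlocks(batch) {
				chain = append(chain, pb.number)
			}
		}
	}
	return chain
}

func main() {
	events := []daEntry{
		{kind: "commit", batchIndex: 1, blocks: []uint64{1, 2}},
		{kind: "commit", batchIndex: 2, blocks: []uint64{3}},
		{kind: "revert", batchIndex: 2},
		{kind: "commit", batchIndex: 2, blocks: []uint64{3, 4}},
		{kind: "finalize", batchIndex: 2},
	}
	fmt.Println(runPipeline(events)) // prints [1 2 3 4]
}
```

In this sketch the reverted version of batch 2 never reaches the chain, because nothing is applied before the finalize event arrives.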

How to run?

Run l2geth with the --da.sync flag. Configure the blob APIs and the beacon node with:

  • --da.blob.beaconnode "<L1 beacon node>" (recommended, if beacon node supports historical blobs)
  • --da.blob.blobscan "https://api.blobscan.com/blobs/" --da.blob.blocknative "https://api.ethernow.xyz/v1/blob/" for mainnet
  • --da.blob.blobscan "https://api.sepolia.blobscan.com/blobs/" for Sepolia.

Strictly speaking, only one of the blob providers is necessary, but during testing blobscan and blocknative were not fully reliable. Using a beacon node with historical blob data is therefore recommended (it can be configured in addition to blobscan and blocknative). The pipeline rotates through the blob providers and retries if one of them fails.
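The rotate-and-retry behaviour can be pictured roughly as below. This is only a sketch under assumptions: the blobClient interface, the rotatingClients type and the fake provider are invented for illustration and are not the API introduced by this PR. errors.Join requires Go 1.20 or newer.

```go
package main

import (
	"errors"
	"fmt"
)

// blobClient is a hypothetical interface for a single blob provider.
type blobClient interface {
	Name() string
	GetBlob(versionedHash string) ([]byte, error)
}

// rotatingClients (hypothetical) remembers which provider worked last.
type rotatingClients struct {
	clients []blobClient
	cur     int
}

// GetBlob tries each provider at most once, starting with the current one,
// and rotates to the next provider whenever a request fails.
func (r *rotatingClients) GetBlob(versionedHash string) ([]byte, error) {
	if len(r.clients) == 0 {
		return nil, errors.New("no blob providers configured")
	}
	var errs []error
	for i := 0; i < len(r.clients); i++ {
		c := r.clients[r.cur]
		blob, err := c.GetBlob(versionedHash)
		if err == nil {
			return blob, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", c.Name(), err))
		r.cur = (r.cur + 1) % len(r.clients)
	}
	return nil, errors.Join(errs...)
}

// fakeClient simulates a provider that is either healthy or unavailable.
type fakeClient struct {
	name string
	fail bool
}

func (f fakeClient) Name() string { return f.name }

func (f fakeClient) GetBlob(versionedHash string) ([]byte, error) {
	if f.fail {
		return nil, errors.New("unavailable")
	}
	return []byte("blob for " + versionedHash), nil
}

func main() {
	r := &rotatingClients{clients: []blobClient{
		fakeClient{name: "blobscan", fail: true},
		fakeClient{name: "beaconnode"},
	}}
	blob, err := r.GetBlob("0x01ab")
	fmt.Println(string(blob), err, "current provider:", r.clients[r.cur].Name())
}
```

Keeping the index of the last working provider means a flaky API is skipped on subsequent requests until the healthy one fails in turn.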

Mainnet

# build geth without circuit capacity checker (not needed for L1 follower node)
make nccc_geth

./build/bin/geth --scroll \
--datadir "tmp/mainnet-l2geth-datadir" \
--gcmode archive \
--http --http.addr "0.0.0.0" --http.port 8545 --http.api "eth,net,web3,debug,scroll" \
--da.sync=true \
--da.blob.blobscan "https://api.blobscan.com/blobs/" --da.blob.blocknative "https://api.ethernow.xyz/v1/blob/" \
--da.blob.beaconnode "<L1 beacon node>" \
--l1.endpoint "<L1 RPC node>" \
--verbosity 3

A full sync will take about 2 weeks, depending on the speed of the RPC node, the beacon node and the local machine. Progress is reported as follows for every 1000 blocks applied:

INFO [08-01|16:44:42.173] L1 sync progress                         blockhain height=87000 block hash=608eec..880ebd root=218215..9a58a2

Sepolia

# build geth without circuit capacity checker (not needed for L1 follower node)
make nccc_geth

./build/bin/geth --scroll-sepolia \
--datadir "tmp/sepolia-l2geth-datadir" \
--gcmode archive \
--http --http.addr "0.0.0.0" --http.port 8545 --http.api "eth,net,web3,debug,scroll" \
--da.sync=true \
--da.blob.blobscan "https://api.sepolia.blobscan.com/blobs/" \
--da.blob.beaconnode "<L1 beacon node>" \
--l1.endpoint "<L1 RPC node>" \
--verbosity 3

A full sync will take about 2-3 days, depending on the speed of the RPC node, the beacon node and the local machine. Progress is reported as follows for every 1000 blocks applied:

INFO [08-01|16:44:42.173] L1 sync progress                         blockhain height=87000 block hash=608eec..880ebd root=218215..9a58a2

Troubleshooting

Things to keep in mind right after starting:

  • the node (APIs, geth console, etc.) will not be responsive until all the L1 messages have been synced
  • the derivation pipeline is already running during that time, which can be seen through L1 sync progress [...]
  • for Sepolia it might take a little longer (10-20 minutes) for the first L1 sync progress [...] to appear, as the L1 blocks are more sparse at the beginning

You should see something like this shortly after starting:
INFO [09-18|13:41:34.039] Starting L1 message sync service         latestProcessedBlock=20,633,529
WARN [09-18|13:41:34.551] Running initial sync of L1 messages before starting l2geth, this might take a while... 
INFO [09-18|13:41:45.249] Syncing L1 messages                      processed=20,634,929 confirmed=20,777,179 collected=71 progress(%)=99.315
INFO [09-18|13:41:55.300] Syncing L1 messages                      processed=20,637,029 confirmed=20,777,179 collected=145 progress(%)=99.325
INFO [09-18|13:42:05.400] Syncing L1 messages                      processed=20,638,329 confirmed=20,777,179 collected=220 progress(%)=99.332
INFO [09-18|13:42:15.610] Syncing L1 messages                      processed=20,640,129 confirmed=20,777,179 collected=303 progress(%)=99.340
INFO [09-18|13:42:24.324] L1 sync progress                         "blockhain height"=1000 "block hash"=a28c48..769cee root=174edb..9d9fbd
INFO [09-18|13:42:25.555] Syncing L1 messages                      processed=20,641,529 confirmed=20,777,179 collected=402 progress(%)=99.347

Temporary errors
Especially at the beginning, errors like the one below might appear in the console. This is expected: the pipeline relies on the L1 messages, and if they are not synced fast enough, a step fails with a temporary error. The pipeline retries and continues once the L1 messages are available.

WARN [09-18|13:52:25.843] syncing pipeline step failed due to temporary error, retrying err="temporary: failed to process logs to DA, error: failed to get commit batch da: 7, err: failed to get L1 messages for v0 batch 7: EOF: <nil>"

Limitations

The state root of a block can be reproduced with this sync mode, but currently not the block hash. The reason is that the header fields difficulty and extraData are not stored on DA, yet they are used by the Clique consensus on which the Scroll protocol relies. This will be fixed in a future upgrade, for which the main implementation in l2geth is already done: #903 #913.

To verify the locally created state root against mainnet (use https://sepolia-rpc.scroll.io/ instead of https://rpc.scroll.io for Sepolia), we can do the following:

# query local block info
curl localhost:8545 -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_getHeaderByNumber","params":["0x2AF8"],"id":0}' | jq

# query mainnet block info
curl https://rpc.scroll.io -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_getHeaderByNumber","params":["0x2AF8"],"id":0}' | jq

By comparing the headers we can see that, most importantly, stateRoot, receiptsRoot and everything else that has to do with the state match. However, the following fields will be different:

  • difficulty and therefore totalDifficulty
  • extraData
  • size due to differences in header size
  • hash and therefore parentHash
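The comparison can be automated with a small helper that ignores the fields listed above and reports everything else that differs. This is a sketch, not tooling from this PR: the function and variable names are hypothetical, and it assumes both headers have already been decoded from the JSON "result" object into string maps (all header values in the responses are JSON strings).

```go
package main

import (
	"fmt"
	"sort"
)

// expectedToDiffer lists the header fields that cannot match in this sync mode.
var expectedToDiffer = map[string]bool{
	"difficulty":      true,
	"totalDifficulty": true,
	"extraData":       true,
	"size":            true,
	"hash":            true,
	"parentHash":      true,
}

// diffHeaders (hypothetical helper) returns the sorted names of all fields
// that differ between the two headers, skipping the expected differences.
func diffHeaders(local, remote map[string]string) []string {
	var unexpected []string
	for k, lv := range local {
		if expectedToDiffer[k] {
			continue
		}
		if rv, ok := remote[k]; !ok || rv != lv {
			unexpected = append(unexpected, k)
		}
	}
	for k := range remote {
		if _, ok := local[k]; !ok && !expectedToDiffer[k] {
			unexpected = append(unexpected, k)
		}
	}
	sort.Strings(unexpected)
	return unexpected
}

func main() {
	// Excerpt of the local and remote headers for block 0x2af8.
	local := map[string]string{
		"difficulty":   "0xa",
		"hash":         "0xf3cdafbe35d5e7c18d8274bddad9dd12c94b83a81cefeb82ebb73fa799ff9fcc",
		"receiptsRoot": "0xd95b673818fa493deec414e01e610d97ee287c9421c8eff4102b1647c1a184e4",
		"stateRoot":    "0x0f387e78e4a7457a318c7bce7cde0b05c3609347190144a7e105ef05194ae218",
	}
	remote := map[string]string{
		"difficulty":   "0x2",
		"hash":         "0xb7848d5b300247d7c33aeba0f1b33375e1cb3113b950dffc140945e9d3d88d58",
		"receiptsRoot": "0xd95b673818fa493deec414e01e610d97ee287c9421c8eff4102b1647c1a184e4",
		"stateRoot":    "0x0f387e78e4a7457a318c7bce7cde0b05c3609347190144a7e105ef05194ae218",
	}
	fmt.Println("unexpected differences:", diffHeaders(local, remote))
}
```

An empty result means the two headers agree on every state-related field; any name that does show up points to a real divergence.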

Example local output for block 11000:

{
  "jsonrpc": "2.0",
  "id": 0,
  "result": {
    "difficulty": "0xa",
    "extraData": "0x0102030405060708",
    "gasLimit": "0x989680",
    "gasUsed": "0xa410",
    "hash": "0xf3cdafbe35d5e7c18d8274bddad9dd12c94b83a81cefeb82ebb73fa799ff9fcc",
    "logsBloom": "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
    "miner": "0x0000000000000000000000000000000000000000",
    "mixHash": "0x0000000000000000000000000000000000000000000000000000000000000000",
    "nonce": "0x0000000000000000",
    "number": "0x2af8",
    "parentHash": "0xde244f7e8bc54c8809e6c2ce65c439b58e90baf11f6cf9aaf8df33a827bd01ab",
    "receiptsRoot": "0xd95b673818fa493deec414e01e610d97ee287c9421c8eff4102b1647c1a184e4",
    "sha3Uncles": "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
    "size": "0x252",
    "stateRoot": "0x0f387e78e4a7457a318c7bce7cde0b05c3609347190144a7e105ef05194ae218",
    "timestamp": "0x6526db8e",
    "totalDifficulty": "0x1adb1",
    "transactionsRoot": "0x6a81c9342456693d57963883983bba024916f4d277392c9c1dc497e3518a78e3"
  }
}

Example remote output:

{
  "id": 0,
  "jsonrpc": "2.0",
  "result": {
    "difficulty": "0x2",
    "extraData": "0xd883050000846765746888676f312e31392e31856c696e7578000000000000009920319c246ec8ae4d4f73f07d79f68b2890e9c2033966efe5a81aedddae12875c3170f0552f48b7e5d8e92ac828a6008b2ba7c5b9c4a0af1692337bbdc792be01",
    "gasLimit": "0x989680",
    "gasUsed": "0xa410",
    "hash": "0xb7848d5b300247d7c33aeba0f1b33375e1cb3113b950dffc140945e9d3d88d58",
    "logsBloom": "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
    "miner": "0x0000000000000000000000000000000000000000",
    "mixHash": "0x0000000000000000000000000000000000000000000000000000000000000000",
    "nonce": "0x0000000000000000",
    "number": "0x2af8",
    "parentHash": "0xa93e6143ab213a044eb834cdd391a6ef2c818de25b04a3839ee44a75bd28a2c7",
    "receiptsRoot": "0xd95b673818fa493deec414e01e610d97ee287c9421c8eff4102b1647c1a184e4",
    "sha3Uncles": "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
    "size": "0x2ab",
    "stateRoot": "0x0f387e78e4a7457a318c7bce7cde0b05c3609347190144a7e105ef05194ae218",
    "timestamp": "0x6526db8e",
    "totalDifficulty": "0x55f1",
    "transactionsRoot": "0x6a81c9342456693d57963883983bba024916f4d277392c9c1dc497e3518a78e3"
  }
}

2. PR title

Your PR title must follow conventional commits (as we are doing squash merge for each PR), so it must start with one of the following types:

  • build: Changes that affect the build system or external dependencies (example scopes: yarn, eslint, typescript)
  • ci: Changes to our CI configuration files and scripts (example scopes: vercel, github, cypress)
  • docs: Documentation-only changes
  • feat: A new feature
  • fix: A bug fix
  • perf: A code change that improves performance
  • refactor: A code change that doesn't fix a bug, or add a feature, or improves performance
  • style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc)
  • test: Adding missing tests or correcting existing tests

3. Deployment tag versioning

Has the version in params/version.go been updated?

  • This PR doesn't involve a new deployment, git tag, docker image tag, and it doesn't affect traces
  • Yes

4. Breaking change label

Does this PR have the breaking-change label?

  • This PR is not a breaking change
  • Yes

 Conflicts:
	cmd/geth/main.go
	core/state_processor_test.go
	core/txpool/legacypool/legacypool.go
	eth/backend.go
	eth/ethconfig/config.go
	eth/gasprice/gasprice_test.go
	eth/handler.go
	eth/protocols/eth/broadcast.go
	eth/protocols/eth/handlers.go
	go.mod
	go.sum
	miner/miner.go
	miner/miner_test.go
	miner/scroll_worker.go
	miner/scroll_worker_test.go
	params/config.go
	params/version.go
	rollup/rollup_sync_service/rollup_sync_service_test.go

semgrep-app bot commented Aug 29, 2024

Semgrep found 6 ssc-46663897-ab0c-04dc-126b-07fe2ce42fb2 findings:

Risk: Affected versions of golang.org/x/net, golang.org/x/net/http2, and net/http are vulnerable to Uncontrolled Resource Consumption. An attacker may cause an HTTP/2 endpoint to read arbitrary amounts of header data by sending an excessive number of CONTINUATION frames.

Fix: Upgrade this library to at least version 0.23.0 at go-ethereum/go.mod:144.

Reference(s): GHSA-4v7x-pqxf-cx7m, CVE-2023-45288


@0xmountaintop
Member

we can upgrade the da-codec to 41c6486 now

@jonastheis jonastheis mentioned this pull request Oct 14, 2024
13 tasks
…ync-directly-from-da-rebased

Conflicts:
	eth/backend.go
	go.mod
	go.sum
	miner/scroll_worker.go
	rollup/rollup_sync_service/rollup_sync_service.go
@jonastheis jonastheis marked this pull request as ready for review October 14, 2024 09:26
@0xmountaintop
Member

image

@jonastheis jonastheis merged commit 8454491 into syncUpstream/active Oct 16, 2024
3 of 4 checks passed
@jonastheis jonastheis deleted the feat/sync-directly-from-da-rebased branch October 16, 2024 02:11