tnch
medium
Any attacker with remote eth_call
access to an l2geth
node may inject malicious data into the state dump file used as seed for the upcoming Bedrock migration. Sufficiently large or ill-crafted payloads may hinder, or completely halt, the migration process.
In l2geth/core/vm/evm.go
, the Call
function has been modified to log transaction data to a dump file. Actual writes to disk of such logs only occur when the "state dumper" is activated. That is, when the node is run setting a file path in the L2GETH_STATE_DUMP_PATH
environment variable.
For a quick reference, here's a snippet of the Call
function where the state dumper is executed:
func (evm *EVM) Call(caller ContractRef, addr common.Address, input []byte, gas uint64, value *big.Int) (ret []byte, leftOverGas uint64, err error) {
if addr == dump.MessagePasserAddress {
statedumper.WriteMessage(caller.Address(), input)
}
if addr == dump.OvmEthAddress {
// We need at least 4 bytes + 32 bytes for the recipient address, then
// address will be found at bytes 16-36. 0x40c10f19 is the function
// selector for mint(address,uint256).
if len(input) >= 36 && bytes.Equal(input[:4], mintSigHash) {
recipient := common.BytesToAddress(input[16:36])
statedumper.WriteETH(recipient)
}
}
Following the above snippet, there are at least two scenarios where the dumper dumps to disk. Either when the target address of the call matches the address of the Message Passer predeploy (0x4200000000000000000000000000000000000000
), or when it matches the address of the OVM ETH predeploy (0xDeadDeAddeAddEAddeadDEaDDEAdDeaDDeAD0000
). From my understanding, the code's intention is to continuously log transactions that may include L2->L1 withdrawals, or that may change OVM ETH balance. The dumper logs this information in an append-only fashion to a single file. It follows the format MSG|<from>|<data>
for transactions that target the Message Passer predeploy. And the ETH|<from>
format for the ones that target the OVM ETH predeploy.
The resulting file is used as a starting point for the upcoming migration to Bedrock. It's used in multiple places outside l2geth
. For starters, it seems to first be ingested by the command parse-state-dump
defined in migration-data/bin/cli.ts
, parsing it to JSON files. Then the resulting JSON-formatted data is handled across multiple functions within op-chain-ops
. I'll go over the specifics of these later.
The problem is that, given the way Call
has been modified, data is not only written to the dump file during actual L2 state-changing transactions. Because l2geth
also executes Call
during any eth_call
RPC call. This kind of calls are usually free and open in public-facing nodes.
As a result, by running eth_call
s RPC calls against l2geth
nodes that have the state dumper active, anyone can pollute the state dump file injecting fake L2 transaction data.
You can quickly reproduce the behavior in a local dev node:
- Start a local dev node with
docker run --env L2GETH_STATE_DUMP_PATH="/usr/local/gethdump" -it ethereumoptimism/l2geth --dev --rpc
- Inside the container, check the
/usr/local/gethdump
file starts empty (withcat /usr/local/gethdump
). - Inside the container, run
geth attach /tmp/geth.ipc
to spin up a Javascript console. - In the Javascript console, run
web3.eth.call({from:"0x0000000000000000000000000000000000000123", to:"0x4200000000000000000000000000000000000000", data: "0xaabb"})
to insert oneMSG
log in the dump file. - Running
cat /usr/local/gethdump
now shows:
MSG|0x0000000000000000000000000000000000000123|aabb
- In the Javascript console, run
web3.eth.call({from:"0x0000000000000000000000000000000000000123", to:"0xDeadDeAddeAddEAddeadDEaDDEAdDeaDDeAD0000", data: "0x40c10f1900000000000000000000000000000000000000000000000000000000000001230000000000000000000000000000000000000000000000000000000000000000"})
to insert oneETH
log in the dump file. In this case, thedata
field must be an ABI-encoded payload formint(address,uint256)
. - Running
cat /usr/local/gethdump
now shows
MSG|0x0000000000000000000000000000000000000123|aabb
ETH|0x0000000000000000000000000000000000000123
Thus showing that anyone can write arbitrary data on the state dump file of l2geth
nodes.
An eth_call
in l2geth
does not appear to restrict the payload size of the data
field. Opposite to what the node would do with real L2 transactions. Hence, it's not only possible to inject fake data in the dump file, but actually large fake data.
For example, a call like web3.eth.call({from:"0x0000000000000000000000000000000000000000", to:"0x4200000000000000000000000000000000000000", data: "0x" + Array(1000000).join("00")})
would append a single MSG
record with +1 million bytes to the dump file. In principle there's no limit implemented on l2geth
on the amount of eth_calls
that anyone can execute. Even with such large (or even larger) payloads. Should the payload's size be limited, an attacker could still make a huge amount of eth_calls
and log arbitrary data as long as it fits within the limit.
As a first consequence, the attack could quickly increase the dump file's size with garbage data. Thus making it more difficult and slower to parse by the migration scripts that rely on its availability. On top of this, sending these IO-heavy eth_calls
will likely highly stress the node's resources due to ingests of big pieces of data that are sequentially written to disk in the dump file. I haven't tested for an actual DoS of production L2 nodes using this attack vector. At least in my resource-constrained local environment, I did experience longer delays in the responses the larger the payloads in eth_call
s.
Moreover, regardless of size, the attacker is in control of the file's contents. By crafting specific payloads in the data, there are multiple steps in which an attacker can impede a successful migration. I'll now point to these specific steps, avoiding to explain in detail the whole migration flow and its function calls. For brevity, and because I trust you're already more than familiar with it.
First, the migration-data/bin/cli.ts
file attempts to parse the third colum of MSG
records using the decodeFunctionData
of ethers
. Yet decoding failures are not handled. Therefore, any injected MSG
record that does not follow the ABI encoding of passMessageToL1(bytes)
will throw and halt this script.
Second, let's assume the migration-data/bin/cli.ts
was able to run correctly. Then all withdrawal records in the migration data must contain ABI-encoded bytes for passMessageToL1(bytes)
following this structure:
0xcafa81dc
0000000000000000000000000000000000000000000000000000000000000020 | offset
0000000000000000000000000000000000000000000000000000000000000001 | length
0100000000000000000000000000000000000000000000000000000000000000 | bytes
In theory the last bytes should correspond to a serialized legacy withdrawal. That's why the ToLegacyWithdrawal
function defined in op-chain-ops/genesis/migration/types.go
expects it to be correctly ABI-encoded. As seen in the checks it performs when decoding the serialized bytes into a LegacyWithdrawal
.
But remember that the attacker has complete control over the original MSG
records and its corresponding serialized data. Therefore, at this point of the process, an attack can make the migration fail in a number of ways: (i) injecting payloads that don't have at least 4 bytes, (ii) injecting payloads whose first four bytes do not match 0xcbd4ece9
(the ID of relayMessage(address,address,bytes,uint256)
), (iii) injecting payloads whose corresponding from
field logged in the dump file does not match 0x4200000000000000000000000000000000000007
(the address of the L2CrossDomainMessageSender
predeploy), or (iv) injecting payloads that cannot be decoded into the expected address,address,bytes,uint256
parameter types.
As far as I understand an attacker can only attempt to interfere with the migration process, either by slowing it down or halting it. I haven't been able to extend to impact to crediting more ETH balance than what is due, or store repeated / illegitimate withdrawals in the genesis state of Bedrock.
I cannot tell how many l2geth nodes are exposed to the Internet with the L2GETH_STATE_DUMP_PATH
environment flag activated. The flag is not active by default, which is a good sign. Still I assume that some nodes must be running with it, because it's needed for the upcoming migration. Cannot confirm whether they're freely reachable or not. At first sight it shouldn't be straightforward for an attacker to say which nodes are running with the flag turned on. Although I wonder whether they could be fingerprinted by the amount of time they take to process eth_call
s.
Also, as far as I've researched, the vulnerability is not present in op-geth
. I understand op-geth
is in scope, while l2geth
is not. Still seemed appropriate to disclose here, since this issue could potentially affect the migration process.
Manual review + local dev net
Modify l2geth
so that it only writes to the dump file when an actual L2 transaction is executed, and not during eth_call
s. If it's too late to be modifying l2geth
, and there are node providers with nodes exposed to the internet with this flag active, I'd recommend them to strongly rate-limit on the exposed infra any RPC calls that look like an eth_call
.