`Unexpected trie node` error occurs after initial snap sync

#### System information

Geth version: `1.13.5` (from `geth version`)


#### Issue description

Ref: original ticket https://github.com/ethereum/go-ethereum/issues/27983#issuecomment-1818965551

```
Nov 22 12:33:19 ip-10-0-0-11.ec2.internal geth[30414]: INFO [11-22|12:33:19.850] Initialized transaction indexer          limit=2,350,000
Nov 22 12:33:19 ip-10-0-0-11.ec2.internal geth[30414]: INFO [11-22|12:33:19.850] Loaded local transaction journal         transactions=0 dropped=0
Nov 22 12:33:19 ip-10-0-0-11.ec2.internal geth[30414]: INFO [11-22|12:33:19.851] Regenerated local transaction journal    transactions=0 accounts=0
Nov 22 12:33:20 ip-10-0-0-11.ec2.internal geth[30414]: WARN [11-22|12:33:20.370] Switch sync mode from snap sync to full sync reason="snap sync complete"
Nov 22 12:33:20 ip-10-0-0-11.ec2.internal geth[30414]: INFO [11-22|12:33:20.370] Chain post-merge, sync via beacon client
Nov 22 12:33:20 ip-10-0-0-11.ec2.internal geth[30414]: INFO [11-22|12:33:20.370] Gasprice oracle is ignoring threshold set threshold=2
Nov 22 12:33:20 ip-10-0-0-11.ec2.internal geth[30414]: ERROR[11-22|12:33:20.389] Unexpected trie node in disk             owner=5cc0a4..667982 path="[12 5 9 3 7]" expect=8b09b1..e87152 got=99f9a0..b9f78f
Nov 22 12:33:20 ip-10-0-0-11.ec2.internal geth[30414]: ERROR[11-22|12:33:20.389] State snapshotter failed to iterate trie err="missing trie node 8b09b17b3a4e17de5274c52cc6387cf42c1fb25fd97effda757bb9a2cde87152 (owner 5cc0a47442e6bc69eb1ec9e2ff1fe0c9657c26dfa5836f560fd7141038667982) (path 0c05090307) unexpected node, loc: disk, node: (5cc0a47442e6bc69eb1ec9e2ff1fe0c9657c26dfa5836f560fd7141038667982 [12 5 9 3 7]), 8b09b17b3a4e17de5274c52cc6387cf42c1fb25fd97effda757bb9a2cde87152!=99f9a0c9f954cd0d8cf5bb7df9c2b5e529a1652fcc97824ee446ba9300b9f78f, blob: 0xf87180a0df5465feffb831b1f31a6184b1efdf75f10f13b2b4900956c22f41a6108c45c9808080808080a0b1902b4fca66415f63634e3ddeae1bfa7b877a1db5ed4c029730e166ba2031ae808080a02ded9e78076e79e96fcd5562c7951f678d22a167429cc75c17d30a08705bb6e780808080"
```

The log reports the following node as invalid:
- owner: `5cc0a47442e6bc69eb1ec9e2ff1fe0c9657c26dfa5836f560fd7141038667982`
- address: `0x32400084C286CF3E17e7B677ea9583e60a000324`
- path: `[12 5 9 3 7]`
- content: `0xf87180a0df5465feffb831b1f31a6184b1efdf75f10f13b2b4900956c22f41a6108c45c9808080808080a0b1902b4fca66415f63634e3ddeae1bfa7b877a1db5ed4c029730e166ba2031ae808080a02ded9e78076e79e96fcd5562c7951f678d22a167429cc75c17d30a08705bb6e780808080`
- expected hash: `8b09b17b3a4e17de5274c52cc6387cf42c1fb25fd97effda757bb9a2cde87152`
- got hash: `99f9a0c9f954cd0d8cf5bb7df9c2b5e529a1652fcc97824ee446ba9300b9f78f`
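The log prints the same path in two forms: `[12 5 9 3 7]` in the first error and `0c05090307` in the second. Both are the nibble path with one nibble per byte. The sketch below converts between the two and also builds the compact prefix `c5937`, which, assuming the storage trie is keyed by hashed slot keys, is the prefix shared by every hashed key under this node. The helper names are hypothetical and not part of geth.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// pathLogHex renders a nibble path the way the "missing trie node" error
// prints it: every nibble occupies a full byte.
func pathLogHex(nibbles []byte) string {
	return hex.EncodeToString(nibbles)
}

// pathKeyPrefix packs the nibbles into a hex prefix, one hex digit per nibble.
func pathKeyPrefix(nibbles []byte) (string, error) {
	const digits = "0123456789abcdef"
	out := make([]byte, len(nibbles))
	for i, n := range nibbles {
		if n > 15 {
			return "", fmt.Errorf("value %d at position %d is not a nibble", n, i)
		}
		out[i] = digits[n]
	}
	return string(out), nil
}

func main() {
	path := []byte{12, 5, 9, 3, 7} // path from the first error line
	prefix, err := pathKeyPrefix(path)
	if err != nil {
		panic(err)
	}
	fmt.Println("log form:  ", pathLogHex(path))
	fmt.Println("key prefix:", prefix)
}
```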

After retrieving the correct node from our benchmark machine, I dumped both nodes with `rlpdump`:

**correct node**

```
(base) ➜  ~ rlpdump -hex 0xf8518080808080808080a0b1902b4fca66415f63634e3ddeae1bfa7b877a1db5ed4c029730e166ba2031ae808080a02ded9e78076e79e96fcd5562c7951f678d22a167429cc75c17d30a08705bb6e780808080
[
  "",
  "",
  "",
  "",
  "",
  "",
  "",
  "",
  b1902b4fca66415f63634e3ddeae1bfa7b877a1db5ed4c029730e166ba2031ae,
  "",
  "",
  "",
  2ded9e78076e79e96fcd5562c7951f678d22a167429cc75c17d30a08705bb6e7,
  "",
  "",
  "",
  "",
]
```

**corrupted node**

```
(base) ➜  ~ rlpdump -hex 0xf87180a0df5465feffb831b1f31a6184b1efdf75f10f13b2b4900956c22f41a6108c45c9808080808080a0b1902b4fca66415f63634e3ddeae1bfa7b877a1db5ed4c029730e166ba2031ae808080a02ded9e78076e79e96fcd5562c7951f678d22a167429cc75c17d30a08705bb6e780808080
[
  "",
  df5465feffb831b1f31a6184b1efdf75f10f13b2b4900956c22f41a6108c45c9,
  "",
  "",
  "",
  "",
  "",
  "",
  b1902b4fca66415f63634e3ddeae1bfa7b877a1db5ed4c029730e166ba2031ae,
  "",
  "",
  "",
  2ded9e78076e79e96fcd5562c7951f678d22a167429cc75c17d30a08705bb6e7,
  "",
  "",
  "",
  "",
]
```

The corrupted node has one extra child, at index 1.

---

I also dumped the parent nodes of this one. They are all full nodes, with no shortNode anywhere along the path, so this is unrelated to the shortNode trick.

The storage trie is huge, with 1.8 million slots.

--- 

I analyzed the contract. Two functions can mutate its state:

- `finalizeEthWithdrawal (0x6c0960f9)`: example from [etherscan](https://etherscan.io/tx/0x48c7d8df67c9f69ca1d92cfdcb779728683cd1a7ac4c54a8308f922212a94c20#statechange)
- `requestL2Transaction (0xeb672419)`: example from [etherscan](https://etherscan.io/tx/0x58a8e1c66852219e509fd34b13b1973e1d5bb1a42f83a0b102b9a1b6a67d2dc3#statechange) 

Both of them only create new storage slots; neither ever deletes one.

--- 

There are a few possible explanations for this situation:
- The state sync target was on a fork: the transaction that creates the trie node at index 1 was reorged out and never accepted.
   I don't think that is the case here. Geth uses `head-64` as the sync target, which is very unlikely to be reorged on a proof-of-stake network.

- A programmatic problem?


--- 


The log is attached. 

> Here it is -- I had a few mis-starts in the log but each time I purged the DB

[sync-log.zip](https://github.com/ethereum/go-ethereum/files/13446792/sync-log.zip)




