Description
During Fast sync tests on Goerli, I hit the same Java exception 3 times, in 3 different executions. The exception stops the world state download, while block import keeps running. It could be related to RocksDB's optimistic transaction mode, in which a commit can fail. In my opinion, we should catch the exception on commit and retry.
org.hyperledger.besu.plugin.services.exception.StorageException: org.rocksdb.RocksDBException: Busy
    at org.hyperledger.besu.plugin.services.storage.rocksdb.segmented.RocksDBColumnarKeyValueStorage$RocksDbTransaction.commit(RocksDBColumnarKeyValueStorage.java:278)
    at org.hyperledger.besu.services.kvstore.SegmentedKeyValueStorageTransactionTransitionValidatorDecorator.commit(SegmentedKeyValueStorageTransactionTransitionValidatorDecorator.java:49)
    at org.hyperledger.besu.services.kvstore.SegmentedKeyValueStorageAdapter$1.commit(SegmentedKeyValueStorageAdapter.java:90)
    at org.hyperledger.besu.ethereum.storage.keyvalue.WorldStateKeyValueStorage$Updater.commit(WorldStateKeyValueStorage.java:206)
    at org.hyperledger.besu.ethereum.eth.sync.fastsync.worldstate.PersistDataStep.persist(PersistDataStep.java:53)
    at org.hyperledger.besu.ethereum.eth.sync.fastsync.worldstate.FastWorldStateDownloadProcess$Builder.lambda$build$3(FastWorldStateDownloadProcess.java:202)
    at org.hyperledger.besu.services.pipeline.MapProcessor.processNextInput(MapProcessor.java:31)
    at org.hyperledger.besu.services.pipeline.ProcessingStage.run(ProcessingStage.java:38)
    at org.hyperledger.besu.services.pipeline.Pipeline.lambda$runWithErrorHandling$3(Pipeline.java:152)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.rocksdb.RocksDBException: Busy
    at org.rocksdb.Transaction.commit(Native Method)
    at org.rocksdb.Transaction.commit(Transaction.java:206)
    at org.hyperledger.besu.plugin.services.storage.rocksdb.segmented.RocksDBColumnarKeyValueStorage$RocksDbTransaction.commit(RocksDBColumnarKeyValueStorage.java:272)
    ... 13 more
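To make the "catch on commit and retry" idea concrete, here is a minimal sketch. It is not Besu code: `CommitRetry`, `BusyException`, `CommitAttempt`, and `commitWithRetry` are hypothetical names used only for illustration, and `BusyException` stands in for a `StorageException` whose cause is a RocksDB "Busy" status. The sketch assumes that each attempt rebuilds the transaction and re-stages its writes, since re-committing an optimistic transaction that already failed its conflict check would likely fail again.

```java
final class CommitRetry {

  /** Hypothetical stand-in for a StorageException caused by a RocksDB "Busy" status. */
  static class BusyException extends RuntimeException {
    BusyException(String message) {
      super(message);
    }
  }

  /**
   * Hypothetical unit of work. Each call is assumed to open a fresh transaction,
   * stage the writes again, and commit, so a failed attempt leaves nothing behind.
   */
  @FunctionalInterface
  interface CommitAttempt {
    void run();
  }

  /**
   * Runs the attempt until it succeeds or maxAttempts is reached.
   * Only BusyException is retried; any other exception propagates immediately.
   *
   * @return the number of attempts that were needed
   */
  static int commitWithRetry(CommitAttempt attempt, int maxAttempts, long baseBackoffMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    for (int i = 1; ; i++) {
      try {
        attempt.run();
        return i;
      } catch (BusyException e) {
        if (i >= maxAttempts) {
          throw e; // give up and let the caller decide what to do
        }
        try {
          // Linear backoff so the conflicting writer has time to finish.
          Thread.sleep(baseBackoffMillis * i);
        } catch (InterruptedException interrupted) {
          Thread.currentThread().interrupt(); // preserve the interrupt flag
          throw e;
        }
      }
    }
  }
}
```

In the real storage layer the retry condition would presumably be the status code of the underlying `RocksDBException` rather than a dedicated exception type, and the attempt limit and backoff values would need tuning; both are assumptions here.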
Acceptance Criteria
- Fast sync finishes downloading World State after facing the above exception.
Steps to Reproduce (Bug)
This exception is difficult to reproduce because it does not occur on every Fast sync run. In my case, I started Fast sync on Goerli several times and got the exception two times.
Expected behavior:
Recover from the exception and continue downloading World State.
Actual behavior:
The world state download stops, while block import keeps running.
Frequency:
Sometimes
Versions (Add all that apply)
- Software version: 22.1.1
- Java version: OpenJDK 64-Bit Server VM Corretto-11.0.14.10.1 (build 11.0.14.1+10-LTS, mixed mode)
- OS Name & Version: NAME="Amazon Linux", VERSION="2", ID="amzn", ID_LIKE="centos rhel fedora", VERSION_ID="2", PRETTY_NAME="Amazon Linux 2", ANSI_COLOR="0;33", Amazon Linux release 2 (Karoo)
- Kernel Version: Linux ip-10-0-2-103.us-east-2.compute.internal 4.14.232-177.418.amzn2.x86_64 SMP Tue Jun 15 20:57:50 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
- Docker Version: Client/Server : 20.10.7
- Cloud VM, type, size: Amazon Web Services i3.xlarge
Additional Information (Add any of the following or anything else that may be relevant)
- Besu setup info : network=goerli
- System info - memory, CPU : 4 vCPU, 30.5 GB RAM, 950 GB NVMe SSD, up to 10 Gbps Network