Corrupted mdbx after initial sync crashed #11163

Closed
@kovalishinilya

Description

System information

Erigon version: 2.60.4-72ab70be

OS & Version: Windows/Linux/OSX

Commit hash: 72ab70b

Erigon Command (with flags/config):

erigon --chain mainnet --datadir <datadir> --internalcl 

Consensus Layer: caplin

Consensus Layer Command (with flags/config):

Chain/Network: mainnet

Expected behaviour

start as usual

Actual behaviour

Erigon fails to start with this error after an unexpected crash during the last bits of the initial sync:

mdbx_setup_dxb:16114 opening after an unclean shutdown, but boot-id(89800aff74330ce3-00574f21b3d1498f) is MATCH: rollback NOT needed, steady-sync NEEDED
meta_checktxnid:11400 catch invalid root_page_txnid 0 for freedb.mod_txnid 18152 (workaround for incoherent flaw of unified page/buffer cache)
meta_checktxnid:11415 catch invalid root_page_txnid 18136 for maindb.mod_txnid 18152 (workaround for incoherent flaw of unified page/buffer cache)
meta_waittxnid:11454 bailout waiting for valid snapshot (workaround for incoherent flaw of unified page/buffer cache)
EROR[07-15|16:12:00.308] Erigon startup                           err="mdbx_txn_begin: MDBX_CORRUPTED: Maybe free space is over on disk. Otherwise it's hardware failure. Before creating issue please use tools like https://www.memtest86.com to test RAM and tools like https://www.smartmontools.org to test Disk. To handle hardware risks: use ECC RAM, use RAID of disks, run multiple application instances (or do backups). If hardware checks passed - check FS settings - 'fsync' and 'flock' must be enabled.  Otherwise - please create issue in Application repo. On default DURABLE mode, power outage can't cause this error. On other modes - power outage may break last transaction and mdbx_chk can recover db in this case, see '-t' and '-0|1|2' options., label: txpool, trace: [kv_mdbx.go:363 all_components.go:123 backend.go:630 node.go:124 main.go:66 make_app.go:54 command.go:276 app.go:333 app.go:307 main.go:34 proc.go:271 asm_amd64.s:1695]"
mdbx_txn_begin: MDBX_CORRUPTED: Maybe free space is over on disk. Otherwise it's hardware failure. Before creating issue please use tools like https://www.memtest86.com to test RAM and tools like https://www.smartmontools.org to test Disk. To handle hardware risks: use ECC RAM, use RAID of disks, run multiple application instances (or do backups). If hardware checks passed - check FS settings - 'fsync' and 'flock' must be enabled.  Otherwise - please create issue in Application repo. On default DURABLE mode, power outage can't cause this error. On other modes - power outage may break last transaction and mdbx_chk can recover db in this case, see '-t' and '-0|1|2' options., label: txpool, trace: [kv_mdbx.go:363 all_components.go:123 backend.go:630 node.go:124 main.go:66 make_app.go:54 command.go:276 app.go:333 app.go:307 main.go:34 proc.go:271 asm_amd64.s:1695]
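
For triage, the two `meta_checktxnid` lines above can be compared mechanically. This is a minimal Python sketch that assumes the log format shown in this report. The regex and the helper name `txnid_lags` are illustrative and are not part of Erigon or libmdbx.

```python
import re

# Matches lines such as:
#   meta_checktxnid:11400 catch invalid root_page_txnid 0 for freedb.mod_txnid 18152 (...)
# The pattern is an assumption based only on the two lines quoted in this report.
PATTERN = re.compile(
    r"meta_checktxnid:\d+ catch invalid root_page_txnid (\d+) "
    r"for (\w+)\.mod_txnid (\d+)"
)


def txnid_lags(log_text):
    """Return {table: (root_page_txnid, mod_txnid, difference)} per reported mismatch."""
    result = {}
    for match in PATTERN.finditer(log_text):
        root = int(match.group(1))
        table = match.group(2)
        mod = int(match.group(3))
        result[table] = (root, mod, mod - root)
    return result


# The two relevant lines, copied from the log above.
LOG = """\
meta_checktxnid:11400 catch invalid root_page_txnid 0 for freedb.mod_txnid 18152 (workaround for incoherent flaw of unified page/buffer cache)
meta_checktxnid:11415 catch invalid root_page_txnid 18136 for maindb.mod_txnid 18152 (workaround for incoherent flaw of unified page/buffer cache)
"""

if __name__ == "__main__":
    for table, (root, mod, diff) in txnid_lags(LOG).items():
        print(f"{table}: root_page_txnid={root} mod_txnid={mod} difference={diff}")
```

On the lines quoted here, the `maindb` root page carries txnid 18136 while the meta page records 18152, a difference of 16. For `freedb` the root page txnid is 0.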

The SSD is fully functional and has around 500 GB of free space.

The crash logs are incomplete, but this is all I have for now. I'll try to provide more info :(
crash.log

Steps to reproduce the behaviour

Restart Erigon (the error reproduces with and without the consensus layer and the db size limit flag).

Backtrace
