-
Notifications
You must be signed in to change notification settings - Fork 927
Description
Description
In our quest against database corruption, I think we should start calling fsync
in more places. The problem is that this could cause serious performance degradation if we do it too liberally.
This PR for tree-states (#5144) adds some fsyncs to prevent issues during backfill. I suspect that fsync will be most useful when coordinating changes across multiple databases e.g. hot & freezer, or hot & blobs. On stable
we currently only have one synchronisation point, which is during the hot -> cold database migration:
lighthouse/beacon_node/store/src/hot_cold_store.rs
Lines 2487 to 2498 in c7e5dd1
// Warning: Critical section. We have to take care not to put any of the two databases in an | |
// inconsistent state if the OS process dies at any point during the freezing | |
// procedure. | |
// | |
// Since it is pretty much impossible to be atomic across more than one database, we trade | |
// losing track of states to delete, for consistency. In other words: We should be safe to die | |
// at any point below but it may happen that some states won't be deleted from the hot database | |
// and will remain there forever. Since dying in these particular few lines should be an | |
// exceedingly rare event, this should be an acceptable tradeoff. | |
// Flush to disk all the states that have just been migrated to the cold store. | |
store.cold_db.sync()?; |
This synchronisation was added by @adaszko, and I reckon it has probably saved us from many more instances of DB corruption.
Steps to resolve
- Add fsyncs during all database transactions that involve >1 database. Likely candidates for a re-work are
do_atomically_with_block_and_blobs_cache
and backfill (the tree-states PR). - Measure the impact of these
fsync
additions on performance. For backfill the metric is #blocks/sec speed, although this is a little hard to control when network factors also play a role (we could import Era files to remove this variability). - (Optional) Consider adding more
fsync
s after database operations involving a single database. The "early attester cache" in block processing should mean that it's OK for us to spend a little more time on the database write (it is off the hot path).