-
Notifications
You must be signed in to change notification settings - Fork 34
Description
The V2 LedgerDB (introduced as part of UTxO HD in #1412) maintains a LedgerSeq at its core:
Lines 140 to 142 in d582af5
| newtype LedgerSeq m l = LedgerSeq { | |
| getLedgerSeq :: AnchoredSeq (WithOrigin SlotNo) (StateRef m l) (StateRef m l) | |
| } deriving (Generic) |
Lines 119 to 122 in d582af5
| data StateRef m l = StateRef { | |
| state :: !(l EmptyMK) | |
| , tables :: !(LedgerTablesHandle m l) | |
| } deriving (Generic) |
Now, when we prune a LedgerSeq, we want to close the LedgerTablesHandles from the prefix that we are removing. The current code tries to do this, it it is not effective. Namely, prune returns the LedgerSeq prefix containing states to prune:
Lines 230 to 238 in d582af5
| prune (LedgerDbPruneKeeping (SecurityParam k)) (LedgerSeq ldb) = | |
| if toEnum nvol <= unNonZero k | |
| then (LedgerSeq $ Empty (AS.anchor ldb), LedgerSeq ldb) | |
| else | |
| -- We remove the new anchor from the @fst@ component so that its handle is | |
| -- not closed. | |
| B.bimap (LedgerSeq . dropNewest 1) LedgerSeq $ AS.splitAt (nvol - fromEnum (unNonZero k)) ldb | |
| where | |
| nvol = AS.length ldb |
And then in other places, we try to close the handles:
Lines 154 to 155 in d582af5
| let (l, s) = prune (LedgerDbPruneKeeping (foeSecurityParam env)) (LedgerSeq newdb) | |
| pure ((toClose, l), s) |
Line 163 in d582af5
| mapM_ (close . tables) $ AS.toOldestFirst discardedBySelection ++ AS.toOldestFirst discardedByPruning |
Lines 186 to 187 in 2c06471
| closeLedgerSeq :: Monad m => LedgerSeq m l -> m () | |
| closeLedgerSeq = mapM_ (close . tables) . toOldestFirst . getLedgerSeq |
The crucial observation is that toOldestFirst does not return the anchor of the LedgerSeq/AnchoredSeq, so we do not close this. In particular, in the very common case of adopting one block (such that the immutable tip also advances by one block), which is the common case both while syncing and caught-up, we do not close anything.
The fix is simply to also consider the anchor of the sequence of handles to close.
Currently, this isn't problematic as the only V2 backend is the in-memory backend at the moment, and for it, closing handles doesn't matter (as the GC will still claim the ledger state once it is no longer referenced). However, in the upcoming LSM tree backend, this will be crucial.
Concrete "demonstration" of the bug:
Apply this diff on d582af5 (main at time of writing):
Show diff
diff --git a/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2/InMemory.hs b/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2/InMemory.hs
index 770a85862d..c4e2ac3115 100644
--- a/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2/InMemory.hs
+++ b/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2/InMemory.hs
@@ -90,7 +90,8 @@ newInMemoryLedgerTablesHandle ::
newInMemoryLedgerTablesHandle someFS@(SomeHasFS hasFS) l = do
!tv <- newTVarIO (LedgerTablesHandleOpen l)
pure LedgerTablesHandle {
- close =
+ close = do
+ !_ <- error "foo"
atomically $ writeTVar tv LedgerTablesHandleClosed
, duplicate = do
hs <- readTVarIO tv
diff --git a/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/LedgerDB/StateMachine.hs b/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/LedgerDB/StateMachine.hs
index 5015d44106..33f842b615 100644
--- a/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/LedgerDB/StateMachine.hs
+++ b/ouroboros-consensus/test/storage-test/Test/Ouroboros/Storage/LedgerDB/StateMachine.hs
@@ -270,14 +270,14 @@ instance StateModel Model where
arbitraryAction _ model@(Model chain secParam) =
frequency $ [ (2, pure $ Some GetState)
, (2, pure $ Some ForceTakeSnapshot)
- , (1, Some . DropAndRestore <$> QC.choose (0, fromIntegral $ AS.length chain))
+ , (0, Some . DropAndRestore <$> QC.choose (0, fromIntegral $ AS.length chain))
, (4, Some <$> do
let maxRollback =
min
(fromIntegral . AS.length $ chain)
(BT.unNonZero $ maxRollbacks secParam)
- numRollback <- QC.choose (0, maxRollback)
- numNewBlocks <- QC.choose (numRollback, numRollback + 2)
+ numRollback <- pure 0
+ numNewBlocks <- pure 1
let
chain' = case modelRollback numRollback model of
UnInit -> error "Impossible"Then, running cabal run storage-test -- -p InMemV2 succeeds, even though we would expect it to fail if we would actually close the handles.
Concretely, this patch
- modifies the close logic of the
LedgerTablesHandleto always fail with an imprecise exception, and - modifies the LedgerDB to only generate
ValidateAndCommitcommands that add exactly one block, and disablesDropAndRestore.
Then, if V2 were correctly closing pruned LedgerTablesHandles, then we would expect the test to fail; however, it passes just fine. Note that it does fail when keeping the test unmodified, showing that we at least sometimes close LedgerTablesHandles.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status