5 changes: 5 additions & 0 deletions .github/workflows/ci.yml
@@ -76,6 +76,11 @@ jobs:
cabal clean
cabal update

- name: Install lmdb
run: |
sudo apt update
sudo apt install liblmdb-dev

# We create a `dependencies.txt` file that can be used to index the cabal
# store cache.
#
2 changes: 2 additions & 0 deletions .gitignore
@@ -5,6 +5,8 @@
/docs/website/build/
/ouroboros-consensus/docs/haddocks/

haddocks/

# GHC
.ghcid
.ghc.environment.*
35 changes: 35 additions & 0 deletions CONTRIBUTING.md
@@ -132,6 +132,41 @@ cabal test ouroboros-consensus:test:consensus-test --test-show-details=direct
Note that the second one cannot be used when we want to provide CLI arguments
to the test-suite.

# Generating documentation and setting up hoogle

The documentation contains some [tikz](https://tikz.net) figures that require
some preprocessing for them to be displayed. To do this, use the documentation
script:

```bash
./scripts/docs/haddocks.sh
```

If [`cabal-docspec`](https://github.com/phadej/cabal-extras/tree/master/cabal-docspec)
is not already in your `PATH` (it is, e.g., when in a Nix shell), the script
will install it from a binary distribution and then build the haddocks for the
project.

Oftentimes it is useful to have a
[`hoogle`](https://github.com/ndmitchell/hoogle) server at hand, indexing the
local packages and their dependencies. Our suggestion is to install
[`cabal-hoogle`](https://github.com/kokobd/cabal-hoogle) from GitHub:

```bash
git clone git@github.com:kokobd/cabal-hoogle
cd cabal-hoogle
cabal install exe:cabal-hoogle
```

and then run `cabal-hoogle`:

```bash
cabal-hoogle generate
cabal-hoogle run -- server --local
```

This will start a `hoogle` server at http://localhost:8080/ with the local
packages and their dependencies.

# Contributing to the code

The following sections contain some guidelines that should be followed when
4 changes: 2 additions & 2 deletions cabal.project
@@ -14,9 +14,9 @@ repository cardano-haskell-packages
-- update either of these.
index-state:
-- Bump this if you need newer packages from Hackage
, hackage.haskell.org 2025-04-06T22:39:33Z
, hackage.haskell.org 2025-04-08T10:52:25Z
-- Bump this if you need newer packages from CHaP
, cardano-haskell-packages 2025-04-07T00:07:03Z
, cardano-haskell-packages 2025-04-08T11:09:22Z

packages:
ouroboros-consensus
92 changes: 1 addition & 91 deletions docs/tech-reports/report/chapters/storage/ledgerdb.tex
@@ -1,98 +1,8 @@
\chapter{Ledger Database}
\label{ledgerdb}

The Ledger DB is responsible for the following tasks:

\begin{enumerate}
\item \textbf{Maintaining the ledger state at the tip}: Maintaining the ledger
state corresponding to the current tip in memory. When we try to extend our
chain with a new block fitting onto our tip, the block must first be validated
using the right ledger state, i.e., the ledger state corresponding to the tip.
The current ledger state is needed for various other purposes.

\item \textbf{Maintaining the past $k$ ledger states}: As discussed in
\cref{consensus:overview:k}, we might roll back up to $k$ blocks when
switching to a more preferable fork. Consider the example below:
%
\begin{center}
\begin{tikzpicture}
\draw (0, 0) -- (50pt, 0) coordinate (I);
\draw (I) -- ++(20pt, 20pt) coordinate (C1) -- ++(20pt, 0) coordinate (C2);
\draw (I) -- ++(20pt, -20pt) coordinate (F1) -- ++(20pt, 0) coordinate (F2) -- ++(20pt, 0) coordinate (F3);
\node at (I) {$\bullet$};
\node at (C1) {$\bullet$};
\node at (C2) {$\bullet$};
\node at (F1) {$\bullet$};
\node at (F2) {$\bullet$};
\node at (F3) {$\bullet$};
\node at (I) [above left] {$I$};
\node at (C1) [above] {$C_1$};
\node at (C2) [above] {$C_2$};
\node at (F1) [below] {$F_1$};
\node at (F2) [below] {$F_2$};
\node at (F3) [below] {$F_3$};
\draw (60pt, 50pt) node {$\overbrace{\hspace{60pt}}$};
\draw (60pt, 60pt) node[fill=white] {$k$};
\draw [dashed] (30pt, -40pt) -- (30pt, 45pt);
\end{tikzpicture}
\end{center}
%
Our current chain's tip is $C_2$, but the fork containing blocks $F_1$, $F_2$,
and $F_3$ is more preferable. We roll back our chain to the intersection point
of the two chains, $I$, which must be no more than $k$ blocks back from our
current tip. Next, we must validate block $F_1$ using the ledger state at
block $I$, after which we can validate $F_2$ using the resulting ledger state,
and so on.

This means that we need access to all ledger states of the past $k$ blocks,
i.e., the ledger states corresponding to the volatile part of the current
chain.\footnote{Applying a block to a ledger state is not an invertible
operation, so it is not possible to simply ``unapply'' $C_1$ and $C_2$ to
obtain $I$.}

Access to the last $k$ ledger states is not only needed for validating candidate
chains, but also by the:
\begin{itemize}
\item \textbf{Local state query server}: To query any of the past $k$ ledger
states (\cref{servers:lsq}).
\item \textbf{Chain sync client}: To validate headers of a chain that
intersects with any of the past $k$ blocks
(\cref{chainsyncclient:validation}).
\end{itemize}

\item \textbf{Storing on disk}: To obtain a ledger state for the current tip of
the chain, one has to apply \emph{all blocks in the chain} one-by-one to the
initial ledger state. When starting up the system with an on-disk chain
containing millions of blocks, all of them would have to be read from disk and
applied. This process can take tens of minutes, depending on the storage and
CPU speed, and is thus too costly to perform on each startup.

For this reason, a recent snapshot of the ledger state should be periodically
written to disk. Upon the next startup, that snapshot can be read and used to
restore the current ledger state, as well as the past $k$ ledger states.
\end{enumerate}
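The "maintain the past $k$ ledger states" requirement above can be sketched
with a toy Haskell structure. This is illustrative only: the names and the
toy value of `k` are hypothetical, and the real \lstinline!LedgerDB! described
below is considerably more elaborate.

```haskell
-- | Ledger states, newest first; we hold at most k+1 states so that we
-- can always roll back up to k blocks (a toy stand-in, not the real
-- LedgerDB.InMemory interface).
newtype LedgerDB l = LedgerDB [l]

k :: Int
k = 3  -- toy value; the real security parameter is much larger

-- | Record the state resulting from applying one more block,
-- discarding states that fall out of the last k+1.
push :: l -> LedgerDB l -> LedgerDB l
push st (LedgerDB sts) = LedgerDB (take (k + 1) (st : sts))

-- | Roll back n blocks; fails if n exceeds the states we kept, since
-- applying a block is not invertible and older states are gone.
rollback :: Int -> LedgerDB l -> Maybe (LedgerDB l)
rollback n (LedgerDB sts)
  | n < length sts = Just (LedgerDB (drop n sts))
  | otherwise      = Nothing

-- | The ledger state at the current tip.
current :: LedgerDB l -> l
current (LedgerDB (st : _)) = st
current (LedgerDB [])       = error "empty LedgerDB"
```

In the fork example above, switching to the fork amounts to `rollback 2`
(undoing $C_1$ and $C_2$) followed by `push`ing the states obtained by
validating $F_1$, $F_2$, and $F_3$.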

Note that whenever we say ``ledger state'', we mean the
\lstinline!ExtLedgerState blk! type described in \cref{storage:extledgerstate}.

The above duties are divided across the following modules:

\begin{itemize}
\item \lstinline!LedgerDB.InMemory!: this module defines a pure data structure,
named \lstinline!LedgerDB!, to represent the last $k$ ledger states in memory.
Operations to validate and append blocks, to switch to forks, to look up
ledger states, \ldots{} are provided.
\item \lstinline!LedgerDB.OnDisk!: this module contains the functionality to
write a snapshot of the \lstinline!LedgerDB! to disk and how to restore a
\lstinline!LedgerDB! from a snapshot.
\item \lstinline!LedgerDB.DiskPolicy!: this module contains the policy that
determines when a snapshot of the \lstinline!LedgerDB! is written to disk.
\item \lstinline!ChainDB.Impl.LgrDB!: this module is part of the Chain DB, and
is responsible for maintaining the pure \lstinline!LedgerDB! in a
\lstinline!StrictTVar!.
\end{itemize}

We will now discuss the modules listed above.
THIS PART WAS PORTED TO THE HADDOCKS

\section{In-memory representation}
\label{ledgerdb:in-memory}
28 changes: 28 additions & 0 deletions docs/website/contents/for-developers/utxo-hd/Overview.md
@@ -0,0 +1,28 @@
# High level overview of UTxO-HD

UTxO-HD is an internal rework of the Consensus layer that features a hybrid
database for Ledger State data: UTxOs are stored in a separate database that
can be backed either by an on-disk store or by an in-memory implementation.

Each of these backends has specific behaviors and implications, so we will
refer to them individually as `InMemory` and `OnDisk`.

End-users of the `InMemory` backend (the default one) should not notice any
major difference in behavior or performance with respect to a pre-UTxO-HD
node.

End-users of the `OnDisk` backend will observe a regression in performance.
For now, the `OnDisk` backend is implemented via LMDB and is not optimal in
terms of performance, but we plan to make use of the LSM-trees library that
Well-Typed is developing to achieve much better performance. In particular,
operations that need UTxOs (applying blocks or transactions) incur the
overhead of a trip to the disk storage plus some computation to bring the disk
values up to date with the tip of the chain.
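The "trip to the disk storage plus some calculations" can be sketched as
follows. All names here are hypothetical stand-ins, not the actual UTxO-HD
API: conceptually, reading a UTxO under the `OnDisk` backend means fetching
the value from the backing store and then applying the in-memory diffs
accumulated since the last flush.

```haskell
import qualified Data.Map.Strict as Map

-- | A change to a UTxO entry held in memory but not yet flushed to disk.
data Diff v = Insert v | Delete

-- | Stand-in for the on-disk (e.g. LMDB) backing store.
type BackingStore k v = Map.Map k v

-- | In-memory diffs since the last flush, oldest first.
type DiffSeq k v = [Map.Map k (Diff v)]

-- | Read a value as of the tip: start from the on-disk value, then
-- apply every in-memory diff that mentions the key.
lookupUpToDate :: Ord k => BackingStore k v -> DiffSeq k v -> k -> Maybe v
lookupUpToDate store diffs key =
    foldl apply (Map.lookup key store) diffs
  where
    apply acc d = case Map.lookup key d of
      Nothing         -> acc         -- this diff does not touch the key
      Just (Insert v) -> Just v      -- created or updated in memory
      Just Delete     -> Nothing     -- spent in memory
```

This is where the overhead comes from: each lookup pays for the disk read and
for walking the diffs, whereas the `InMemory` backend answers from memory
directly.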

In exchange for that performance regression, a Cardano node using the `OnDisk`
backend can run with much more modest memory requirements than a pre-UTxO-HD
node.

In terms of functionality, both backends are complete.

For a more extensive description of UTxO-HD, see [the full documentation](./utxo-hd-in-depth).