Caplin: Introduced cached merkle tree #11096

Merged
merged 18 commits into from
Jul 30, 2024

@Giulio2002 Giulio2002 commented Jul 10, 2024

Merkle Tree cache for Intermediate hashes

This PR implements a cached Merkle tree for ARM-based machines, which do not see significant gains from gohashtree.

Overview

Merkle Tree Structure

The MerkleTree struct contains:

  • computeLeaf: A function to compute the hash of a leaf node.
  • layers: A slice of byte slices representing the intermediate layers of the tree.
  • leavesCount: The number of leaves in the tree.
  • hashBuf: A buffer to store input for hashing.
  • limit: An optional limit for the number of leaves.

Key Constants

  • OptimalMaxTreeCacheDepth: A constant defining the maximum depth of the tree cache.

Initialization

The Initialize method sets up the Merkle tree with a specified number of leaves and a maximum cache depth. It also initializes the layers and sets an optional limit for the number of leaves.
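
As a rough illustration of the fields and the initialization step described above, the following Go sketch allocates one flat byte slice per cached layer. The field types, the `newTree` helper, and the rule that each layer holds half as many nodes as the one below are assumptions made for illustration, not the code from this PR.

```go
package main

import "fmt"

const hashLen = 32 // size of one hash in bytes

// tree is a hypothetical stand-in for the MerkleTree struct described above.
// Field names follow the PR description; the types are assumptions.
type tree struct {
	computeLeaf func(idx int, out []byte) // writes the hash of leaf idx into out
	layers      [][]byte                  // flat intermediate layers, hashLen bytes per node
	leavesCount int
	hashBuf     [2 * hashLen]byte // scratch input for hashing two children
	limit       *uint64           // optional cap on the number of leaves
}

// newTree allocates at most maxDepth layers; each layer holds half as many
// nodes (rounded up) as the level below it.
func newTree(leaves, maxDepth int, computeLeaf func(int, []byte), limit *uint64) *tree {
	t := &tree{computeLeaf: computeLeaf, leavesCount: leaves, limit: limit}
	nodes := leaves
	for d := 0; d < maxDepth && nodes > 1; d++ {
		nodes = (nodes + 1) / 2
		t.layers = append(t.layers, make([]byte, nodes*hashLen))
	}
	return t
}

func main() {
	t := newTree(10, 3, func(int, []byte) {}, nil)
	for i, l := range t.layers {
		fmt.Printf("layer %d: %d nodes\n", i, len(l)/hashLen)
	}
}
```

With 10 leaves and a cache depth of 3, this sketch allocates layers of 5, 3, and 2 nodes.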

Marking Leaves as Dirty

The MarkLeafAsDirty method marks a leaf node as "dirty", indicating it needs to be recomputed during the next root calculation.
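
One way to picture dirty marking is that a changed leaf invalidates its ancestor in every cached layer. The sketch below uses an explicit boolean flag per node; that bitmap and the `dirtyTree` type are assumptions for illustration, and the PR may encode dirtiness differently.

```go
package main

import "fmt"

// dirtyTree keeps one boolean flag per cached node. This explicit bitmap is
// an assumption for illustration; the PR may track dirtiness differently.
type dirtyTree struct {
	dirty [][]bool // dirty[d][i] is true when node i of layer d must be rehashed
}

// markLeafAsDirty flags the ancestor of leaf idx in every cached layer.
// Layer 0 holds the parents of the leaves, so the index halves at each step.
func (t *dirtyTree) markLeafAsDirty(idx int) {
	for d := range t.dirty {
		idx /= 2
		if idx >= len(t.dirty[d]) {
			return // leaf lies outside the cached range
		}
		t.dirty[d][idx] = true
	}
}

func main() {
	t := &dirtyTree{dirty: [][]bool{make([]bool, 4), make([]bool, 2), make([]bool, 1)}}
	t.markLeafAsDirty(5)
	fmt.Println(t.dirty)
}
```

For leaf 5 in an 8-leaf tree, the flagged nodes are index 2 in the first layer, index 1 in the second, and index 0 in the third.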

Adding Leaves

The AppendLeaf method allows adding a new leaf to the tree. It marks the new leaf as dirty and extends the layers as needed.

Extending Layers

The extendLayer method extends a given layer by 1.5 times its current size, ensuring the new leaf is marked as dirty.
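
The growth step can be sketched as follows, assuming the 1.5x factor applies to the capacity of the underlying buffer so that repeated appends are amortized. The function below is a hypothetical standalone version, not the method from this PR.

```go
package main

import "fmt"

const hashLen = 32

// extendLayer returns layer grown by one node. When capacity runs out it
// allocates a buffer 1.5x as large, so repeated appends are amortized.
// The exact growth rule is an assumption based on the description above.
func extendLayer(layer []byte) []byte {
	need := len(layer) + hashLen
	if need > cap(layer) {
		newCap := cap(layer) * 3 / 2
		if newCap < need {
			newCap = need
		}
		grown := make([]byte, len(layer), newCap)
		copy(grown, layer)
		layer = grown
	}
	layer = layer[:need]
	// Zero the new node explicitly; a real tree would also flag it as dirty.
	for i := need - hashLen; i < need; i++ {
		layer[i] = 0
	}
	return layer
}

func main() {
	layer := make([]byte, 2*hashLen)
	layer = extendLayer(layer)
	fmt.Println(len(layer)/hashLen, "nodes, capacity", cap(layer), "bytes")
	layer = extendLayer(layer)
	fmt.Println(len(layer)/hashLen, "nodes, capacity", cap(layer), "bytes")
}
```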

Computing the Root

The ComputeRoot method calculates the root hash of the Merkle tree. It handles various cases:

  • No layers or leaves: Returns a predefined zero hash.
  • Few leaves: Computes the root directly from the leaves.
  • Larger trees: Computes intermediate layers to derive the root hash.
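
For reference, an uncached root computation covering the same three cases might look like the sketch below. It pads the leaves with zero hashes up to the next power of two and uses SHA-256 over concatenated children; the padding rule and the all-zero result for an empty tree are assumptions, and the PR's ComputeRoot additionally reuses cached layers.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashPair returns SHA-256 of the concatenation of two child hashes.
func hashPair(a, b [32]byte) [32]byte {
	var buf [64]byte
	copy(buf[:32], a[:])
	copy(buf[32:], b[:])
	return sha256.Sum256(buf[:])
}

// computeRoot is an uncached reference: it pads the leaves with zero hashes
// up to the next power of two and folds pairs until one hash remains.
// Returning an all-zero hash for an empty tree is an assumption here.
func computeRoot(leaves [][32]byte) [32]byte {
	if len(leaves) == 0 {
		return [32]byte{}
	}
	size := 1
	for size < len(leaves) {
		size *= 2
	}
	level := make([][32]byte, size)
	copy(level, leaves)
	for len(level) > 1 {
		next := make([][32]byte, len(level)/2)
		for i := range next {
			next[i] = hashPair(level[2*i], level[2*i+1])
		}
		level = next
	}
	return level[0]
}

func main() {
	leaves := [][32]byte{{1}, {2}, {3}}
	fmt.Printf("%x\n", computeRoot(leaves))
}
```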

Copying the Tree

The CopyInto method copies the current Merkle tree into another instance.
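
A copy-into style method usually lets the destination reuse its existing buffers instead of allocating a fresh tree. The sketch below shows that idea for the layers only; the `copyLayers` helper is hypothetical, and buffer reuse is an assumption about the motivation rather than something stated in this PR.

```go
package main

import "fmt"

// copyLayers deep-copies src into dst, reusing dst's buffers when they are
// large enough. Buffer reuse is an assumption about why a CopyInto-style
// method is preferable to allocating a fresh tree.
func copyLayers(dst, src [][]byte) [][]byte {
	if cap(dst) < len(src) {
		dst = append(dst[:cap(dst)], make([][]byte, len(src)-cap(dst))...)
	}
	dst = dst[:len(src)]
	for i := range src {
		if cap(dst[i]) < len(src[i]) {
			dst[i] = make([]byte, len(src[i]))
		}
		dst[i] = dst[i][:len(src[i])]
		copy(dst[i], src[i])
	}
	return dst
}

func main() {
	src := [][]byte{{1, 2}, {3}}
	dst := copyLayers(nil, src)
	dst[0][0] = 9 // writing to the copy does not affect src
	fmt.Println(src, dst)
}
```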

Internal Helper Methods

  • finishHashing: Completes the hashing process for the last layer.
  • computeLayer: Computes the hash for a specific layer.

High-Level Workflow

  1. Initialization: Set up the tree with the desired number of leaves and depth.
  2. Mark Leaves: Mark leaves as dirty when they need to be recomputed.
  3. Append Leaves: Add new leaves to the tree, extending layers as necessary.
  4. Compute Root: Calculate the root hash of the tree, which can be used to verify the integrity of all leaves.
  5. Copy Tree: Create a copy of the tree if needed.
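
The workflow above can be modelled end to end with a small toy tree. This sketch is a simplified, hypothetical model: it assumes a fixed power-of-two leaf count of at least 2, caches every layer including the root, and tracks dirtiness with boolean flags, none of which is confirmed to match the PR's implementation.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// cachedTree caches every layer above the leaves for a fixed, power-of-two
// number of leaves (at least 2). Unlike the PR, it also caches the root layer.
type cachedTree struct {
	computeLeaf func(idx int) [32]byte
	layers      [][][32]byte // layers[0] holds the parents of the leaves
	dirty       [][]bool
	hashes      int // number of pair hashes performed, for demonstration
}

func newCachedTree(leaves int, computeLeaf func(int) [32]byte) *cachedTree {
	t := &cachedTree{computeLeaf: computeLeaf}
	for n := leaves / 2; n >= 1; n /= 2 {
		t.layers = append(t.layers, make([][32]byte, n))
		d := make([]bool, n)
		for i := range d {
			d[i] = true // nothing is cached yet
		}
		t.dirty = append(t.dirty, d)
	}
	return t
}

// markLeafAsDirty invalidates the ancestors of leaf idx in every layer.
func (t *cachedTree) markLeafAsDirty(idx int) {
	for d := range t.dirty {
		idx /= 2
		t.dirty[d][idx] = true
	}
}

func (t *cachedTree) hashPair(a, b [32]byte) [32]byte {
	var buf [64]byte
	copy(buf[:32], a[:])
	copy(buf[32:], b[:])
	t.hashes++
	return sha256.Sum256(buf[:])
}

// computeRoot rehashes only the dirty nodes, bottom layer first.
func (t *cachedTree) computeRoot() [32]byte {
	for d := range t.layers {
		for i := range t.layers[d] {
			if !t.dirty[d][i] {
				continue
			}
			if d == 0 {
				t.layers[0][i] = t.hashPair(t.computeLeaf(2*i), t.computeLeaf(2*i+1))
			} else {
				t.layers[d][i] = t.hashPair(t.layers[d-1][2*i], t.layers[d-1][2*i+1])
			}
			t.dirty[d][i] = false
		}
	}
	return t.layers[len(t.layers)-1][0]
}

func main() {
	values := make([]byte, 8)
	leaf := func(i int) [32]byte { var h [32]byte; h[0] = values[i]; return h }
	t := newCachedTree(8, leaf)
	r1 := t.computeRoot()
	before := t.hashes
	values[3] = 42
	t.markLeafAsDirty(3)
	r2 := t.computeRoot()
	fmt.Println("root changed:", r1 != r2, "extra hashes:", t.hashes-before)
}
```

In this model the first root computation over 8 leaves performs 7 pair hashes, while the recomputation after one leaf changes performs only 3.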

Intermediate Hash Storage

The implementation stores only intermediate hashes, not the leaves themselves. This is achieved through the following:

  • Intermediate Layers: The layers field holds intermediate hashes between the leaves and the root. Each entry in layers represents a level in the Merkle tree, starting from the first intermediate layer up to the layer just before the root.
  • Leaf Calculation: The computeLeaf function is used to compute the hash of a leaf node when needed, but these leaf hashes are not stored directly in the layers.
  • Root Calculation: When computing the root, the tree uses the intermediate hashes stored in layers to derive the final root hash. This ensures that only the necessary intermediate hashes are kept in memory, optimizing storage efficiency.

Rationale for Intermediate Hash Storage

The primary rationale for storing only intermediate hashes and not recomputing leaves is efficiency. By retaining intermediate hashes:

  • Reduced Computation: The tree avoids recomputing leaf hashes unless absolutely necessary, saving computational resources.
  • Optimized Storage: Only the hashes essential for reconstructing the root are stored, minimizing memory usage.
  • Performance Gains: This approach leads to faster root computation, especially in trees with a large number of leaves.
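
To put rough numbers on the performance argument: in an idealized perfect binary tree where every intermediate layer is cached, rebuilding from scratch costs n - 1 pair hashes, while updating a single leaf costs log2(n). The helper names below are hypothetical, the counts exclude the cost of hashing the leaves themselves, and a bounded cache depth such as OptimalMaxTreeCacheDepth would change the figures.

```go
package main

import (
	"fmt"
	"math/bits"
)

// fullRebuildHashes is the number of pair hashes needed to rebuild a perfect
// binary tree over n leaves (n a power of two) from scratch.
func fullRebuildHashes(n int) int { return n - 1 }

// singleUpdateHashes is the number of pair hashes needed when one leaf
// changes and every intermediate layer is cached: one per level, log2(n).
func singleUpdateHashes(n int) int { return bits.Len(uint(n)) - 1 }

func main() {
	for _, n := range []int{8, 1024, 1 << 20} {
		fmt.Printf("n=%d full=%d cached=%d\n", n, fullRebuildHashes(n), singleUpdateHashes(n))
	}
}
```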

@Giulio2002 Giulio2002 force-pushed the merkle-tree branch 2 times, most recently from f934ec0 to 764263e Compare July 11, 2024 22:21
@Giulio2002 Giulio2002 marked this pull request as draft July 12, 2024 19:00
@Giulio2002 Giulio2002 marked this pull request as ready for review July 13, 2024 17:52
@Giulio2002 Giulio2002 modified the milestones: 3.0.0-beta1, Caplin Tickets (E3 production release) Jul 15, 2024
@Giulio2002
Contributor Author

I will address comments and all after the Erigon 3.0 release

Giulio2002 and others added 4 commits July 20, 2024 00:50
Co-authored-by: Kewei <kewei.train@gmail.com>
git add .

@Giulio2002
Contributor Author

@domiwei all done, review again.

@domiwei
Member

domiwei commented Jul 28, 2024

@domiwei all done, review again.

LGTM, but I noticed that this cached Merkle tree struct does not seem to be thread-safe. Maybe it needs a mutex to protect it?

@Giulio2002 Giulio2002 merged commit 3d37d78 into main Jul 30, 2024
10 checks passed
@Giulio2002 Giulio2002 deleted the merkle-tree branch July 30, 2024 22:24
@Giulio2002
Contributor Author

@domiwei all done, review again.

LGTM, but I noticed that this cached Merkle tree struct does not seem to be thread-safe. Maybe it needs a mutex to protect it?

No issue, it is not supposed to be thread-safe.

@VBulikov VBulikov removed this from the Caplin Tickets (E3 production release) milestone Aug 1, 2024
5 participants