The Plutarch Trie project provides a Plutarch-based implementation of Distributed Tries for the Cardano blockchain. This project allows developers to leverage the security and efficiency of Tries in their Cardano smart contracts, ensuring data integrity and efficient data verification. This project uniquely allows scalable data structures across multiple utxos, with a developer-friendly typescript api.
This project is funded by the Cardano Treasury in Catalyst Fund 10 and is aimed at enhancing the capabilities of Cardano smart contracts in handling complex data structures.
A trie, also known as a prefix tree or digital tree, is a kind of search tree—an ordered tree data structure used to store a dynamic set or associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. This makes tries extremely useful for applications like autocomplete systems, spell checkers, and IP routing. Here's a detailed explanation:
A Trie is a type of search tree, an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. Here's how it's structured:
- Leaf Nodes: These are the nodes at the bottom of the tree that contain values associated with the keys. The keys are formed by the path from the root to the leaf.
- Non-Leaf (Intermediate) Nodes: These nodes contain the common prefixes of the keys or parts of them. They help in reducing the search space for a query.
- Root Node: The single node at the top of the tree represents the starting point of every key stored in the trie. It is usually empty.
The core of a Trie is its key encoding mechanism. This mechanism takes input keys of any size and encodes them into a path through the Trie. Each character or byte of the key represents a step down the Trie, from the root towards the leaves.
- Inserting Keys: Keys are inserted by walking through the Trie according to the encoded path of the key. If a path does not exist, new nodes are created to accommodate it.
- Node Structure: Each node in the Trie can have several children, each representing a possible continuation of the key. The value associated with a key is stored in the leaf node at the end of its path.
- Prefix Sharing: Nodes share common prefixes, which makes Tries highly space-efficient for datasets with keys that share common prefixes.
- Efficiency in Search Operations: Tries allow for efficient search operations, including lookups, insertions, and deletions, all with time complexity proportional to the length of the key.
- Prefix Matching: Tries excel at prefix matching, allowing for quick searches of all keys that share a common prefix, which is useful for autocomplete systems and spell checkers.
- Space Efficiency: By sharing common prefixes among keys, Tries use space more efficiently than other data structures like hash tables, especially when the dataset contains many similar keys.
Consider a Trie with the keys "car", "cat", and "dog": The structure of the Trie after inserting the keys "car", "cat", and "dog" would look something like this:
graph TD
ROOT((Root)) -->|c| C(C)
C -->|a| A(A)
A -->|r| R(R)
A -->|t| T(T)
ROOT -->|d| D(D)
D -->|o| O(O)
O -->|g| G(G)
ROOT -->|other| OTH((Others))
Imagine the Trie as a tree where each node represents a character. The root node is empty and branches out to three paths: one for "c", one for "d", and potentially others for different starting letters of keys not shown in this example. The "c" node branches into "a", which further branches into "r" and "t" to form the words "car" and "cat". Each of these nodes, "r" and "t", would be leaf nodes for "car" and "cat", respectively, possibly containing values or simply marking the end of the word. Similarly, the "d" node branches into "o", which then branches into "g", forming the word "dog" with "g" as its leaf node. This structure allows for efficient searching, adding, and deleting of keys by following the branches corresponding to each character in the key.
This visual representation helps understand how Tries optimize space and search time, especially with a large number of keys sharing common prefixes. By sharing the initial "ca" in "car" and "cat", the Trie saves space compared to storing each word independently. This efficiency becomes more pronounced with a larger dataset with more shared prefixes.
The Plutarch Trie implementation is a smart contract solution that allows the creation of many distributed trie's from a single validator.
Before you begin, ensure you have Nix installed on your system. Nix is used for package management and to provide a consistent development environment. To install run the following command:
sh <(curl -L https://nixos.org/nix/install) --daemon
and follow the instructions.
$ nix --version
nix (Nix) 2.18.1
Make sure to enable Nix Flakes by editing either ~/.config/nix/nix.conf
or /etc/nix/nix.conf
on
your machine and add the following configuration entries:
experimental-features = nix-command flakes ca-derivations
allow-import-from-derivation = true
Optionally, to improve build speed, it is possible to set up binary caches maintained by IOHK and Plutonomicon by setting additional configuration entries:
substituters = https://cache.nixos.org https://iohk.cachix.org https://cache.iog.io
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= hydra.iohk.io:f/Ea+s+dFdN+3Y/G+FDgSq+a5NEWhJGzdjvKNGv0/EQ= iohk.cachix.org-1:DpRUyj7h7V830dp/i6Nti+NEO2/nhblbov/8MW7Rqoo=
To facilitate seamlessly moving between directories and associated Nix development shells we use direnv and nix-direnv:
Your shell and editors should pick up on the .envrc
files in different directories and prepare the environment accordingly. Use direnv allow
to enable the direnv environment and direnv reload
to reload it when necessary. Otherwise, the .envrc
file contains a proper Nix target you can use with the nix develop
command.
To install both using nixpkgs
:
nix profile install nixpkgs#direnv
nix profile install nixpkgs#nix-direnv
You should be able to seamlessly use the repository to develop, build and run plutarch-trie.
Download the Git repository:
git clone https://github.com/Anastasia-Labs/plutarch-trie.git
Navigate to the repository directory:
cd plutarch-trie
direnv allow
Activate the development environment with Nix:
nix develop .
Additionally, when you run nix run .#help
you'll get a list of scripts you can run, the Github CI (nix flake check) is setup in a way where it checks the project builds successfully, haskell format is done correctly, and commit message follows conventional commits. Before pushing you should run cabal run
, nix run .#haskellFormat
(automatically formats all haskell files, including cabal), if you want to commit a correct format message you can run cz commit
Build:
cabal run build
Execute the test suite:
cabal test
To integrate the Plutarch Trie library into your project, start by creating a cabal.project
file, as this file is not automatically generated during the cabal init process. Then, proceed by including the following:
source-repository-package
type: git
location: git://github.com/Anastasia-Labs/plutarch-trie.git
Then in the your [name of your project].cabal
file include the repository under the build-depends section
build-depends:
base ^>= 4.11.1.0
, plutarch-trie