Description
Specification
Indexing is the act of creating a separate data structure that makes it faster to look up data in a primary data structure.
PK has several domains that require secondary indexing:
- acl - perm id, vault id, node id
- gestalts - discovery, adjacency list indexing
- nodes - possibly needed in the future for all sorts of optimal routing needs (Kademlia indexing?)
- notifications - to be able to filter notifications
- sigchain - to be able to filter sigchain
- vaults - for vault tagging and vault names
Right now secondary indexing is implemented manually in ACL by persisting bidirectional maps in leveldb.
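To make the bidirectional-map approach concrete, here is a minimal sketch of the idea. It is not the actual ACL code: the `BiMap` class, its method names, and the example IDs are hypothetical, and plain in-memory `Map`s stand in for the two LevelDB stores that would hold the forward and reverse mappings.

```typescript
// Hypothetical sketch: two maps kept in sync so that lookups work in both
// directions. In the real ACL these would be persisted in LevelDB; here
// in-memory Maps stand in for the two stores.
class BiMap<K, V> {
  private forward = new Map<K, Set<V>>();
  private backward = new Map<V, Set<K>>();

  // Record the pair in both directions.
  set(key: K, value: V): void {
    BiMap.add(this.forward, key, value);
    BiMap.add(this.backward, value, key);
  }

  // Remove the pair from both directions.
  delete(key: K, value: V): void {
    BiMap.remove(this.forward, key, value);
    BiMap.remove(this.backward, value, key);
  }

  getValues(key: K): V[] {
    const set = this.forward.get(key);
    return set === undefined ? [] : Array.from(set);
  }

  getKeys(value: V): K[] {
    const set = this.backward.get(value);
    return set === undefined ? [] : Array.from(set);
  }

  private static add<A, B>(map: Map<A, Set<B>>, a: A, b: B): void {
    let set = map.get(a);
    if (set === undefined) {
      set = new Set<B>();
      map.set(a, set);
    }
    set.add(b);
  }

  private static remove<A, B>(map: Map<A, Set<B>>, a: A, b: B): void {
    const set = map.get(a);
    if (set === undefined) return;
    set.delete(b);
    // Drop empty sets so stale keys do not accumulate.
    if (set.size === 0) map.delete(a);
  }
}

// Usage with made-up IDs: which permissions apply to a given vault?
const permsAndVaults = new BiMap<string, string>();
permsAndVaults.set('perm1', 'vaultA');
permsAndVaults.set('perm2', 'vaultA');
const permsForVaultA = permsAndVaults.getKeys('vaultA');
```

Every write has to touch both maps, which is the bookkeeping that each domain currently has to repeat by hand.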
Indexing is a tricky topic, and it's better to create a single indexing mechanism in DB so that all domains in PK can benefit from it.
Generic indexing can rely on LevelDB's EventEmitter interface. The good thing is that other people have already created good libraries for this:
- https://github.com/hypermodules/level-auto-index - is compatible with subleveldown, and uses "hooks" to abstract the leveldb eventemitter interface
- https://github.com/hypermodules/level-idx - this uses level-auto-index
The second library automates some of level-auto-index. In fact, it binds secondary index interfaces to the leveldb object.
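The general idea behind hook-driven indexing can be sketched as follows. This is an illustration only and does not reproduce the API of level-auto-index or level-idx: `HookedStore`, `attachIndex`, and `keyOf` are hypothetical names, and an in-memory `Map` stands in for LevelDB.

```typescript
// Hypothetical sketch: a store that notifies hooks after each write, and an
// index that keeps itself up to date by listening to those hooks.
type WriteHook<V> = (
  key: string,
  oldValue: V | undefined,
  newValue: V | undefined,
) => void;

class HookedStore<V> {
  private data = new Map<string, V>();
  private hooks: WriteHook<V>[] = [];

  onWrite(hook: WriteHook<V>): void {
    this.hooks.push(hook);
  }

  put(key: string, value: V): void {
    const old = this.data.get(key);
    this.data.set(key, value);
    for (const hook of this.hooks) hook(key, old, value);
  }

  del(key: string): void {
    const old = this.data.get(key);
    if (old === undefined) return;
    this.data.delete(key);
    for (const hook of this.hooks) hook(key, old, undefined);
  }

  get(key: string): V | undefined {
    return this.data.get(key);
  }
}

// Attach a secondary index. `keyOf` derives the index key from a value.
// Returns a lookup function from index key to primary keys.
function attachIndex<V>(
  store: HookedStore<V>,
  keyOf: (value: V) => string,
): (indexKey: string) => string[] {
  const index = new Map<string, Set<string>>();
  store.onWrite((key, oldValue, newValue) => {
    if (oldValue !== undefined) {
      // Remove the stale entry left by the previous value.
      const oldKey = keyOf(oldValue);
      const set = index.get(oldKey);
      if (set !== undefined) {
        set.delete(key);
        if (set.size === 0) index.delete(oldKey);
      }
    }
    if (newValue !== undefined) {
      const newKey = keyOf(newValue);
      let set = index.get(newKey);
      if (set === undefined) {
        set = new Set<string>();
        index.set(newKey, set);
      }
      set.add(key);
    }
  });
  return (indexKey) => {
    const set = index.get(indexKey);
    return set === undefined ? [] : Array.from(set);
  };
}
```

With this shape, callers only write to the primary store; the index is updated as a side effect of each `put` or `del`, so the bookkeeping lives in one place.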
For @matrixai/db it makes more sense to use level-auto-index so we can more tightly control how the indexes are stored, most likely via sublevel interfaces.
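One way to picture storing an index under its own sublevel is as a reserved key prefix in a sorted key-value store. The sketch below assumes that layout; it does not use the real subleveldown or @matrixai/db APIs, and `SortedKV`, `levelPrefix`, `tagVault`, `vaultsByTag`, and the `!` separator are all hypothetical choices for illustration.

```typescript
// Hypothetical sketch: index entries live under their own key prefix, so a
// prefix scan over sorted keys answers "which vaults have this tag?".
const SEP = '!';

// Build a prefix such as "!vaults!tags!" from level names.
// Assumes level names and tags never contain the separator.
function levelPrefix(...levels: string[]): string {
  return SEP + levels.join(SEP) + SEP;
}

// In-memory stand-in for a sorted key-value store.
class SortedKV {
  private entries = new Map<string, string>();

  put(key: string, value: string): void {
    this.entries.set(key, value);
  }

  del(key: string): void {
    this.entries.delete(key);
  }

  // Return all keys starting with `prefix`, in lexicographic order.
  keysWithPrefix(prefix: string): string[] {
    return Array.from(this.entries.keys())
      .filter((k) => k.slice(0, prefix.length) === prefix)
      .sort();
  }
}

// The index entry encodes (tag, vaultId) in the key; the value is unused.
function tagVault(db: SortedKV, vaultId: string, tag: string): void {
  db.put(levelPrefix('vaults', 'tags') + tag + SEP + vaultId, '');
}

function untagVault(db: SortedKV, vaultId: string, tag: string): void {
  db.del(levelPrefix('vaults', 'tags') + tag + SEP + vaultId);
}

function vaultsByTag(db: SortedKV, tag: string): string[] {
  const prefix = levelPrefix('vaults', 'tags') + tag + SEP;
  return db.keysWithPrefix(prefix).map((k) => k.slice(prefix.length));
}
```

Because the index occupies its own prefix, it can be cleared or rebuilt without touching the primary data, which is the kind of control over index storage that motivates working at the sublevel layer.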
Additional context
- Indexing Domains and DB Levels for Faster & Flexible Lookup Polykey#188 - further details on applying indexing to sigchain operations
- http://www.tiernok.com/posts/adding-index-for-a-key-value-store/
Tasks
- Investigate possible dependencies to automate secondary indexing and their relationship
- Prototype the usage of level-auto-index
- [ ] Integrate level-auto-index into this library's domain/level creation; each level, including the root level, should have the ability to attach a secondary index (or multiple secondary indexes)