Skip to content

mwiencek/weight-balanced-tree

Repository files navigation

weight-balanced-tree

A persistent weight-balanced (bounded balance) tree for JavaScript.

WBTs can be used to implement persistent maps or sets, where the entries have a defined order. Indexed access is also O(log n) due to the fact that indices can be located using the size stored on each node; therefore they're also a viable option for implementing persistent lists if you don't need O(1) reads or writes at the head or tail of the list.

  • Zero dependencies
  • Trees are plain objects
  • Works in Node.js and the browser
  • Flow and TypeScript definitions included

Installation

This software is released under the MIT license.

It's published on npm as weight-balanced-tree,

npm install weight-balanced-tree

# or using yarn
yarn add weight-balanced-tree
import * as tree from 'weight-balanced-tree';

// or import functions directly
import insert from 'weight-balanced-tree/insert';

API

All API functions are pure and side‑effect free; the tree structure is treated as immutable and a structurally‑shared copy is returned when modifications are required.

Many functions of this library require a comparator cmp, which defines the ordering of the tree. It works in the same way as the comparator passed to Array.prototype.sort. Obviously, the comparator should be idempotent and behave consistently for a particular tree, otherwise the tree can become invalid.

empty

const empty: EmptyImmutableTree;

The empty tree.

Note: The correct way to check if a tree is empty is with tree.size === 0. The empty object reference won't be consistent across realms or in cases where a tree is parsed from JSON.

create()

function create<T>(value: T): ImmutableTree<T>;

Creates a tree of size 1 containing value.

fromDistinctAscArray()

function fromDistinctAscArray<T>(
  array: $ReadOnlyArray<T>,
): ImmutableTree<T>;

Constructs a new weight-balanced tree from array. array must be sorted and contain only distinct values. (This is faster than building the tree one value at a time with insert().)

fromDistinctAscArray will create an invalid tree if array is not sorted or contains duplicate values. You can check if a tree is valid using validate().

update()

type InsertConflictHandler<T, K> =
  (existingTreeValue: T, key: K) => T;

type InsertNotFoundHandler<T, K> =
  (key: K) => T;

type UpdateOptions<T, K> = {
  key: K,
  cmp: (key: K, treeValue: T) => number,
  isEqual?: (a: T, b: T) => boolean,
  onConflict: InsertConflictHandler<T, K>,
  onNotFound: InsertNotFoundHandler<T, K>,
};

function update<T, K>(
    tree: ImmutableTree<T>,
    options: UpdateOptions<T, K>,
): ImmutableTree<T>;

A flexible primitive that can insert, replace, and even remove values in tree.

key is located using the comparator cmp, which receives the same key as its first argument, and a value of type T from tree as its second argument.

  • If key exists, onConflict is expected to return a final value to live at that position. It receives the existing tree value as its first argument, and key as its second argument.

    The returned value must have the same relative position in the tree as before, otherwise a ValueOrderError is thrown.

    If you return existingTreeValue from onConflict, update will return the same tree reference back. Object.is is used to determine value equality.

    There are several predefined exports in weight-balanced-tree/update that can be used for onConflict:

    • onConflictThrowError, which throws ValueExistsError.
    • onConflictKeepTreeValue, which returns the existing tree value.
    • onConflictUseGivenValue, which returns key. (This is only usable in cases where K is a subtype of T.)
    • onConflictRemoveValue, which causes update to remove the value stored at key from tree.
  • If key doesn't exist, onNotFound is invoked to lazily create or reject the missing value. It only receives one argument: the key passed to update. Like onConflict, it's expected to return a value to be inserted or to throw an error.

    The following predefined exports in weight-balanced-tree/update can be used for onNotFound:

    • onNotFoundUseGivenValue, which returns key. (This is only usable in cases where K is a subtype of T.)
    • onNotFoundDoNothing, which causes update to perform no insertion and to return the same tree reference back.
    • onNotFoundThrowError, which throws a ValueNotFoundError.

K and T can be different types. That's convenient when using the tree as a map:

const cmp = (key1, [key2]) => key1 - key2;

// key = 1, value = 0
let node = tree.create([1, 0]);

// increments the value stored at key 1, or initializes it to 0
node = tree.update(node, {
  key: 1,
  cmp,
  onConflict: ([, value], key) => [key, value + 1],
  onNotFound: (key) => [key, 0],
});

And here's a "find or insert" implementation:

interface Item {
  readonly key: number;
  readonly value: string;
}

function compareKeyWithItemKey(key: number, item: Item): number {
  return key - item.key;
}

function findOrInsert(tree, key) {
  let item;
  const newTree = update<Item, number>(tree, {
    key,
    cmp: compareKeyWithItemKey,
    onConflict: (existingItem: Item) => {
      item = existingItem;
      return existingItem;
    },
    onNotFound: (key: number) => {
      item = {key, value: 'initialValue'};
      return item;
    },
  });
  return [newTree, item];
}

For simpler insertion semantics, see insert() below.

insert()

function insert<T>(
    tree: ImmutableTree<T>,
    value: T,
    cmp: (T, T) => number,
    onConflict?: InsertConflictHandler<T, T>,
): NonEmptyImmutableTree<T>;

Returns a new version of tree with value inserted. This is a more specific version of update() that only operates on the value type T.

cmp is the same as with update, except the first argument received is the value you passed, and both arguments are of type T.

onConflict is also the same as with update, but here it defaults to onConflictThrowError if not specified.

There are also some additional exports available that call insert with different values of onConflict for you:

  • insertIfNotExists (passes onConflictKeepTreeValue)
  • insertOrReplaceIfExists (passes onConflictUseGivenValue)

insertOrThrowIfExists is an alias of insert.

remove()

function remove<T, K>(
    tree: ImmutableTree<T>,
    key: K,
    cmp: (key: K, treeValue: T) => number,
): ImmutableTree<T>;

Returns a new version of tree with the value located by key removed.

If key is not found in the tree, the same tree reference is returned back.

If this was the last value in tree, empty is returned.

The cmp function works the same as with update(), with key being passed as the first argument.

removeIfExists is an alias of remove.

removeOrThrowIfNotExists()

Like remove(), but throws an error if value does not exist in the tree.

This simply checks if the tree returned from remove is the same reference.

equals()

function equals<T, U = T>(
  a: ImmutableTree<T>,
  b: ImmutableTree<U>,
  isEqual?: (a: T, b: U) => boolean,
): boolean;

Returns true if two trees contain the same values in the same order, or false otherwise.

This works by zipping the trees' values together, and passing each pair of values to isEqual.

isEqual is optional. If not provided, it defaults to Object.is.

exists()

function exists<T, K = T>(
  tree: ImmutableTree<T>,
  key: K,
  cmp: (a: K, b: T) => number,
): boolean;

Returns true if a value matching key exists in tree, or false otherwise.

This is a convenience function that just checks if findNode() returns null.

filter()

function filter<T>(
  tree: ImmutableTree<T>,
  predicate: (value: T) => boolean,
): ImmutableTree<T>;

Returns a tree containing only the values that satisfy predicate.

find()

function find<T, K = T, D = T>(
  tree: ImmutableTree<T>,
  key: K,
  cmp: (a: K, b: T) => number,
  defaultValue: D,
): T | D;

Finds a value in tree using the given key and returns it, or defaultValue if not found.

cmp receives key as its first argument, and a value of type T from tree as its second argument.

findAll()

function findAll<T, K = T>(
  tree: ImmutableTree<T>,
  key: K,
  cmp: (a: K, b: T) => number,
): Generator<T, void, void>;

Iterates over all values in tree found using the given key. This is useful when cmp implements a left prefix of the comparator used to order tree.

const cmp = (a, b) => a < b ? -1 : (a > b ? 1 : 0);

// The `.value` comparison is a left prefix of the tree comparator:
function compareValueAndIndex(a, b) {
  return cmp(a.value, b.value) || cmp(a.index, b.index);
}

let node = tree.fromDistinctAscArray([
  {value: 'A', index: 1},
  {value: 'B', index: 1},
  {value: 'C', index: 1},
]);
node = tree.insert(node, {value: 'B', index: 2}, compareValueAndIndex);
node = tree.insert(node, {value: 'B', index: 3}, compareValueAndIndex);

Array.from(
  tree.findAll(node, 'B', (key, obj) => cmp(key, obj.value)),
); /* => Returns:
[
  {value: 'B', index: 1},
  {value: 'B', index: 2},
  {value: 'B', index: 3},
] */

findBy()

function findBy<T, D = T>(
  tree: ImmutableTree<T>,
  cmp: (treeValue: T) => number,
  defaultValue: D,
): T | D;

Finds a value in tree and returns it, or defaultValue if not found.

cmp receives a value of type T from tree as its first argument. This allows you to pass in a closure that compares treeValue against a static key.

findNext()

function findNext<T, K = T, D = T>(
  tree: ImmutableTree<T>,
  key: K,
  cmp: (a: K, b: T) => number,
  defaultValue: D,
): T | D;

Finds a value in tree using the given key and returns the value immediately after it, or defaultValue if there is no such value.

key does not have to be found in the tree: if a set has 1 & 3, the next value from 2 is 3.

findNode()

function findNode<T, K = T>(
  tree: ImmutableTree<T>,
  key: K,
  cmp: (a: K, b: T) => number,
): ImmutableTree<T> | null;

This is similar to find(), but returns the entire tree node rather than just the value.

findPrev()

function findPrev<T, K = T, D = T>(
  tree: ImmutableTree<T>,
  key: K,
  cmp: (a: K, b: T) => number,
  defaultValue: D,
): T | D;

Finds a value in tree using the given key and returns the value immediately before it, or defaultValue if there is no such value.

key does not have to be found in the tree: if a set has 1 & 3, the previous value from 2 is 1.

findWithIndex()

function findWithIndex<T, K = T, D = T>(
  tree: ImmutableTree<T>,
  key: K,
  cmp: (a: K, b: T) => number,
  defaultValue: D,
): [value: T | D, index: number];

Combines find() and indexOf(). Finds a value and its position in tree using the given key, and returns them as a tuple. Returns [defaultValue, -1] if not found.

validate()

function validate<T>(
  tree: ImmutableTree<T>,
  cmp: (a: T, b: T) => number,
): ValidateResult<T>;

Returns a ValidateResult<T> indicating whether the given tree is valid for the comparator cmp: all left subtree values are less than the parent value, and all right subtrees values are greater than the parent value.

indexOf()

function indexOf<T, K = T>(
  tree: ImmutableTree<T>,
  key: K,
  cmp: (a: K, b: T) => number,
): number;

Returns the position of key in tree, or -1 if not found.

at()

function at<T, D = T>(
  tree: ImmutableTree<T>,
  index: number,
  defaultValue?: D,
): T | D;

Returns the value positioned at (0-based) index in tree. Negative indices retrieve values from the end.

This is equivalent to toArray(tree).at(index), but doesn't create an intermediary array, and locates index in O(log n).

An out-of-bounds index will return defaultValue (or undefined if not specified).

iterate()

function iterate<T>(
  tree: ImmutableTree<T>,
  start?: number,
  end?: number,
): Generator<T, void, void>;

Returns a JS iterator that traverses the values of the tree in order, optionally starting from index start (inclusive) and ending at index end (exclusive).

Negative indices can be used for start and end:

const tree = fromDistinctAscArray([1, 2, 3, 4, 5]);
Array.from(iterate(tree, -4, -1));
// [ 2, 3, 4 ]

reverseIterate()

function reverseIterate<T>(
  tree: ImmutableTree<T>,
  start?: number,
  end?: number,
): Generator<T, void, void>;

Returns a JS iterator that traverses the values of the tree in reverse order.

Don't be confused by start and end: they're the same as for iterate. Think of it as taking a normal slice from start to end, and then iterating that slice in reverse (so iteration begins at end - 1, got it?).

const tree = fromDistinctAscArray([1, 2, 3, 4, 5]);

// both 2, 3, 4:
iterate(tree, 1, 4);
iterate(tree, -4, -1);

// both 4, 3, 2:
reverseIterate(tree, 1, 4);
reverseIterate(tree, -4, -1);

map()

function map<T, U>(tree: ImmutableTree<T>, mapper: (T) => U): ImmutableTree<U>;

Returns a new tree with every value passed through mapper. The mapped values are assumed to have the same relative order as before.

const numberTree = fromDistinctAscArray([1, 2, 3]);

const stringTree = map<number, string>(
  numberTree,
  (num: number) => String(num),
);

minNode()

function minNode<T>(tree: ImmutableTree<T>): NonEmptyImmutableTree<T>;

Returns the "smallest" (left-most) node in tree.

minValue()

function minValue<T>(tree: ImmutableTree<T>): T;

Returns the "smallest" (left-most) value in tree.

This is equivalent to minNode(tree).value.

maxNode()

function maxNode<T>(tree: ImmutableTree<T>): NonEmptyImmutableTree<T>;

Returns the "largest" (right-most) node in tree.

maxValue()

function maxValue<T>(tree: ImmutableTree<T>): T;

Returns the "largest" (right-most) value in tree.

This is equivalent to maxNode(tree).value.

setIndex()

function setIndex<T>(
  tree: ImmutableTree<T>,
  index: number,
  value: T,
): NonEmptyImmutableTree<T>;

Returns a new version of tree with the value at position index replaced by value.

Negative indices can be used to update values from the end of the tree. An out-of-bounds index is a no-op and just returns tree.

If you replace a value with one that has a different relative order, the tree will become invalid. However, if you're using the tree as a list, where the index itself is the ordering, then this obviously isn't a concern.

If value is identical to the existing value at index (according to Object.is), the same tree reference is returned back.

Time complexity: O(log n).

updateIndex()

function updateIndex<T>(
  tree: ImmutableTree<T>,
  index: number,
  updater: (existingTreeValue: T) => T,
): ImmutableTree<T>;

This is the same as setIndex(tree, index, updater(at(tree, index))), but avoids having to walk the tree twice.

slice()

function slice<T>(
  tree: ImmutableTree<T>,
  start?: number,
  end?: number,
): ImmutableTree<T>;

Has the same arguments as Array.prototype.slice.

Example:

const tree = fromDistinctAscArray([1, 2, 3, 4, 5]);
const newTree = slice(tree, 1, 3);
// newTree: [2, 3]

Time complexity: O(log n).

splice()

function splice<T>(
  tree: ImmutableTree<T>,
  start: number,
  deleteCount: number,
  items: ImmutableTree<T>
): {
  readonly tree: ImmutableTree<T>,
  readonly deleted: ImmutableTree<T>,
};

Has the same arguments as Array.prototype.splice.

Example:

const tree = fromDistinctAscArray([1, 2, 3, 4, 5]);
const {tree: newTree, deleted} = splice(tree, 1, 2, fromDistinctAscArray([8, 9]));
// newTree: [1, 8, 9, 4, 5]
// deleted: [2, 3]

Time complexity: O(log n + m), where n is the size of tree and m is the size of items.

split()

function split<T, K = T>(
  tree: ImmutableTree<T>,
  key: K,
  cmp: (a: K, b: T) => number,
): [
  small: ImmutableTree<T>,
  equal: ImmutableTree<T>,
  large: ImmutableTree<T>,
];

Splits a tree into two: one containing values smaller than key, and one containing values large than key, according to cmp.

If key exists in the tree, equal will reference the node in tree containing key; otherwise it will equal the empty tree.

splitIndex()

function splitIndex<T>(
  tree: ImmutableTree<T>,
  index: number,
): [
  small: ImmutableTree<T>,
  equal: ImmutableTree<T>,
  large: ImmutableTree<T>,
];

This is similar to split(), but uses the position in the tree rather than a key and comparator. Negative indices are not supported.

splitLast()

function splitLast<T>(tree: NonEmptyImmutableTree<T>): {
  readonly tree: ImmutableTree<T>;
  readonly value: T;
};

Removes the last value from a non-empty tree, and returns an object containing that value and the remaining tree.

Example:

const node = fromDistinctAscArray([1, 2, 3]);
const {tree: newNode, value} = splitLast(node);
// newNode: [1, 2]
// value: 3

Time complexity: O(log n).

splitFirst()

function splitFirst<T>(tree: NonEmptyImmutableTree<T>): {
  readonly tree: ImmutableTree<T>;
  readonly value: T;
};

Removes the first value from a non-empty tree, and returns an object containing that value and the remaining tree.

Example:

const node = fromDistinctAscArray([1, 2, 3]);
const {tree: newNode, value} = splitFirst(node);
// newNode: [2, 3]
// value: 1

Time complexity: O(log n).

toArray()

function toArray<T>(
  tree: ImmutableTree<T>,
): Array<T>;

Flattens tree into an array of values.

union()

function union<T>(
  t1: ImmutableTree<T>,
  t2: ImmutableTree<T>,
  cmp: (a: T, b: T) => number,
  combiner?: (v1: T, v2: T) => T,
): ImmutableTree<T>;

Merges two trees together using the comparator cmp.

combiner handles the case where an equivalent value appears in both trees, and is expected to return the final value to use in the union. If not specified, by union will prefer values in t1. If you return a different value, then the relative sort order must be preserved; otherwise ValueOrderError is thrown.

difference()

function difference<T>(
  t1: ImmutableTree<T>,
  t2: ImmutableTree<T>,
  cmp: (a: T, b: T) => number,
): ImmutableTree<T>;

Returns a new tree with values in t1 that aren't in t2, using the comparator cmp.

symmetricDifference()

function symmetricDifference<T>(
  t1: ImmutableTree<T>,
  t2: ImmutableTree<T>,
  cmp: (a: T, b: T) => number,
): ImmutableTree<T>;

Returns a new tree with values that are in either t1 or t2, but not in both, using the comparator cmp.

intersection()

function intersection<T>(
  t1: ImmutableTree<T>,
  t2: ImmutableTree<T>,
  cmp: (a: T, b: T) => number,
  combiner?: (v1: T, v2: T) => T,
): ImmutableTree<T>;

Returns a new tree with values common to both t1 and t2, using the comparator cmp.

combiner determines which value is chosen; by default it returns the value from the first tree, t1. If you return a different value, then the relative sort order must be preserved; otherwise ValueOrderError is thrown.

isDisjointFrom()

function isDisjointFrom<T>(
  tree: ImmutableTree<T>,
  other: ImmutableTree<T>,
  cmp: (a: T, b: T) => number,
): boolean;

Returns true if tree and other have no values in common, or false otherwise.

isSubsetOf()

function isSubsetOf<T>(
  tree: ImmutableTree<T>,
  other: ImmutableTree<T>,
  cmp: (a: T, b: T) => number,
): boolean;

Returns true if all values in tree are in other, or false otherwise.

isSupersetOf()

function isSupersetOf<T>(
  tree: ImmutableTree<T>,
  other: ImmutableTree<T>,
  cmp: (a: T, b: T) => number,
): boolean;

Returns true if all values in other are in tree, or false otherwise.

join()

function join<T>(
  left: ImmutableTree<T>,
  value: T,
  right: ImmutableTree<T>
): NonEmptyImmutableTree<T>;

Implements the join operation.

This is a low-level operation, and the resulting tree is not checked for validity.

join2()

function join2<T>(
  left: ImmutableTree<T>,
  right: ImmutableTree<T>
): ImmutableTree<T>;

Implements the join2 operation.

This is a low-level operation, and the resulting tree is not checked for validity.

zip()

function zip<T, U>(
  t1: ImmutableTree<T>,
  t2: ImmutableTree<U>,
): Generator<[T | void, U | void], void, void>;

Zips two trees together, returning an iterable of tuples: the first tuple contains the first values of both trees, the second tuple contains the second values of both trees, and so on. If the trees are of different sizes, undefined is used within a tuple where a corresponding value is missing.

Performance

Performance will largely depend on the size of your data and the cost of your comparator function. See the benchmark/ folder for comparisons against Immutable.js and mori.

Tests

# Unit tests
./node_modules/.bin/c8 node --test test/index.js

# Monkey tests
node --test test/monkey.js

# TypeScript tests
./node_modules/.bin/tsd

Changelog

See CHANGELOG.md.

References

  1. Adams, Stephen. "Implementing Sets Efficiently in a Functional Language." University of Southampton, n.d. Accessed at http://groups.csail.mit.edu/mac/users/adams/BB/92-10.ps

  2. GHC's Data.Map.Internal.

  3. Join-based tree algorithms

About

A persistent weight-balanced (bounded balance) tree.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published