Skip to content

BTreeMap cursors #141

Closed
Closed
@Amanieu

Description

@Amanieu

Proposal

Problem statement

One of the fundamental properties of BTreeMap is that it maintains elements in sorted order and enables efficient element lookup in O(log(N)) time. However the current API is overly fitted towards a key-value API like a HashMap and fails to expose the ability to make queries about "nearby" keys. For example, finding the first element whose key is greater than X.

There is limited support for this with the range API, but this is hard to use and doesn't allow mutating the tree (by inserting/deleting elements) while iterating. This is insufficient to address existing use cases.

Motivation, use-cases

One specific use case where such manipulations are useful is when using a BTreeMap to represent a set of non-overlapping ranges with associated values. This is often used to associate metadata with ranges of memory addresses.

As a proof of concept, I implemented a RangeTree type which conceptually holds a map of Range<K> -> V. It has 3 main operations:

  • get: returns the range containing the given key.
  • remove: removes any ranges that intersect the given range of keys. Partially overlapping ranges are truncated or split into 2 sub-ranges.
  • insert: removes any intersecting ranges (with the same behavior as remove for partially overlapping ranges) and then inserts a new range.

There are two implementations of these operations: one with just the standard library API and one with the new proposed cursor API.

Benchmark results show a 25% to 50% speedup by using cursors:

insert non-overlapping  time:   [82.516 ns 88.146 ns 93.282 ns]
                        change: [-33.810% -27.769% -20.292%] (p = 0.00 < 0.05)
                        Performance has improved.

insert overlapping end  time:   [76.902 ns 80.527 ns 83.670 ns]
                        change: [-52.589% -49.404% -46.189%] (p = 0.00 < 0.05)
                        Performance has improved.

insert overlapping start
                        time:   [73.223 ns 77.361 ns 81.151 ns]
                        change: [-32.392% -26.673% -20.424%] (p = 0.00 < 0.05)
                        Performance has improved.

insert overlapping all  time:   [187.08 ns 193.34 ns 198.79 ns]
                        change: [-29.838% -26.102% -22.680%] (p = 0.00 < 0.05)
                        Performance has improved.

insert within existing  time:   [75.748 ns 79.829 ns 83.380 ns]
                        change: [-32.192% -26.014% -19.129%] (p = 0.00 < 0.05)
                        Performance has improved.

remove non-overlapping  time:   [43.850 ns 46.363 ns 48.454 ns]
                        change: [-48.288% -43.011% -36.968%] (p = 0.00 < 0.05)
                        Performance has improved.

remove overlapping end  time:   [60.592 ns 64.339 ns 67.775 ns]
                        change: [-59.367% -56.141% -52.421%] (p = 0.00 < 0.05)
                        Performance has improved.

remove overlapping start
                        time:   [45.273 ns 48.281 ns 50.901 ns]
                        change: [-49.392% -43.853% -37.379%] (p = 0.00 < 0.05)
                        Performance has improved.

remove overlapping all  time:   [182.28 ns 189.05 ns 194.94 ns]
                        change: [-37.365% -34.433% -31.290%] (p = 0.00 < 0.05)
                        Performance has improved.

remove within existing  time:   [39.387 ns 41.828 ns 43.930 ns]
                        change: [-46.469% -41.060% -34.474%] (p = 0.00 < 0.05)
                        Performance has improved.

Solution sketches

This proposal adds Cursor and CursorMut types to BTreeMap based on similar types for LinkedList.

A Cursor is like an iterator, except that it can freely seek back-and-forth, and can safely mutate the tree during iteration. This is because the lifetime of its yielded references is tied to its own lifetime, instead of just the underlying tree. A cursor either points to an element in the tree, or to a "ghost" non-element that is logically located after the last element and before the first element.

This proposal adds the following APIs to BTreeMap:

impl<K, V> BTreeMap<K, V> {
    /// Returns a cursor pointing to the first element that is above the given bound.
    ///
    /// Passing `Unbounded` will return a cursor pointing to the first element in the tree.
    fn lower_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;

    /// Returns a mutable cursor pointing to the first element that is above the given bound.
    ///
    /// Passing `Unbounded` will return a cursor pointing to the first element in the tree.
    fn lower_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;

    /// Returns a cursor pointing to the last element that is below the given bound.
    ///
    /// Passing `Unbounded` will return a cursor pointing to the last element in the tree.
    fn upper_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;

    /// Returns a mutable cursor pointing to the last element that is below the given bound.
    ///
    /// Passing `Unbounded` will return a cursor pointing to the last element in the tree.
    fn upper_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V>
    where
        K: Borrow<Q> + Ord,
        Q: Ord;
}

/// An immutable cursor which only allows read-only operations.
struct Cursor<'a, K: 'a, V: 'a>;

impl<'a, K, V> Cursor<'a, K, V> {
    /// Moving the cursor to the next/previous element.
    fn move_next(&mut self);
    fn move_prev(&mut self);

    /// Access to the key and value of the current element.
    ///
    /// Returns `None` if the cursor points to the "ghost" non-element.
    fn key(&self) -> Option<&'a K>;
    fn value(&self) -> Option<&'a V>;
    fn key_value(&self) -> Option<(&'a K, &'a V)>;

    /// Read-only access to adjacent elements.
    fn peek_next(&self) -> Option<(&'a K, &'a V)>;
    fn peek_prev(&self) -> Option<(&'a K, &'a V)>;
}

/// An mutable cursor which allows mutating the tree.
struct CursorMut<'a, K: 'a, V: 'a>;

impl<'a, K, V> CursorMut<'a, K, V> {
    /// Moving the cursor to the next/previous element.
    fn move_next(&mut self);
    fn move_prev(&mut self);

    /// Access to the key and value of the current element.
    ///
    /// Returns `None` if the cursor points to the "ghost" non-element.
    fn key(&self) -> Option<&K>;
    fn value(&self) -> Option<&V>;
    fn value_mut(&mut self) -> Option<&mut V>;
    fn key_value(&self) -> Option<(&K, &V)>;
    fn key_value_mut(&self) -> Option<(&K, &mut V)>;

    /// Returns a mutable reference to the of the element that the cursor is
    /// currently pointing to.
    ///
    /// This can be used to modify the key, but you must ensure that the
    /// `BTreeMap` invariants are maintained. Specifically:
    ///
    /// * The key must remain unique within the tree.
    /// * The key must remain in sorted order with regards to other elements in
    ///   the tree.
    unsafe fn key_mut_unchecked(&mut self) -> Option<&mut K>;

    /// Read-only access to adjacent elements.
    fn peek_next(&self) -> Option<(&K, &V)>;
    fn peek_prev(&self) -> Option<(&K, &V)>;

    /// Returns a read-only cursor pointing to the current element.
    ///
    /// The lifetime of the returned `Cursor` is bound to that of the
    /// `CursorMut`, which means it cannot outlive the `CursorMut` and that the
    /// `CursorMut` is frozen for the lifetime of the `Cursor`.
    fn as_cursor(&self) -> Cursor<'_, K, V>;

    /// Inserts a new element into the `BTreeMap` after the current one.
    ///
    /// If the cursor is pointing at the "ghost" non-element then the new element is
    /// inserted at the front of the `BTreeMap`.
    ///
    /// # Panics
    ///
    /// This function panics if:
    /// - the given key compares less than or equal to the current element (if
    ///   any).
    /// - the given key compares greater than or equal to the next element (if
    ///   any).
    fn insert_after(&mut self, key: K, value: V);

    /// Inserts a new element into the `BTreeMap` before the current one.
    ///
    /// If the cursor is pointing at the "ghost" non-element then the new element is
    /// inserted at the end of the `BTreeMap`.
    ///
    /// # Panics
    ///
    /// This function panics if:
    /// - the given key compares greater than or equal to the current element
    ///   (if any).
    /// - the given key compares less than or equal to the previous element (if
    ///   any).
    fn insert_before(&mut self, key: K, value: V);

    /// Inserts a new element into the `BTreeMap` after the current one.
    ///
    /// If the cursor is pointing at the "ghost" non-element then the new element is
    /// inserted at the front of the `BTreeMap`.
    ///
    /// # Safety
    ///
    /// You must ensure that the `BTreeMap` invariants are maintained.
    /// Specifically:
    ///
    /// * The key of the newly inserted element must be unique in the tree.
    /// * All keys in the tree must remain in sorted order.
    unsafe fn insert_after_unchecked(&mut self, key: K, value: V);

    /// Inserts a new element into the `BTreeMap` before the current one.
    ///
    /// If the cursor is pointing at the "ghost" non-element then the new element is
    /// inserted at the end of the `BTreeMap`.
    ///
    /// # Safety
    ///
    /// You must ensure that the `BTreeMap` invariants are maintained.
    /// Specifically:
    ///
    /// * The key of the newly inserted element must be unique in the tree.
    /// * All keys in the tree must remain in sorted order.
    unsafe fn insert_before_unchecked(&mut self, key: K, value: V);

    /// Removes the current element from the `BTreeMap`.
    ///
    /// The element that was removed is returned, and the cursor is
    /// moved to point to the next element in the `BTreeMap`.
    ///
    /// If the cursor is currently pointing to the "ghost" non-element then no element
    /// is removed and `None` is returned. The cursor is not moved in this case.
    fn remove_current(&mut self) -> Option<(K, V)>;

    /// Removes the current element from the `BTreeMap`.
    ///
    /// The element that was removed is returned, and the cursor is
    /// moved to point to the previous element in the `BTreeMap`.
    ///
    /// If the cursor is currently pointing to the "ghost" non-element then no element
    /// is removed and `None` is returned. The cursor is not moved in this case.
    fn remove_current_and_move_back(&mut self) -> Option<(K, V)>;
}

Links and related work

What happens now?

This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ACP-acceptedAPI Change Proposal is accepted (seconded with no objections)T-libs-apiapi-change-proposalA proposal to add or alter unstable APIs in the standard libraries

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions