Description
Proposal
Problem statement
One of the fundamental properties of BTreeMap
is that it maintains elements in sorted order and enables efficient element lookup in O(log(N))
time. However the current API is overly fitted towards a key-value API like a HashMap
and fails to expose the ability to make queries about "nearby" keys. For example, finding the first element whose key is greater than X.
There is limited support for this with the range
API, but this is hard to use and doesn't allow mutating the tree (by inserting/deleting elements) while iterating. This is insufficient to address existing use cases.
Motivation, use-cases
One specific use case where such manipulations are useful is when using a BTreeMap
to represent a set of non-overlapping ranges with associated values. This is often used to associate metadata with ranges of memory addresses.
As a proof of concept, I implemented a RangeTree
type which conceptually holds a map of Range<K> -> V
. It has 3 main operations:
get
: returns the range containing the given key.remove
: removes any ranges that intersect the given range of keys. Partially overlapping ranges are truncated or split into 2 sub-ranges.insert
: removes any intersecting ranges (with the same behavior asremove
for partially overlapping ranges) and then inserts a new range.
There are two implementations of these operations: one with just the standard library API and one with the new proposed cursor API.
Benchmark results show a 25% to 50% speedup by using cursors:
insert non-overlapping time: [82.516 ns 88.146 ns 93.282 ns]
change: [-33.810% -27.769% -20.292%] (p = 0.00 < 0.05)
Performance has improved.
insert overlapping end time: [76.902 ns 80.527 ns 83.670 ns]
change: [-52.589% -49.404% -46.189%] (p = 0.00 < 0.05)
Performance has improved.
insert overlapping start
time: [73.223 ns 77.361 ns 81.151 ns]
change: [-32.392% -26.673% -20.424%] (p = 0.00 < 0.05)
Performance has improved.
insert overlapping all time: [187.08 ns 193.34 ns 198.79 ns]
change: [-29.838% -26.102% -22.680%] (p = 0.00 < 0.05)
Performance has improved.
insert within existing time: [75.748 ns 79.829 ns 83.380 ns]
change: [-32.192% -26.014% -19.129%] (p = 0.00 < 0.05)
Performance has improved.
remove non-overlapping time: [43.850 ns 46.363 ns 48.454 ns]
change: [-48.288% -43.011% -36.968%] (p = 0.00 < 0.05)
Performance has improved.
remove overlapping end time: [60.592 ns 64.339 ns 67.775 ns]
change: [-59.367% -56.141% -52.421%] (p = 0.00 < 0.05)
Performance has improved.
remove overlapping start
time: [45.273 ns 48.281 ns 50.901 ns]
change: [-49.392% -43.853% -37.379%] (p = 0.00 < 0.05)
Performance has improved.
remove overlapping all time: [182.28 ns 189.05 ns 194.94 ns]
change: [-37.365% -34.433% -31.290%] (p = 0.00 < 0.05)
Performance has improved.
remove within existing time: [39.387 ns 41.828 ns 43.930 ns]
change: [-46.469% -41.060% -34.474%] (p = 0.00 < 0.05)
Performance has improved.
Solution sketches
This proposal adds Cursor
and CursorMut
types to BTreeMap
based on similar types for LinkedList
.
A Cursor is like an iterator, except that it can freely seek back-and-forth, and can safely mutate the tree during iteration. This is because the lifetime of its yielded references is tied to its own lifetime, instead of just the underlying tree. A cursor either points to an element in the tree, or to a "ghost" non-element that is logically located after the last element and before the first element.
This proposal adds the following APIs to BTreeMap
:
impl<K, V> BTreeMap<K, V> {
/// Returns a cursor pointing to the first element that is above the given bound.
///
/// Passing `Unbounded` will return a cursor pointing to the first element in the tree.
fn lower_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V>
where
K: Borrow<Q> + Ord,
Q: Ord;
/// Returns a mutable cursor pointing to the first element that is above the given bound.
///
/// Passing `Unbounded` will return a cursor pointing to the first element in the tree.
fn lower_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V>
where
K: Borrow<Q> + Ord,
Q: Ord;
/// Returns a cursor pointing to the last element that is below the given bound.
///
/// Passing `Unbounded` will return a cursor pointing to the last element in the tree.
fn upper_bound<Q>(&self, bound: Bound<&Q>) -> Cursor<'_, K, V>
where
K: Borrow<Q> + Ord,
Q: Ord;
/// Returns a mutable cursor pointing to the last element that is below the given bound.
///
/// Passing `Unbounded` will return a cursor pointing to the last element in the tree.
fn upper_bound_mut<Q>(&mut self, bound: Bound<&Q>) -> CursorMut<'_, K, V>
where
K: Borrow<Q> + Ord,
Q: Ord;
}
/// An immutable cursor which only allows read-only operations.
struct Cursor<'a, K: 'a, V: 'a>;
impl<'a, K, V> Cursor<'a, K, V> {
/// Moving the cursor to the next/previous element.
fn move_next(&mut self);
fn move_prev(&mut self);
/// Access to the key and value of the current element.
///
/// Returns `None` if the cursor points to the "ghost" non-element.
fn key(&self) -> Option<&'a K>;
fn value(&self) -> Option<&'a V>;
fn key_value(&self) -> Option<(&'a K, &'a V)>;
/// Read-only access to adjacent elements.
fn peek_next(&self) -> Option<(&'a K, &'a V)>;
fn peek_prev(&self) -> Option<(&'a K, &'a V)>;
}
/// An mutable cursor which allows mutating the tree.
struct CursorMut<'a, K: 'a, V: 'a>;
impl<'a, K, V> CursorMut<'a, K, V> {
/// Moving the cursor to the next/previous element.
fn move_next(&mut self);
fn move_prev(&mut self);
/// Access to the key and value of the current element.
///
/// Returns `None` if the cursor points to the "ghost" non-element.
fn key(&self) -> Option<&K>;
fn value(&self) -> Option<&V>;
fn value_mut(&mut self) -> Option<&mut V>;
fn key_value(&self) -> Option<(&K, &V)>;
fn key_value_mut(&self) -> Option<(&K, &mut V)>;
/// Returns a mutable reference to the of the element that the cursor is
/// currently pointing to.
///
/// This can be used to modify the key, but you must ensure that the
/// `BTreeMap` invariants are maintained. Specifically:
///
/// * The key must remain unique within the tree.
/// * The key must remain in sorted order with regards to other elements in
/// the tree.
unsafe fn key_mut_unchecked(&mut self) -> Option<&mut K>;
/// Read-only access to adjacent elements.
fn peek_next(&self) -> Option<(&K, &V)>;
fn peek_prev(&self) -> Option<(&K, &V)>;
/// Returns a read-only cursor pointing to the current element.
///
/// The lifetime of the returned `Cursor` is bound to that of the
/// `CursorMut`, which means it cannot outlive the `CursorMut` and that the
/// `CursorMut` is frozen for the lifetime of the `Cursor`.
fn as_cursor(&self) -> Cursor<'_, K, V>;
/// Inserts a new element into the `BTreeMap` after the current one.
///
/// If the cursor is pointing at the "ghost" non-element then the new element is
/// inserted at the front of the `BTreeMap`.
///
/// # Panics
///
/// This function panics if:
/// - the given key compares less than or equal to the current element (if
/// any).
/// - the given key compares greater than or equal to the next element (if
/// any).
fn insert_after(&mut self, key: K, value: V);
/// Inserts a new element into the `BTreeMap` before the current one.
///
/// If the cursor is pointing at the "ghost" non-element then the new element is
/// inserted at the end of the `BTreeMap`.
///
/// # Panics
///
/// This function panics if:
/// - the given key compares greater than or equal to the current element
/// (if any).
/// - the given key compares less than or equal to the previous element (if
/// any).
fn insert_before(&mut self, key: K, value: V);
/// Inserts a new element into the `BTreeMap` after the current one.
///
/// If the cursor is pointing at the "ghost" non-element then the new element is
/// inserted at the front of the `BTreeMap`.
///
/// # Safety
///
/// You must ensure that the `BTreeMap` invariants are maintained.
/// Specifically:
///
/// * The key of the newly inserted element must be unique in the tree.
/// * All keys in the tree must remain in sorted order.
unsafe fn insert_after_unchecked(&mut self, key: K, value: V);
/// Inserts a new element into the `BTreeMap` before the current one.
///
/// If the cursor is pointing at the "ghost" non-element then the new element is
/// inserted at the end of the `BTreeMap`.
///
/// # Safety
///
/// You must ensure that the `BTreeMap` invariants are maintained.
/// Specifically:
///
/// * The key of the newly inserted element must be unique in the tree.
/// * All keys in the tree must remain in sorted order.
unsafe fn insert_before_unchecked(&mut self, key: K, value: V);
/// Removes the current element from the `BTreeMap`.
///
/// The element that was removed is returned, and the cursor is
/// moved to point to the next element in the `BTreeMap`.
///
/// If the cursor is currently pointing to the "ghost" non-element then no element
/// is removed and `None` is returned. The cursor is not moved in this case.
fn remove_current(&mut self) -> Option<(K, V)>;
/// Removes the current element from the `BTreeMap`.
///
/// The element that was removed is returned, and the cursor is
/// moved to point to the previous element in the `BTreeMap`.
///
/// If the cursor is currently pointing to the "ghost" non-element then no element
/// is removed and `None` is returned. The cursor is not moved in this case.
fn remove_current_and_move_back(&mut self) -> Option<(K, V)>;
}
Links and related work
LinkedList
cursor RFC and tracking issue.
What happens now?
This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.