`BTreeMap::merge` optimized by asder8215 · Pull Request #152418 · rust-lang/rust

asder8215 · 2026-02-10T02:57:20Z

This is an optimized version of #151981. See ACP for more information on BTreeMap::merge does.

CC @programmerjake. Let me know what you think of how I'm using CursorMut and IntoIter here and whether the unsafe code here looks good. I decided to use ptr::read() and ptr::write() to grab the value from CursorMut as V than &mut V, use it within the conflict function, and overwrite the content of conflicting key afterward.

I know this needs some polishing, especially with refactoring some redundant looking code in a nicer way, some of which could probably just be public API methods for CursorMut. It does pass all the tests that I currently have for BTreeMap::merge (inspired from BTreeMap::append) though, so that's good.

programmerjake · 2026-02-10T08:27:42Z

library/alloc/src/collections/btree/map.rs

+                    // SAFETY: We read in self_val's and hand it over to our conflict function
+                    // which will always return a value that we can use to overwrite what's
+                    // in self_val
+                    unsafe {
+                        let val = ptr::read(self_val);
+                        let next_val = (conflict)(self_key, val, first_other_val);
+                        ptr::write(self_val, next_val);
+                    }


this should really be a method of Cursor*, you need a handler for when conflict panics that removes the entry without dropping whatever you ptr::read from.
something like:

impl<K, V> CursorMut<'a, K, V> { /// call `f` with the next entry's key and value, replacing the next entry's value with the returned value. if `f` unwinds, the next entry is removed. /// equivalent to a more efficient version of: /// ```rust /// if let Some((k, v)) = self.remove_next() { /// let v = f(&k, v); /// // Safety: key is unmodified /// unsafe { self.insert_after_unchecked(k, v) }; /// } /// ``` pub(super) fn with_next(&mut self, f: impl FnOnce(&K, V) -> V) { struct RemoveNextOnDrop<'a, 'b, K, V> { cursor: &'a mut CursorMut<'b, K, V>, forget_next: bool, } impl<K, V> Drop for RemoveNextOnDrop<'_, '_, K, V> { fn drop(&mut self) { if self.forget_next { // call an equivalent to CursorMut::remove_next() // except that instead of returning `V`, it never moves or drops it. self.0.forget_next_value(); } } } let mut remove_next_on_drop = RemoveNextOnDrop { cursor: self, forget_next: false, // we don't know that we have a next value yet }; if let Some((k, v_mut)) = remove_next_on_drop.cursor.peek_next() { remove_next_on_drop.forget_next = true; // Safety: we move the V out of the next entry, // we marked the entry's value to be forgotten // when remove_next_on_drop is dropped that // way we avoid returning to the caller leaving // a moved-out invalid value if `f` unwinds. let v = unsafe { std::ptr::read(v_mut) }; let v = f(k, v); // Safety: move the V back into the next entry unsafe { std::ptr::write(v_mut, v) }; remove_next_on_drop.forget_next = false; } } }

The equivalent CursorMutKey method should instead have f be impl FnOnce(K, V) -> (K, V) and needs to forget both the key and value since they were both ptr::read, and not just the value.

Got you 👍

library/alloc/src/collections/btree/map/tests.rs

asder8215 · 2026-02-10T16:09:57Z

Appreciate your patience and feedback in all of this @programmerjake!

asder8215 · 2026-02-11T01:37:00Z

@programmerjake Took your suggestion on with_next() on Cursor* and made and forget_next*() functions. Lmk what you think about this!

Also, I've been wondering if with_next() has uses to be a public accessible function?

library/alloc/src/collections/btree/map.rs

programmerjake · 2026-02-11T03:47:02Z

library/alloc/src/collections/btree/map.rs

+        let mut emptied_internal_root = false;
+        if let Ok(next_kv) = current.next_kv() {
+            let ((_, val), pos) =
+                next_kv.remove_kv_tracking(|| emptied_internal_root = true, self.alloc.clone());


remove_kv_tracking may panic and drop the key and value, tbh this is very complicated and makes me think it could be better for with_next() instead of calling forget_next[_key_and]_value, to transmute the cursor reference type from &mut CursorMut<K, V> to &mut CursorMut<K, ManuallyDrop<V>> and then just call remove_next. that said, idk if that would be sound, asking on Zulip: #t-libs > could we use transmute tricks in the impl of BTreeMap? @ 💬

True, I think transmute should be fine since &mut CursorMut<K, V> and &mut CursorMut<K, ManuallyDrop<V>> are the same size and it forgets the original, so we wouldn't be running the destructor on the original. (but of course I can't hold myself to that since I haven't played around with mem::transmute much myself)

it might differ with -Zrandomize-layout or other future struct layout changes. I haven't reviewed the whole BTreeMap implementation to see what layout guarantees it has.

rustbot · 2026-02-11T03:47:23Z

Reminder, once the PR becomes ready for a review, use @rustbot ready.

library/alloc/src/collections/btree/map.rs

asder8215 · 2026-02-12T21:42:53Z

@programmerjake Made the change, lmk what you think!

programmerjake

looks good enough, though I didn't really review the tests.

View changes since this review

library/alloc/src/collections/btree/map.rs

asder8215 · 2026-02-13T00:05:58Z

The tests I took verbatim from append, which seemed thorough enough in examining whether the edges are fixed after insertion, drop behavior, and checking if the key is not overwritten on conflict.

One thing I'll point out is that interestingly this append test:

#[test]
#[cfg_attr(not(panic = "unwind"), ignore = "test requires unwinding support")]
fn test_append_drop_leak() {
    let a = CrashTestDummy::new(0);
    let b = CrashTestDummy::new(1);
    let c = CrashTestDummy::new(2);
    let mut left = BTreeMap::new();
    let mut right = BTreeMap::new();
    left.insert(a.spawn(Panic::Never), ());
    left.insert(b.spawn(Panic::Never), ());
    left.insert(c.spawn(Panic::Never), ());
    right.insert(b.spawn(Panic::InDrop), ()); // first duplicate key, dropped during append
    right.insert(c.spawn(Panic::Never), ());

    catch_unwind(move || left.append(&mut right)).unwrap_err();
    assert_eq!(a.dropped(), 1);
    assert_eq!(b.dropped(), 1); // should be 2 were it not for Rust issue #47949
    assert_eq!(c.dropped(), 2);
}

When I ported a similar version of this for merge:

#[test]
#[cfg_attr(not(panic = "unwind"), ignore = "test requires unwinding support")]
fn test_merge_drop_leak() {
    let a = CrashTestDummy::new(0);
    let b = CrashTestDummy::new(1);
    let c = CrashTestDummy::new(2);
    let mut left = BTreeMap::new();
    let mut right = BTreeMap::new();
    left.insert(a.spawn(Panic::Never), ());
    left.insert(b.spawn(Panic::Never), ());
    left.insert(c.spawn(Panic::Never), ());
    right.insert(b.spawn(Panic::InDrop), ()); // first duplicate key, dropped during append
    right.insert(c.spawn(Panic::Never), ());

    catch_unwind(move || left.merge(right, |_, _, _| ())).unwrap_err();
    assert_eq!(a.dropped(), 1);
    assert_eq!(b.dropped(), 2);
    assert_eq!(c.dropped(), 2);
}

The assert_eq!(b.dropped(), #) gave me 2 instead of 1 (which is expected behavior it seems).

library/alloc/src/collections/btree/map/tests.rs

asder8215 · 2026-02-14T01:42:05Z

I added a panic test within conflict, but I think due to #47949, it isn't incrementing the counter for b_val* and c_val* on drop. The left BTreeMap does have a length of 1 though.

rustbot · 2026-02-15T22:22:15Z

r? @jhpratt

rustbot has assigned @jhpratt.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

Owners of files modified in this PR: libs
libs expanded to 7 candidates
Random selection from Mark-Simulacrum, jhpratt, joboet

asder8215 · 2026-02-17T01:08:08Z

r? @Mark-Simulacrum

Mark was the reviewer of my previous BTreeMap merge PR (which uses double iterator) and I recall that jhpratt mentioned he's stepping off rotation for a bit last week on the previous BTreeMap PR (not sure if it's true right now), so I'm just rolling this review to Mark.

jhpratt · 2026-02-17T01:48:05Z

I put myself back on rotation, but I don't mind being reassigned :) It would've been a couple days before I could review anyways.

Mark-Simulacrum · 2026-02-22T23:36:17Z

library/alloc/src/collections/btree/map.rs

+    ///
+    /// If the cursor is at the end of the map then the function is not called
+    /// and this essentially does not do anything.
+    pub(super) fn with_next(&mut self, f: impl FnOnce(&K, V) -> V) {


I feel like with_next having mutating action on self is not particularly intuitive. Maybe we can call this something like replace_next? But also it seems like at least the current impl can just be inlined into the call site and we can remove this entirely?

I'm fine with renaming it to replace_next(). I also can see it being inlined as well due to how short the code is and we don't need that function call overhead.

I was hoping this would be re-implemented in the future such that it actually replaces the content of the next node (removing the entry on panic) rather than remove always and then insert. Is there a way to make a note of this optimization neatly?

I'd probably cut an issue with some text describing the proposed optimization (and why we haven't done it) and then add a FIXME comment in both(?) places we'd inline the implementation to. Easier to read right now and once the implementation is proposed it can refactor depending on how it looks exactly.

I created the issue here with this optimization (and other optimizations) to be resolved for BTreeMap::merge.

library/alloc/src/collections/btree/map.rs

Mark-Simulacrum · 2026-02-22T23:51:04Z

library/alloc/src/collections/btree/map.rs

+                    }
+                } else {
+                    // SAFETY: reaching here means our cursor is at the end
+                    // self BTreeMap so we just insert other_key here


It seems like this is also probably inefficient: we'd want to do a bulk build here, especially if we're merging in a much larger other map into self. I think this is similar to the linear search above on Ordering::Greater.

I think it's worth at least adding an unresolved question and/or expanding the merge function's doc comment to say something about merges with expected few-matches being better done with e.g. self.into_iter().sorted_zip(other.into_iter()).collect::<BTreeMap<_, _>>(), given a sorted_zip that does the advancing of left/right (merge sort's merge effectively). I'm not sure what the constant factors are -- probably depends on how expensive allocation and moving the keys/values around is -- but it seems remiss not to mention it.

It seems like this is also probably inefficient: we'd want to do a bulk build here, especially if we're merging in a much larger other map into self.

I agree that bulk building has to occur in the else condition, where we know that our other map leftovers keys are greater than self map.

I think this is similar to the linear search above on Ordering::Greater.

Yea, this needs a method that acts similarly to BTreeMap::lower_bound_mut/BTreeMap::upper_bound_mut for Cursor*, so that we don't have multiple .next() calls and we can traverse through the underlying map structure efficiently in logarithmic time.

I think it's worth at least adding an unresolved question and/or expanding the merge function's doc comment to say something about merges with expected few-matches being better done with e.g. self.into_iter().sorted_zip(other.into_iter()).collect::<BTreeMap<_, _>>(), given a sorted_zip that does the advancing of left/right (merge sort's merge effectively)

I'll leave a FIXME here noting how this could be improved with batching/bulk pushing the items (which could possibly be another method that Cursor* should be able to support?). My only hesitation with the example code self.into_iter().sorted_zip(other.into_iter()).collect::<BTreeMap<_, _>>() is how collecting an iterator into a BTreeMap works. I don't imagine that it has knowledge that the iterator is sorted, so it would end up taking O(nlg(n)) since you have n elements from the sorted merged iterator and if it's traversing from root, it could take lg(n) time to find the right spot to insert the element. There is a bulk_build_from_sorted_iter() method that BTreeMap has, but that's private.

Yeah, right now it collects into a Vec and then sorts that:

rust/library/alloc/src/collections/btree/map.rs

Lines 2385 to 2393 in b3869b9

let mut inputs: Vec<_> = iter.into_iter().collect();

if inputs.is_empty() {

return BTreeMap::new();

}

// use stable sort to preserve the insertion order.

inputs.sort_by(|a, b| a.0.cmp(&b.0));

BTreeMap::bulk_build_from_sorted_iter(inputs, Global)

I suspect that implies that code shouldn't get written as-is exactly in most cases, but we are missing some kind of bulk construction primitive. I don't know how commonly that would actually matter -- I've had fairly few cases I'm doing anything with large BTrees personally, much less merging them -- so it's not clear we need/want a public API. On the other hand, maybe there's an API shape that avoids the merge operation being needed.

I think a bulk construction that could work if we're able to traverse through the BTreeMap directly and ascend and descend through the levels, inserting it into the appropriate places.

I think doing that would at least minimize the best case work of the inputs being sorted already to O(n), and in the avg/worst case it would take O(nlgn), since in the sorted case it'll take O(1) to find the spot to insert the item relative to the current node that we're looking at, and in the unsorted case we would need to traverse through the map and entries to find the right spot (which would be O(lgn)). I don't know if that would be faster, but maybe worth experimenting and benchmarking?

Mark-Simulacrum · 2026-02-24T02:28:50Z

If you can squash the commits a little (not sure how much useful is in the history, but seems like there's some that ought to get removed), r=me

…o BTreeMap::append)

…s on bulk inserting other_keys into self map, and inlined with_next() insertion on conflicts from Cursor*

asder8215 · 2026-02-24T08:22:37Z

@Mark-Simulacrum squashed the commits into 3 separate commits. Lmk if there's anything else I need to do!

Mark-Simulacrum

@bors r+

View changes since this review

rust-bors · 2026-02-25T03:19:38Z

📌 Commit cbdcfca has been approved by Mark-Simulacrum

It is now in the queue for this repository.

Rollup of 12 pull requests Successful merges: - #149169 (ptr::replace: make calls on ZST null ptr not UB) - #150562 (Fix doc link used in suggestion for pinning self) - #152418 (`BTreeMap::merge` optimized) - #152679 (rustc_expand: improve diagnostics for non-repeatable metavars) - #152952 (mGCA: improve ogca diagnostic message ) - #152977 (Fix relative path handling for --extern-html-root-url) - #153017 (Implement debuginfo for unsafe binder types) - #152868 (delete some very old trivial `Box` tests) - #152922 (rustc_public: Make fields that shouldn't be exposed visible only in `rustc_public`) - #153032 (Fix attribute parser and kind names.) - #153051 (Migration of `LintDiagnostic` - part 3) - #153060 (Give a better error when updating a submodule fails)

@programmerjake

Rollup merge of #152418 - asder8215:btreemap_merge_optimized, r=Mark-Simulacrum `BTreeMap::merge` optimized This is an optimized version of #151981. See [ACP](rust-lang/libs-team#739 (comment)) for more information on `BTreeMap::merge` does. CC @programmerjake. Let me know what you think of how I'm using `CursorMut` and `IntoIter` here and whether the unsafe code here looks good. I decided to use `ptr::read()` and `ptr::write()` to grab the value from `CursorMut` as `V` than `&mut V`, use it within the `conflict` function, and overwrite the content of conflicting key afterward. I know this needs some polishing, especially with refactoring some redundant looking code in a nicer way, some of which could probably just be public API methods for `CursorMut`. It does pass all the tests that I currently have for `BTreeMap::merge` (inspired from `BTreeMap::append`) though, so that's good.

Kobzol · 2026-02-25T11:23:28Z

@rust-timer build 73143b0

For #153074.

rust-timer · 2026-02-25T11:23:31Z

Missing artifact for sha 73143b06c7c1a162a7e701b4e0863f3b2179f725 (https://ci-artifacts.rust-lang.org/rustc-builds/73143b06c7c1a162a7e701b4e0863f3b2179f725/rustc-nightly-x86_64-unknown-linux-gnu.tar.xz); not built yet, try again later.

JonathanBrouwer · 2026-02-25T12:21:30Z

@rust-timer build 73143b0

rust-timer · 2026-02-25T12:21:33Z

Queued 73143b0 with parent 11ad63a, future comparison URL.
There are currently 3 preceding artifacts in the queue.
It will probably take at least ~3.6 hours until the benchmark run finishes.

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 10, 2026

asder8215 force-pushed the btreemap_merge_optimized branch from a5a105e to a7fa7dd Compare February 10, 2026 02:59

programmerjake reviewed Feb 10, 2026

View reviewed changes

library/alloc/src/collections/btree/map/tests.rs Show resolved Hide resolved

programmerjake reviewed Feb 10, 2026

View reviewed changes

library/alloc/src/collections/btree/map/tests.rs Show resolved Hide resolved

programmerjake requested changes Feb 11, 2026

View reviewed changes

asder8215 force-pushed the btreemap_merge_optimized branch from 6d826f5 to 97f5547 Compare February 11, 2026 04:25

programmerjake reviewed Feb 12, 2026

View reviewed changes

library/alloc/src/collections/btree/map.rs Outdated Show resolved Hide resolved

asder8215 force-pushed the btreemap_merge_optimized branch from b26e4e1 to 6c436c1 Compare February 12, 2026 21:48

asder8215 requested a review from programmerjake February 12, 2026 21:55

asder8215 force-pushed the btreemap_merge_optimized branch from 6c436c1 to 9b423c5 Compare February 12, 2026 21:56

programmerjake reviewed Feb 12, 2026

View reviewed changes

asder8215 requested a review from programmerjake February 13, 2026 00:10

programmerjake reviewed Feb 13, 2026

View reviewed changes

library/alloc/src/collections/btree/map/tests.rs Show resolved Hide resolved

asder8215 requested a review from programmerjake February 14, 2026 01:42

asder8215 mentioned this pull request Feb 14, 2026

implement BTreeMap::merge #151981

Closed

asder8215 marked this pull request as ready for review February 15, 2026 22:22

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 15, 2026

rustbot assigned jhpratt Feb 15, 2026

rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Feb 15, 2026

asder8215 mentioned this pull request Feb 16, 2026

Tracking Issue for BTreeMap::merge #152152

Open

7 tasks

rustbot assigned Mark-Simulacrum and unassigned jhpratt Feb 17, 2026

Mark-Simulacrum requested changes Feb 22, 2026

View reviewed changes

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 22, 2026

asder8215 requested a review from Mark-Simulacrum February 23, 2026 05:40

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 23, 2026

asder8215 added 3 commits February 24, 2026 02:38

feat: BTreeMap::merge implemented with into iterators only (similar t…

1b50859

…o BTreeMap::append)

Optimized BTreeMap::merge using CursorMut

24efac1

Swapped key comparisons to match lower_bound_mut, added FIXME comment…

cbdcfca

…s on bulk inserting other_keys into self map, and inlined with_next() insertion on conflicts from Cursor*

asder8215 force-pushed the btreemap_merge_optimized branch from e6f0d43 to cbdcfca Compare February 24, 2026 07:44

asder8215 mentioned this pull request Feb 24, 2026

Optimizations to be performed on BTreeMap::merge #153040

Open

3 tasks

Mark-Simulacrum approved these changes Feb 25, 2026

View reviewed changes

rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 25, 2026

jhpratt mentioned this pull request Feb 25, 2026

Rollup of 12 pull requests #153074

Merged

rust-bors bot merged commit 5c47d0b into rust-lang:main Feb 25, 2026
11 checks passed

rustbot added this to the 1.95.0 milestone Feb 25, 2026

	let mut inputs: Vec<_> = iter.into_iter().collect();

	if inputs.is_empty() {
	return BTreeMap::new();
	}

	// use stable sort to preserve the insertion order.
	inputs.sort_by(\|a, b\| a.0.cmp(&b.0));
	BTreeMap::bulk_build_from_sorted_iter(inputs, Global)

Uh oh!

Comments

Conversation

asder8215 commented Feb 10, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

asder8215 commented Feb 10, 2026

Uh oh!

asder8215 commented Feb 11, 2026

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

asder8215 Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rustbot commented Feb 11, 2026

Uh oh!

Uh oh!

asder8215 commented Feb 12, 2026

Uh oh!

programmerjake left a comment • edited by rustbot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

asder8215 commented Feb 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

asder8215 commented Feb 14, 2026

Uh oh!

rustbot commented Feb 15, 2026

Uh oh!

asder8215 commented Feb 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jhpratt commented Feb 17, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

asder8215 Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

asder8215 Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Mark-Simulacrum commented Feb 24, 2026

Uh oh!

asder8215 commented Feb 24, 2026

Uh oh!

Mark-Simulacrum left a comment • edited by rustbot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rust-bors bot commented Feb 25, 2026

asder8215 Feb 11, 2026 •

edited

Loading

programmerjake left a comment •

edited by rustbot

Loading

asder8215 commented Feb 13, 2026 •

edited

Loading

asder8215 commented Feb 17, 2026 •

edited

Loading

asder8215 Feb 24, 2026 •

edited

Loading

asder8215 Feb 24, 2026 •

edited

Loading

Mark-Simulacrum left a comment •

edited by rustbot

Loading