BTreeMap::merge optimized#152418
Conversation
a5a105e to
a7fa7dd
Compare
| // SAFETY: We read in self_val's and hand it over to our conflict function | ||
| // which will always return a value that we can use to overwrite what's | ||
| // in self_val | ||
| unsafe { | ||
| let val = ptr::read(self_val); | ||
| let next_val = (conflict)(self_key, val, first_other_val); | ||
| ptr::write(self_val, next_val); | ||
| } |
There was a problem hiding this comment.
this should really be a method of Cursor*, you need a handler for when conflict panics that removes the entry without dropping whatever you ptr::read from.
something like:
impl<K, V> CursorMut<'a, K, V> {
/// call `f` with the next entry's key and value, replacing the next entry's value with the returned value. if `f` unwinds, the next entry is removed.
/// equivalent to a more efficient version of:
/// ```rust
/// if let Some((k, v)) = self.remove_next() {
/// let v = f(&k, v);
/// // Safety: key is unmodified
/// unsafe { self.insert_after_unchecked(k, v) };
/// }
/// ```
pub(super) fn with_next(&mut self, f: impl FnOnce(&K, V) -> V) {
struct RemoveNextOnDrop<'a, 'b, K, V> {
cursor: &'a mut CursorMut<'b, K, V>,
forget_next: bool,
}
impl<K, V> Drop for RemoveNextOnDrop<'_, '_, K, V> {
fn drop(&mut self) {
if self.forget_next {
// call an equivalent to CursorMut::remove_next()
// except that instead of returning `V`, it never moves or drops it.
self.0.forget_next_value();
}
}
}
let mut remove_next_on_drop = RemoveNextOnDrop {
cursor: self,
forget_next: false, // we don't know that we have a next value yet
};
if let Some((k, v_mut)) = remove_next_on_drop.cursor.peek_next() {
remove_next_on_drop.forget_next = true;
// Safety: we move the V out of the next entry,
// we marked the entry's value to be forgotten
// when remove_next_on_drop is dropped that
// way we avoid returning to the caller leaving
// a moved-out invalid value if `f` unwinds.
let v = unsafe { std::ptr::read(v_mut) };
let v = f(k, v);
// Safety: move the V back into the next entry
unsafe { std::ptr::write(v_mut, v) };
remove_next_on_drop.forget_next = false;
}
}
}The equivalent CursorMutKey method should instead have f be impl FnOnce(K, V) -> (K, V) and needs to forget both the key and value since they were both ptr::read, and not just the value.
|
Appreciate your patience and feedback in all of this @programmerjake! |
|
@programmerjake Took your suggestion on Also, I've been wondering if |
| let mut emptied_internal_root = false; | ||
| if let Ok(next_kv) = current.next_kv() { | ||
| let ((_, val), pos) = | ||
| next_kv.remove_kv_tracking(|| emptied_internal_root = true, self.alloc.clone()); |
There was a problem hiding this comment.
remove_kv_tracking may panic and drop the key and value, tbh this is very complicated and makes me think it could be better for with_next() instead of calling forget_next[_key_and]_value, to transmute the cursor reference type from &mut CursorMut<K, V> to &mut CursorMut<K, ManuallyDrop<V>> and then just call remove_next. that said, idk if that would be sound, asking on Zulip: #t-libs > could we use transmute tricks in the impl of BTreeMap? @ 💬
There was a problem hiding this comment.
True, I think transmute should be fine since &mut CursorMut<K, V> and &mut CursorMut<K, ManuallyDrop<V>> are the same size and it forgets the original, so we wouldn't be running the destructor on the original. (but of course I can't hold myself to that since I haven't played around with mem::transmute much myself)
There was a problem hiding this comment.
it might differ with -Zrandomize-layout or other future struct layout changes. I haven't reviewed the whole BTreeMap implementation to see what layout guarantees it has.
|
Reminder, once the PR becomes ready for a review, use |
6d826f5 to
97f5547
Compare
|
@programmerjake Made the change, lmk what you think! |
b26e4e1 to
6c436c1
Compare
6c436c1 to
9b423c5
Compare
|
The tests I took verbatim from One thing I'll point out is that interestingly this #[test]
#[cfg_attr(not(panic = "unwind"), ignore = "test requires unwinding support")]
fn test_append_drop_leak() {
let a = CrashTestDummy::new(0);
let b = CrashTestDummy::new(1);
let c = CrashTestDummy::new(2);
let mut left = BTreeMap::new();
let mut right = BTreeMap::new();
left.insert(a.spawn(Panic::Never), ());
left.insert(b.spawn(Panic::Never), ());
left.insert(c.spawn(Panic::Never), ());
right.insert(b.spawn(Panic::InDrop), ()); // first duplicate key, dropped during append
right.insert(c.spawn(Panic::Never), ());
catch_unwind(move || left.append(&mut right)).unwrap_err();
assert_eq!(a.dropped(), 1);
assert_eq!(b.dropped(), 1); // should be 2 were it not for Rust issue #47949
assert_eq!(c.dropped(), 2);
}When I ported a similar version of this for #[test]
#[cfg_attr(not(panic = "unwind"), ignore = "test requires unwinding support")]
fn test_merge_drop_leak() {
let a = CrashTestDummy::new(0);
let b = CrashTestDummy::new(1);
let c = CrashTestDummy::new(2);
let mut left = BTreeMap::new();
let mut right = BTreeMap::new();
left.insert(a.spawn(Panic::Never), ());
left.insert(b.spawn(Panic::Never), ());
left.insert(c.spawn(Panic::Never), ());
right.insert(b.spawn(Panic::InDrop), ()); // first duplicate key, dropped during append
right.insert(c.spawn(Panic::Never), ());
catch_unwind(move || left.merge(right, |_, _, _| ())).unwrap_err();
assert_eq!(a.dropped(), 1);
assert_eq!(b.dropped(), 2);
assert_eq!(c.dropped(), 2);
}The |
|
I added a panic test within conflict, but I think due to #47949, it isn't incrementing the counter for |
|
r? @jhpratt rustbot has assigned @jhpratt. Use Why was this reviewer chosen?The reviewer was selected based on:
|
|
Mark was the reviewer of my previous BTreeMap merge PR (which uses double iterator) and I recall that jhpratt mentioned he's stepping off rotation for a bit last week on the previous BTreeMap PR (not sure if it's true right now), so I'm just rolling this review to Mark. |
|
I put myself back on rotation, but I don't mind being reassigned :) It would've been a couple days before I could review anyways. |
| /// | ||
| /// If the cursor is at the end of the map then the function is not called | ||
| /// and this essentially does not do anything. | ||
| pub(super) fn with_next(&mut self, f: impl FnOnce(&K, V) -> V) { |
There was a problem hiding this comment.
I feel like with_next having mutating action on self is not particularly intuitive. Maybe we can call this something like replace_next? But also it seems like at least the current impl can just be inlined into the call site and we can remove this entirely?
There was a problem hiding this comment.
I'm fine with renaming it to replace_next(). I also can see it being inlined as well due to how short the code is and we don't need that function call overhead.
I was hoping this would be re-implemented in the future such that it actually replaces the content of the next node (removing the entry on panic) rather than remove always and then insert. Is there a way to make a note of this optimization neatly?
There was a problem hiding this comment.
I'd probably cut an issue with some text describing the proposed optimization (and why we haven't done it) and then add a FIXME comment in both(?) places we'd inline the implementation to. Easier to read right now and once the implementation is proposed it can refactor depending on how it looks exactly.
There was a problem hiding this comment.
I created the issue here with this optimization (and other optimizations) to be resolved for BTreeMap::merge.
| } | ||
| } else { | ||
| // SAFETY: reaching here means our cursor is at the end | ||
| // self BTreeMap so we just insert other_key here |
There was a problem hiding this comment.
It seems like this is also probably inefficient: we'd want to do a bulk build here, especially if we're merging in a much larger other map into self. I think this is similar to the linear search above on Ordering::Greater.
I think it's worth at least adding an unresolved question and/or expanding the merge function's doc comment to say something about merges with expected few-matches being better done with e.g. self.into_iter().sorted_zip(other.into_iter()).collect::<BTreeMap<_, _>>(), given a sorted_zip that does the advancing of left/right (merge sort's merge effectively). I'm not sure what the constant factors are -- probably depends on how expensive allocation and moving the keys/values around is -- but it seems remiss not to mention it.
There was a problem hiding this comment.
It seems like this is also probably inefficient: we'd want to do a bulk build here, especially if we're merging in a much larger other map into self.
I agree that bulk building has to occur in the else condition, where we know that our other map leftovers keys are greater than self map.
I think this is similar to the linear search above on Ordering::Greater.
Yea, this needs a method that acts similarly to BTreeMap::lower_bound_mut/BTreeMap::upper_bound_mut for Cursor*, so that we don't have multiple .next() calls and we can traverse through the underlying map structure efficiently in logarithmic time.
I think it's worth at least adding an unresolved question and/or expanding the merge function's doc comment to say something about merges with expected few-matches being better done with e.g. self.into_iter().sorted_zip(other.into_iter()).collect::<BTreeMap<_, _>>(), given a sorted_zip that does the advancing of left/right (merge sort's merge effectively)
I'll leave a FIXME here noting how this could be improved with batching/bulk pushing the items (which could possibly be another method that Cursor* should be able to support?). My only hesitation with the example code self.into_iter().sorted_zip(other.into_iter()).collect::<BTreeMap<_, _>>() is how collecting an iterator into a BTreeMap works. I don't imagine that it has knowledge that the iterator is sorted, so it would end up taking O(nlg(n)) since you have n elements from the sorted merged iterator and if it's traversing from root, it could take lg(n) time to find the right spot to insert the element. There is a bulk_build_from_sorted_iter() method that BTreeMap has, but that's private.
There was a problem hiding this comment.
Yeah, right now it collects into a Vec and then sorts that:
rust/library/alloc/src/collections/btree/map.rs
Lines 2385 to 2393 in b3869b9
I suspect that implies that code shouldn't get written as-is exactly in most cases, but we are missing some kind of bulk construction primitive. I don't know how commonly that would actually matter -- I've had fairly few cases I'm doing anything with large BTrees personally, much less merging them -- so it's not clear we need/want a public API. On the other hand, maybe there's an API shape that avoids the merge operation being needed.
There was a problem hiding this comment.
I think a bulk construction that could work if we're able to traverse through the BTreeMap directly and ascend and descend through the levels, inserting it into the appropriate places.
I think doing that would at least minimize the best case work of the inputs being sorted already to O(n), and in the avg/worst case it would take O(nlgn), since in the sorted case it'll take O(1) to find the spot to insert the item relative to the current node that we're looking at, and in the unsorted case we would need to traverse through the map and entries to find the right spot (which would be O(lgn)). I don't know if that would be faster, but maybe worth experimenting and benchmarking?
|
If you can squash the commits a little (not sure how much useful is in the history, but seems like there's some that ought to get removed), r=me |
…o BTreeMap::append)
…s on bulk inserting other_keys into self map, and inlined with_next() insertion on conflicts from Cursor*
e6f0d43 to
cbdcfca
Compare
|
@Mark-Simulacrum squashed the commits into 3 separate commits. Lmk if there's anything else I need to do! |
Rollup of 12 pull requests Successful merges: - #149169 (ptr::replace: make calls on ZST null ptr not UB) - #150562 (Fix doc link used in suggestion for pinning self) - #152418 (`BTreeMap::merge` optimized) - #152679 (rustc_expand: improve diagnostics for non-repeatable metavars) - #152952 (mGCA: improve ogca diagnostic message ) - #152977 (Fix relative path handling for --extern-html-root-url) - #153017 (Implement debuginfo for unsafe binder types) - #152868 (delete some very old trivial `Box` tests) - #152922 (rustc_public: Make fields that shouldn't be exposed visible only in `rustc_public`) - #153032 (Fix attribute parser and kind names.) - #153051 (Migration of `LintDiagnostic` - part 3) - #153060 (Give a better error when updating a submodule fails)
Rollup merge of #152418 - asder8215:btreemap_merge_optimized, r=Mark-Simulacrum `BTreeMap::merge` optimized This is an optimized version of #151981. See [ACP](rust-lang/libs-team#739 (comment)) for more information on `BTreeMap::merge` does. CC @programmerjake. Let me know what you think of how I'm using `CursorMut` and `IntoIter` here and whether the unsafe code here looks good. I decided to use `ptr::read()` and `ptr::write()` to grab the value from `CursorMut` as `V` than `&mut V`, use it within the `conflict` function, and overwrite the content of conflicting key afterward. I know this needs some polishing, especially with refactoring some redundant looking code in a nicer way, some of which could probably just be public API methods for `CursorMut`. It does pass all the tests that I currently have for `BTreeMap::merge` (inspired from `BTreeMap::append`) though, so that's good.
|
@rust-timer build 73143b0 For #153074. |
|
Missing artifact for sha |
|
@rust-timer build 73143b0 |
|
Queued 73143b0 with parent 11ad63a, future comparison URL. |
This is an optimized version of #151981. See ACP for more information on
BTreeMap::mergedoes.CC @programmerjake. Let me know what you think of how I'm using
CursorMutandIntoIterhere and whether the unsafe code here looks good. I decided to useptr::read()andptr::write()to grab the value fromCursorMutasVthan&mut V, use it within theconflictfunction, and overwrite the content of conflicting key afterward.I know this needs some polishing, especially with refactoring some redundant looking code in a nicer way, some of which could probably just be public API methods for
CursorMut. It does pass all the tests that I currently have forBTreeMap::merge(inspired fromBTreeMap::append) though, so that's good.