From 95f9af09fc2f2b3763e38dee3088551075f40afd Mon Sep 17 00:00:00 2001 From: James Liu <777445+root-goblin@users.noreply.github.com> Date: Sun, 1 Sep 2024 15:27:48 -0400 Subject: [PATCH] Clarify docs for std::collections Page affected: https://doc.rust-lang.org/std/collections/index.html#performance Changes: - bulleted conventions - expanded definitions on terms used - more accessible language - merged Sequence and Map performance cost tables --- std/src/collections/mod.rs | 62 +++++++++++++++++++++----------------- 1 file changed, 35 insertions(+), 27 deletions(-) diff --git a/std/src/collections/mod.rs b/std/src/collections/mod.rs index 3b04412e76630..21bebff96b956 100644 --- a/std/src/collections/mod.rs +++ b/std/src/collections/mod.rs @@ -79,41 +79,49 @@ //! see each type's documentation, and note that the names of actual methods may //! differ from the tables below on certain collections. //! -//! Throughout the documentation, we will follow a few conventions. For all -//! operations, the collection's size is denoted by n. If another collection is -//! involved in the operation, it contains m elements. Operations which have an -//! *amortized* cost are suffixed with a `*`. Operations with an *expected* -//! cost are suffixed with a `~`. +//! Throughout the documentation, we will adhere to the following conventions +//! for operation notation: //! -//! All amortized costs are for the potential need to resize when capacity is -//! exhausted. If a resize occurs it will take *O*(*n*) time. Our collections never -//! automatically shrink, so removal operations aren't amortized. Over a -//! sufficiently large series of operations, the average cost per operation will -//! deterministically equal the given cost. +//! * The collection's size is denoted by `n`. +//! * If a second collection is involved, its size is denoted by `m`. +//! * Item indices are denoted by `i`. +//! * Operations which have an *amortized* cost are suffixed with a `*`. +//! * Operations with an *expected* cost are suffixed with a `~`. //! -//! Only [`HashMap`] has expected costs, due to the probabilistic nature of hashing. -//! It is theoretically possible, though very unlikely, for [`HashMap`] to -//! experience worse performance. +//! Calling operations that add to a collection will occasionally require a +//! collection to be resized - an extra operation that takes *O*(*n*) time. //! -//! ## Sequences +//! *Amortized* costs are calculated to account for the time cost of such resize +//! operations *over a sufficiently large series of operations*. An individual +//! operation may be slower or faster due to the sporadic nature of collection +//! resizing, however the average cost per operation will approach the amortized +//! cost. //! -//! | | get(i) | insert(i) | remove(i) | append | split_off(i) | -//! |----------------|------------------------|-------------------------|------------------------|-----------|------------------------| -//! | [`Vec`] | *O*(1) | *O*(*n*-*i*)* | *O*(*n*-*i*) | *O*(*m*)* | *O*(*n*-*i*) | -//! | [`VecDeque`] | *O*(1) | *O*(min(*i*, *n*-*i*))* | *O*(min(*i*, *n*-*i*)) | *O*(*m*)* | *O*(min(*i*, *n*-*i*)) | -//! | [`LinkedList`] | *O*(min(*i*, *n*-*i*)) | *O*(min(*i*, *n*-*i*)) | *O*(min(*i*, *n*-*i*)) | *O*(1) | *O*(min(*i*, *n*-*i*)) | +//! Rust's collections never automatically shrink, so removal operations aren't +//! amortized. //! -//! Note that where ties occur, [`Vec`] is generally going to be faster than [`VecDeque`], and -//! [`VecDeque`] is generally going to be faster than [`LinkedList`]. +//! [`HashMap`] uses *expected* costs. It is theoretically possible, though very +//! unlikely, for [`HashMap`] to experience significantly worse performance than +//! the expected cost. This is due to the probabilistic nature of hashing - i.e. +//! it is possible to generate a duplicate hash given some input key that will +//! requires extra computation to correct. //! -//! ## Maps +//! ## Cost of Collection Operations //! -//! For Sets, all operations have the cost of the equivalent Map operation. //! -//! | | get | insert | remove | range | append | -//! |--------------|---------------|---------------|---------------|---------------|--------------| -//! | [`HashMap`] | *O*(1)~ | *O*(1)~* | *O*(1)~ | N/A | N/A | -//! | [`BTreeMap`] | *O*(log(*n*)) | *O*(log(*n*)) | *O*(log(*n*)) | *O*(log(*n*)) | *O*(*n*+*m*) | +//! | | get(i) | insert(i) | remove(i) | append(Vec(m)) | split_off(i) | range | append | +//! |----------------|------------------------|-------------------------|------------------------|-------------------|------------------------|-----------------|--------------| +//! | [`Vec`] | *O*(1) | *O*(*n*-*i*)* | *O*(*n*-*i*) | *O*(*m*)* | *O*(*n*-*i*) | N/A | N/A | +//! | [`VecDeque`] | *O*(1) | *O*(min(*i*, *n*-*i*))* | *O*(min(*i*, *n*-*i*)) | *O*(*m*)* | *O*(min(*i*, *n*-*i*)) | N/A | N/A | +//! | [`LinkedList`] | *O*(min(*i*, *n*-*i*)) | *O*(min(*i*, *n*-*i*)) | *O*(min(*i*, *n*-*i*)) | *O*(1) | *O*(min(*i*, *n*-*i*)) | N/A | N/A | +//! | [`HashMap`] | *O*(1)~ | *O*(1)~* | *O*(1)~ | N/A | N/A | N/A | N/A | +//! | [`BTreeMap`] | *O*(log(*n*)) | *O*(log(*n*)) | *O*(log(*n*)) | N/A | N/A | *O*(log(*n*)) | *O*(*n*+*m*) | +//! +//! Note that where ties occur, [`Vec`] is generally going to be faster than +//! [`VecDeque`], and [`VecDeque`] is generally going to be faster than +//! [`LinkedList`]. +//! +//! For Sets, all operations have the cost of the equivalent Map operation. //! //! # Correct and Efficient Usage of Collections //!