Skip to content

Vec's reallocation strategy needs settling #29931

Open
@Gankra

Description

@Gankra

Background

Currently, Vec's documentation frequently contains the caveat that "more space may be reserved than requested". This is primarily in response to the fact that jemalloc (or any other allocator) can actually reserve more space than you requested because it relies on fixed size-classes to more effeciently dole out memory. (see the table here)

Jemalloc itself exposes a malloc_usable_size function which can be used to determine how much capacity was actually allocated for a pointer, as well as nallocx which can be used to determine how much capacity will be allocated for a given size and alignment. Vec can in principle query one of these methods and update its capacity field with the result.

The question at hand is: is this ever worth it to check usable_capacity, and is it ever undesirable?

This issue was kicked off by Firefox devs who have experience doing exactly this optimization, and claim it being profitable. Facebook has also claimed excellent dividends in making its allocation strategy more jemalloc friendly, but to my knowledge do not actually query usable_capacity.

Currently our alloction strategy is almost completely naive. One can see the source here, which boils down to:

let new_cap = if self.cap == 0 { 
    4
} else { 
    2 * self.cap 
}

To the best of my knowledge, this is all the capacity logic for FBVector, which boils down to:

let new cap = if self.cap == 0 {
    max(64 / size_of_elem, 1)   // empty
} else if self.cap < 4096 / size_of_elem) {
    self.cap * 2                // small
} else if self.cap > 4096 * 32 / size_of_elem) {
    self.cap * 2                // huge
} else {
    (self.cap * 3 + 1) / 2      // moderate
}

The only major deviation from Rust today being a 1.5 growth factor for "moderate" sized allocations. Note that there is a path in their small_vector type that queries malloc_usable_size.

Unfortunately I've been unable to find good information on what Firefox does here. The best I could find is this discussion which demonstrates big wins -- 6.5% less calls to malloc and 11% less memory usage. However the effort seems to peter out when it is asserted that this is obsoleted by Gecko just using power-of-two capacities. mxr and dxr seem to suggest it's only used for statistics.

Hopefully Firefox and Facebook devs can chime in on any experiences.

Going Forward

I have several outstanding concerns before we push further on this matter.

Rust is not C++, so I am partially suspicious of any benchmarks that demonstrate value in C++. In particular, the proliferation of size_hint and extend could completely blast away any need to be clever with capacities for most programs. I would also expect a large Rust application to frob the allocator less in general just because we default to move semantics, and don't have implicit assignment/copy constructors. Also, Rust programs are much more "reckless" with just passing around pointers into buffers because the borrow checker will always catch misuse. Slices in particular make this incredibly ergonomic. We need Rust-specific benchmarks. I'm sure the Hyper, Servo, Glium, Serde, and Rust devs all have some interesting workloads to look at.

Rust's Vec is basically the community's malloc/calloc, because actual malloc is unstable and unsafe. We explicitly support extracting the Vec's pointer and taking control of the allocation. In that regard I believe it is desirable for it to be maximally well-behaved and performant for "good" cases (knowing exactly the capacity you want, having no use for extra capacity). There's a certain value in requesting a certain capacity, and getting the capacity. Both nallocx and malloc_usable_size are virtual function calls with non-trivial logic, and may have unacceptable overhead for responsible users of Vec.

Note that anything we do must enable Vec and Box<[T]> to reliably roundtrip through each other without reallocating in certain cases. If I invoke shrink_to_fit or with_capacity, it ought to not reallocate when I try to convert to a Box<[T]>. As far as I can tell, this should be possible to uphold even when using malloc_usable_size because jemalloc is "fuzzy" and only requires that the given size is somewhere between the requested one and the usable one.

Anything we do must also work with allocators that aren't jemalloc. This may be as simple as setting usable_size = requested_size for every allocator that isn't jemalloc.

CC @pnkfelix @erickt @reem @seanmonstar @tomaka @SimonSapin @larsberg @pcwalton @Swatinem @nnethercote

CC #29848 #29847 #27627

CC @rust-lang/libs

EDIT(workingjubilee, 2023-05-06): Originally one of the links touched on the master branch of folly instead of a permalink to a specific commit. Linkrot claimed the one to jemalloc's docs. These have been updated with a permalink and a Wayback Machine link, respectively.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-allocatorsArea: Custom and system allocatorsA-collectionsArea: `std::collections`C-enhancementCategory: An issue proposing an enhancement or a PR with one.Libs-TrackedLibs issues that are tracked on the team's project board.T-libs-apiRelevant to the library API team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions