Fastest way to initialize a vector is not documented #54628

Closed

Description

There are at least three distinct ways to create a zero-filled vector of a given length:

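use std::iter::repeat;

// resize: reserve capacity, then fill with zeros
let mut vec1 = Vec::with_capacity(len);
vec1.resize(len, 0);
// extend: reserve capacity, then append `len` zeros from an iterator
let mut vec2 = Vec::with_capacity(len);
vec2.extend(repeat(0).take(len));
// vec! macro
let mut vec3 = vec![0; len];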

Although the last is the most concise, the other two also show up in real-world code. The performance characteristics of these solutions are not obvious and are not documented, at least not on the Vec page in the stdlib reference.
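To make the comparison concrete, here is a minimal sketch that wraps each of the three approaches in a function and adds a small timing helper. The function names (`zeroed_resize`, `zeroed_extend`, `zeroed_macro`, `time_init`) and the `u8` element type are assumptions chosen for illustration; they do not come from the standard library or from Claxon.

```rust
use std::iter::repeat;
use std::time::{Duration, Instant};

/// Zero-filled vector via `with_capacity` followed by `resize`.
fn zeroed_resize(len: usize) -> Vec<u8> {
    let mut v = Vec::with_capacity(len);
    v.resize(len, 0);
    v
}

/// Zero-filled vector via `with_capacity` followed by `extend`.
fn zeroed_extend(len: usize) -> Vec<u8> {
    let mut v = Vec::with_capacity(len);
    v.extend(repeat(0).take(len));
    v
}

/// Zero-filled vector via the `vec!` macro.
fn zeroed_macro(len: usize) -> Vec<u8> {
    vec![0; len]
}

/// Runs `init` once and returns its result together with the elapsed
/// wall-clock time. A single run is only a rough indication; a proper
/// benchmark harness would repeat the measurement many times.
fn time_init<F: FnOnce() -> Vec<u8>>(init: F) -> (Vec<u8>, Duration) {
    let start = Instant::now();
    let v = init();
    (v, start.elapsed())
}
```

All three functions return vectors with identical contents, so any difference is purely one of speed. Calling, for example, `time_init(|| zeroed_extend(1 << 26))` and `time_init(|| zeroed_macro(1 << 26))` in a release build gives a first impression of the gap on a given platform.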

This has led to an actual security vulnerability in the Claxon crate, now known as RUSTSEC-2018-0004. On some malformed inputs, the contents of uninitialized memory would show up in the output. See the original bug report or the security advisory for more details.

Like most binary format decoders, Claxon writes into a preallocated buffer. The memory unsafety that led to the vulnerability was introduced to speed up initialization of that buffer. Initialization was originally performed like this: buffer.extend(repeat(0).take(new_len - len)), but it was replaced with unsafe { buffer.set_len(new_len); } for performance (see relevant commit).
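For reference, a safe way to grow an existing buffer might look like the sketch below. It is not Claxon's actual code: the function names, the `i32` element type, and the guard against `new_len` being smaller than the current length are all assumptions added for illustration.

```rust
use std::iter::repeat;

/// Grows `buffer` to `new_len` by appending zeros with `extend`,
/// mirroring the expression quoted above. Does nothing if the buffer
/// is already long enough (the guard is an addition for this sketch).
fn grow_with_extend(buffer: &mut Vec<i32>, new_len: usize) {
    let len = buffer.len();
    if new_len > len {
        buffer.extend(repeat(0).take(new_len - len));
    }
}

/// Same effect using `resize`, which also never exposes
/// uninitialized memory. Only grows, never shrinks.
fn grow_with_resize(buffer: &mut Vec<i32>, new_len: usize) {
    if new_len > buffer.len() {
        buffer.resize(new_len, 0);
    }
}
```

Both helpers leave every newly added element set to zero, so a decoder that fails to overwrite part of the buffer would output zeros rather than whatever happened to be in memory.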

Anything that desugars into RawVec::with_capacity_zeroed() is dramatically more efficient than .extend(), at least on Linux. One public wrapper for this function is the vec! macro. Replacing Vec::with_capacity(new_len); unsafe { buffer.set_len(new_len); } with vec![0; new_len]; created no measurable performance difference in Claxon, eliminating the need for unsafe code.
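The replacement described above can be sketched as follows. `zeroed_buffer` and `write_samples` are hypothetical names, and `write_samples` is only a stand-in for a decoder that may fill less of the buffer than expected; neither is taken from Claxon.

```rust
/// Allocates a zero-filled buffer of `new_len` elements without `unsafe`.
fn zeroed_buffer(new_len: usize) -> Vec<i32> {
    vec![0; new_len]
}

/// Hypothetical decoder stand-in: copies as many `samples` as fit into
/// the front of `buffer` and returns how many elements were written.
fn write_samples(buffer: &mut [i32], samples: &[i32]) -> usize {
    let n = samples.len().min(buffer.len());
    buffer[..n].copy_from_slice(&samples[..n]);
    n
}
```

If the decoder writes only two of four expected samples, the remaining two elements are still zero, which is the property the `set_len` version lost.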

I believe that clearly documenting that the vec! macro has special behavior when given 0 as the element value, and that it is dramatically more efficient than other means of initialization (at least on some platforms), would have prevented this vulnerability.

Labels: A-docs (Area: documentation for any part of the project, including the compiler, standard library, and tools)
