Skip to content

Document PrefixSet contains performance characteristics #16208

Closed
@Rjected

Description

@Rjected

Describe the feature

The PrefixSet::contains method currently takes &mut self:

/// Returns `true` if any of the keys in the set has the given prefix
#[inline]
pub fn contains(&mut self, prefix: &[u8]) -> bool {
if self.all {
return true
}
while self.index > 0 && &self.keys[self.index] > prefix {
self.index -= 1;
}
for (idx, key) in self.keys[self.index..].iter().enumerate() {
if key.has_prefix(prefix) {
self.index += idx;
return true
}
if key > prefix {
self.index += idx;
return false
}
}
false
}

From an outsider's perspective, this can be non-intuitive because most data structures have contains methods that only take &self.

This is the case in PrefixSet because we optimize for the case where we are looking up many keys in a row that are already in sorted, or prefix order.

See the PR which introduced the logic, as well as the linked benchmark PR that show the performance improvement:
#2417

And the code from silkworm that this was originally inspired by:
https://github.com/erigontech/silkworm/blob/63cb37700c3732b5ace2b31b1bc68ec795a0d17a/silkworm/core/trie/prefix_set.cpp#L30-L54

We should document why PrefixSet::contains works this way and what cases it's optimized for.

Additional context

No response

Metadata

Metadata

Assignees

Labels

A-trieRelated to Merkle Patricia Trie implementationC-docsAn addition or correction to our documentationC-enhancementNew feature or requestD-good-first-issueNice and easy! A great choice to get started

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions