HybridDirectory should mmap postings. #52641


Merged 1 commit on Feb 27, 2020
@@ -152,15 +152,27 @@ public void close() throws IOException {
     boolean useDelegate(String name) {
         String extension = FileSwitchDirectory.getExtension(name);
         switch(extension) {
-            // We are mmapping norms, docvalues as well as term dictionaries, all other files are served through NIOFS
-            // this provides good random access performance and does not lead to page cache thrashing.
+            // Norms, doc values and term dictionaries are typically performance-sensitive and hot in the page
+            // cache, so we use mmap, which provides better performance.
             case "nvd":
             case "dvd":
             case "tim":
+            // We want to open the terms index and KD-tree index off-heap to save memory, but this only performs
+            // well if using mmap.
             case "tip":
-            case "cfs":
             case "dim":
+            // Compound files are tricky because they store all the information for the segment. Benchmarks
+            // suggested that not mapping them hurts performance.
+            case "cfs":
+            // MMapDirectory has special logic to read long[] arrays in little-endian order that helps speed
+            // up the decoding of postings. The same logic applies to positions (.pos) or offsets (.pay) but we
+            // are not mmapping them as queries that leverage positions are more costly and the decoding of postings
+            // tends to be less of a bottleneck.
+            case "doc":
                 return true;
+            // Other files are either less performance-sensitive (e.g. stored field index, norms metadata)
+            // or are large and have a random access pattern and mmap leads to page cache thrashing
+            // (e.g. stored fields and term vectors).
             default:
                 return false;
         }
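
The routing rule in this hunk can be tried out in isolation. The following is a minimal sketch, not the Elasticsearch implementation: the class name `MmapExtensionSketch`, the helper `extensionOf` (a hypothetical stand-in for `FileSwitchDirectory.getExtension`, assumed here to return the text after the last dot), and the sample file names are all illustrative assumptions. Only the list of extensions is taken from the diff.

```java
import java.util.Set;

public class MmapExtensionSketch {
    // Extensions that the switch statement in the diff routes to the mmap delegate.
    private static final Set<String> MMAP_EXTENSIONS =
        Set.of("nvd", "dvd", "tim", "tip", "dim", "cfs", "doc");

    // Hypothetical stand-in for FileSwitchDirectory.getExtension:
    // returns the text after the last '.', or "" when the name has no dot.
    static String extensionOf(String name) {
        int dot = name.lastIndexOf('.');
        return dot == -1 ? "" : name.substring(dot + 1);
    }

    // Mirrors the shape of useDelegate: true means "open with mmap",
    // false means "fall back to the NIO-based directory".
    static boolean useMmap(String name) {
        return MMAP_EXTENSIONS.contains(extensionOf(name));
    }

    public static void main(String[] args) {
        // Sample names are made up for illustration only.
        String[] files = {"_0.doc", "_0.pos", "_0.fdt", "_0.cfs", "segments_1"};
        for (String file : files) {
            System.out.println(file + " -> " + (useMmap(file) ? "mmap" : "NIOFS"));
        }
    }
}
```

A set lookup is used here instead of the fall-through `switch` only to keep the sketch short. The `switch` in the actual change has the advantage that each group of extensions carries its own explanatory comment.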