
Should TopScoreDocCollector Always Populate Sentinel Values? [LUCENE-8875] #9918

Closed
@asfimport

Description


TopScoreDocCollector always initializes HitQueue as its priority queue implementation and instructs HitQueue to pre-populate itself with sentinels. While this is a great safety mechanism, for very large datasets where the query's selectivity is high, populating the sentinels can be redundant and can become a significant bottleneck in its own right. Does it make sense to introduce a new parameter in TopScoreDocCollector that applies a heuristic (say, number of hits > 10k) and skips sentinel population?
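
To make the trade-off concrete, here is a minimal sketch of the two strategies. It is not Lucene's HitQueue: it uses `java.util.PriorityQueue` over plain float scores, and the class and method names (`SentinelSketch`, `topKWithSentinels`, `topKLazy`) are hypothetical. It assumes `k >= 1` and that no real score equals negative infinity.

```java
import java.util.PriorityQueue;

// Illustrative only: a min-heap of scores standing in for a hit queue.
final class SentinelSketch {

    // Sentinel strategy: pay O(k) up front so the queue is always full
    // and the collection loop needs no size check.
    static float[] topKWithSentinels(float[] scores, int k) {
        PriorityQueue<Float> pq = new PriorityQueue<>(k);
        for (int i = 0; i < k; i++) {
            pq.add(Float.NEGATIVE_INFINITY); // sentinel: loses to any real score
        }
        for (float s : scores) {
            if (s > pq.peek()) { // the smallest entry is always present
                pq.poll();
                pq.add(s);
            }
        }
        return drain(pq);
    }

    // Lazy strategy: no up-front cost, but every hit checks the queue size.
    static float[] topKLazy(float[] scores, int k) {
        int capacity = Math.max(1, Math.min(k, scores.length));
        PriorityQueue<Float> pq = new PriorityQueue<>(capacity);
        for (float s : scores) {
            if (pq.size() < k) {
                pq.add(s);
            } else if (s > pq.peek()) {
                pq.poll();
                pq.add(s);
            }
        }
        return drain(pq);
    }

    // Drops leftover sentinels, then returns the scores in descending order.
    private static float[] drain(PriorityQueue<Float> pq) {
        while (!pq.isEmpty() && pq.peek() == Float.NEGATIVE_INFINITY) {
            pq.poll();
        }
        float[] out = new float[pq.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = pq.poll(); // poll yields ascending order, so fill from the back
        }
        return out;
    }
}
```

Both methods return the same top scores. They differ only in where the cost lands: the sentinel version does k insertions before seeing any hit, which is the overhead the question above is concerned with when k is large.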


Migrated from LUCENE-8875 by Atri Sharma (@atris), resolved Jul 10 2019
