PriorityQueue: optimize where we collect then iterate by using O(N) heapify [LUCENE-10302]

Looking at #9918 (LargeNumHitsTopDocsCollector.java ) I got to wondering if there was faster-than O(N\*log(N)) way of loading a PriorityQueue when we provide a bulk array to initialize the heap/PriorityQueue.  It turns out there is: the JDK's PriorityQueue supports this in its constructors, referring to "This classic algorithm due to Floyd (1964) is known to be O(size)" – heapify() method.  There's [another](https://www.geeksforgeeks.org/building-heap-from-array/)  that may or may not be the same; I didn't look too closely yet.  I see a number of uses of Lucene's PriorityQueue that first collects values and only after collecting want to do something with the results (typical / unsurprising).  This lends itself to a builder pattern that can look similar to LargeNumHitsTopDocsCollector in terms of first having an array used like a list and then move over to the PriorityQueue if/when it gets full (it may not).



---
Migrated from [LUCENE-10302](https://issues.apache.org/jira/browse/LUCENE-10302) by David Smiley (@dsmiley), updated Mar 04 2022
Attachments: [LUCENE_PriorityQueue_Builder_with_heapify.patch](https://apache.github.io/lucene-jira-archive/attachments/LUCENE-10302/LUCENE_PriorityQueue_Builder_with_heapify.patch)
Linked issues:
 - #9918



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

PriorityQueue: optimize where we collect then iterate by using O(N) heapify [LUCENE-10302] #11338

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

PriorityQueue: optimize where we collect then iterate by using O(N) heapify [LUCENE-10302] #11338

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions