-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Swap out some simple PriorityQueue subclasses for one using a Comparator #14705
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
private void writeCompoundFile( | ||
IndexOutput entries, IndexOutput data, Directory dir, SegmentInfo si) throws IOException { | ||
// write number of files | ||
int numFiles = si.files().size(); | ||
entries.writeVInt(numFiles); | ||
// first put files in ascending size order so small files fit more likely into one page | ||
SizedFileQueue pq = new SizedFileQueue(numFiles); | ||
List<SizedFile> files = new ArrayList<>(numFiles); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one doesn't need to use a PriorityQueue at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorting should also be faster here. The Lucene PQ is only needed when items in queue should fall out at botton when its full. In other cases the ln-overhaed is larger than a simple sorting at end.
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR. |
1 similar comment
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR. |
Missed a comparator
1f07b03
to
f53ae10
Compare
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR. |
Next stage is to run some perf tests to check this doesn't introduce a slowdown due to changes to the virtual method calls in key hotspots. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Why not make PQ final and pass a comparator or maybe a simpler functional interface directly in ctor. This would reduce the number of subclasses more and we only have a single PQ which is final on top.
But it does not reduce the number of loaded classes. The lambdas are still hidden classes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it, it looks more like idiomatic Java. Can you check if OrdinalMapBenchmark
reports any slowdown? This is the only benchmark that could be impacted by this change that I can think of.
It's slightly awkward that a PriorityQueue
can be defined either using a comparator or overriding lessThan
, let's only support providing a comparator as Uwe suggested?
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. I feel tempted to make PriorityQueue final and then move the comparator to a final field. Why bother having both ways of defining it?
This is only the first pass, covering single-line comparators. There are several non-trivial implementations that will require more detailed refactoring (if that is indeed desirable) - using a Comparator requires differentiating equality, which |
Good point, thanks. |
We could also declare a custom interface with just lessThan and then provide an adapter in usingComparator... this would dodge the need for implementing full equality checks across the board and you could still use jdk's Comparator utilities, where things are simple. |
That's a good idea - I'll look at that when I'm moving onto the more complex implementations |
We should really only have one final PQ implementation to make sure most of the code can be inlined. |
Basically the lessThan interface should be marked |
That's the desired end state, yes. There's still plenty of implementations to get through before we can make it final. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to see what the benchmarks show after this change. These pq's are sort of crucial to performance. I wonder if we can observe any change after subclasses are replaced by delegation.
In general, I'm fine to making baby steps although it could also be developed into a larger patch that adds this additional "smaller-than" interface and makes the PQ final. I wonder what others think.
I wasn't able to run OrdinalMapBenchmark, but a few lucene-bench runs show no large changes - several percent +/- either way |
For reference, there is an interesting PR at #14714 that replaces |
I ran it on my machine. Got the following results on main (2 runs):
And the following results on your branch (2 runs as well):
So it looks fine to me as far as this benchmark is concerned. |
My intuition was that this "smaller-than" interface idea you brought up would be relatively easy to introduce to all our |
There are some non-trivial subclasses ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with merging, given the explanation. Please add a note to changes and migration? Seems like something worth mentioning.
I'm fine with merging too. |
Merged. Thank you, @thecoop |
Thanks! Cool first step! |
Related to #11338 (comment), this replaces some simple uses of
PriorityQueue
subclasses with ones using aComparator
. This reduces the boilerplate required, the number of classes defined, and moves the definition of thePriorityQueue
close to where it is used, so the relevant code is more self-contained and easier to understand.