-
Notifications
You must be signed in to change notification settings - Fork 508
[collections-838] hasNext() value is now cached till next() or remove… #633
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[collections-838] hasNext() value is now cached till next() or remove… #633
Conversation
…() get invoked, thus reducing call load on hasNext() dratically especially in nested chained iterator scenarios
…meoutPreemptively() for tests
ppkarwasz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @JoergBudi,
Thanks for working on this! 💯
I think we could take the optimization a step further by caching both the result of hasNext() and the value returned by next() inside updateCurrentIterator, similarly to what FilterIterator already does:
commons-collections/src/main/java/org/apache/commons/collections4/iterators/FilterIterator.java
Lines 43 to 47 in d8c59cd
| /** The next object in the iteration. */ | |
| private E nextObject; | |
| /** Whether the next object has been calculated yet. */ | |
| private boolean nextObjectSet; |
The reason I originally thought caching the next element from the nested iterator wasn’t possible was because of this comment:
commons-collections/src/main/java/org/apache/commons/collections4/iterators/IteratorChain.java
Lines 270 to 272 in d8c59cd
| // set last used iterator here, in case the user calls remove | |
| // before calling hasNext() or next() (although they shouldn't) | |
| lastUsedIterator = currentIterator; |
However, that comment doesn’t seem accurate, and in fact we may not even need lastUsedIterator.
Here’s what I propose:
- Remove
lastUsedIteratorand just track the iterator used in the lasthasNext()ornext()call (i.e.,currentIterator). - Add two fields:
hasNextObject(Boolean) andnextObject(E) — to cache the upcoming element. - Since
updateCurrentIteratorisprotectedbut only modifies private fields (no observable effects), deprecate it and make it a no-op. - Introduce a new
setNextObject()method that handles allhasNext()andnext()calls on the chained iterators, ensuring no duplicatehasNext()invocations (the current cause of the exponential performance hit). It could returntrueif an object was set. - Keep
remove()simple: check ifnext()has been called and throwIllegalStateExceptionotherwise, following the same contract as, for example,ArrayList’s iterator.
This approach should solve the double hasNext() problem cleanly and bring the complexity down without making the code harder to maintain.
What do you think?
src/main/java/org/apache/commons/collections4/iterators/IteratorChain.java
Outdated
Show resolved
Hide resolved
…when hasNext() has been invoked already
…-before bug, further optimization to save hasNext() calls
|
@ppkarwasz : Your suggestion with setNextObject() doesn't work, because you can't "prefetch" the next element without losing the possibility to properly invoke "remove()" down the chain. |
ppkarwasz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @JoergBudi,
Your suggestion with
setNextObject()doesn't work, because you can't "prefetch" the next element without losing the possibility to properly invokeremove()down the chain.
You’re right: with prefetching, a sequence like next(), hasNext(), remove() would end up removing the wrong element from the nested iterator.
Your changes look good to me. Could you please add an entry to src/changes/changes.xml with dev="pkarwasz" (with one “p”) as the merging committer?
src/test/java/org/apache/commons/collections4/SetUtilsTest.java
Outdated
Show resolved
Hide resolved
|
@JoergBudi, thanks! 🎉 |
…() get invoked, thus reducing call load on hasNext() drastically in nested chained iterator scenarios.
Before this change, the call-load on hasNext() grew exponentially with the number of nested IteratorChains.
@ppkarwasz: I think, the fix is indeed this easy, please check. It raises the question, if the fix for collections-722 is even needed anymore ... (the discussion in parallel on PR 628).
Thanks for your contribution to Apache Commons! Your help is appreciated!
Before you push a pull request, review this list:
mvn; that'smvnon the command line by itself.