-
-
Notifications
You must be signed in to change notification settings - Fork 509
Provide rewindable option to Aggregation and Query builders/queries #2116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
So for the heads up, @alcaeus wants to evaluate one other approach/idea (but I'll let him do the explaining) and if it doesn't make sense, we'll go with this :) |
@malarzm I've talked a little with @alcaeus over on slack about it, and he didn't seem sure about his approach. As far as I understood it, it revolved around creating a However, either way I believe some way of allowing control over what iterators are added to iterator chain will be probably needed, whether it will be I've added commits with new I've also slightly changed the way private function wrapTraversable(Traversable $traversable) : Generator
{
foreach ($traversable as $key => $value) {
$this->items[$key] = $value;
yield $key => $value;
$this->iteratorAdvanced = true;
}
$this->iterator = null;
} I've removed I've wondered if those two iterators - There are still some code quality checks / comment updates that I need to address, but I'd like to know your opinion if this approach is a valid one. See new commits: https://github.com/doctrine/mongodb-odm/pull/2116/files/6e26070f7ad7a89f33bc5b1042133e0886118c4d..a312708b991c7c36d8bdaa1b581a70e007dff620 Also, I believe I should target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this logic! I've left some feedback in the code itself. Just a couple of points here:
- I'm not sure if the refactoring in
CachingIterator
is necessary. If it isn't, I'd prefer leaving it out as long as it doesn't have a performance impact. - The code currently produces a lot of phpcs errors: https://travis-ci.org/doctrine/mongodb-odm/jobs/623925579#L2854. If you've installed dependencies locally, you can run
vendor/bin/phpcbf
to fix most of them. Then, runvendor/bin/phpcs
to see what errors couldn't be fixed and fix them manually. - PHPStan also complains about your code: https://travis-ci.org/doctrine/mongodb-odm/jobs/623925578#L1426. Reverting the refactoring in
CachingIterator
will fix some of them, while applying thegetIterator()
method toBaseIterator
and updating the$iterator
type hint as suggested below will fix the rest of them. - This PR currently contains commits from another branch. I'll merge up through the branches soon, which should take care of that. We can worry about rebasing and squashing commits once the implementation is solid.
public function __construct(Traversable $iterator) | ||
{ | ||
$this->iterator = $this->wrapTraversable($iterator); | ||
$this->storeCurrentItem(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this call is superfluous: wrapTraversable
stores the current key and value in $this->key
and $this->value
in the foreach loop, so I'm wondering why we still need this here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's required to cause generator to reach first yield
. Otherwise Cursor
is not starting it's execution, which differs from what CachingIterator
does. When creating this I worked under the assumption that this should not be changed.
Should I change the method name to something else, and if yes, what would you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the method completely and only left $this->iterator->key()
in __construct
which - as far as I understand it, I may be wrong - starts up the generator causes Cursor
to query and execute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't the initial foreach
already initialise the cursor and start iterating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<?php
function wrapTraversable(Traversable $traversable) {
echo("ITERATOR EXECUTING\n");
foreach ($traversable as $key => $value) {
yield $key => $value;
}
};
$generator = wrapTraversable(new ArrayIterator([1, 2, 3]));
echo "BEFORE KEY\n";
$generator->key();
echo "AFTER KEY\n";
foreach ($generator as $key => $value) {
echo $value;
}
Output:
/usr/bin/php /home/steveb/PhpstormProjects/adpanel/adpanel/test.php
BEFORE KEY
ITERATOR EXECUTING
AFTER KEY
123
Therefore, because $this->iterator
is an instance of a Generator
, it does not start executing (even until the first yield
statement) until any call to the usual Iterator
methods.
CachingIterator
is - as far as I understand it - used to immediately start the Cursor
and this call does exactly that, similarly to how storeCurrentItem
was doing.
Sorry for the long delay. Logic looks good now. Could you please add tests for the builder/query functionality to make sure no The QueryTest and BuilderTest classes (for Aggregation and Query) serve as examples. After that, this is ready for a merge into 2.1 :) |
@alcaeus I'll try to wrap this up tonight/tomorrow. |
No worries, caught up in things as well, so I can only finish it up next week anyway. |
I've added tests in 9f347c6. If I misunderstood their purpose - being too generic or too narrow - please correct me. They should not pass since I've deliberately marked them as expecting an exception to be thrown to show a little quirk/issue that those tests uncovered: empty resultsets will incorrectly not mark iterator as exhausted. This is due to the fact that foreach loop in wrapping EDIT: We can also simply ignore this particular behavior with empty results - they would appear as if they are rewindable - but I'd prefer to ask for your opinion beforehand. EDIT2: I've noticed that |
@Steveb-p once again, sorry for the long delay here, but I was not able to do any significant work over the past few weeks due to the current situation. I updated the tests so they pass, and also added paragraphs to the query builder and aggregation builder documentation explaining this feature. Once I've got the other branches stable and merge up, this is good to merge. Thanks for highlighting this issue, fixing it, and bringing along more patience a maintainer could ask for! ❤️ |
Summary
Proof of concept for allowing user control over wrapping resulting
Traversable
/Iterator
withCachingIterator
- to prevent memory issues when iterating over large result sets.Note: This PR is incomplete - still needs
tests, relevant comment blocks anddocumentation adjustments.