-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Efficient counting on Criteria #882
Conversation
Hello, thank you for creating this pull request. I have automatically opened an issue http://www.doctrine-project.org/jira/browse/DDC-2865 We use Jira to track the state of pull requests and the versions they got |
public function count() | ||
{ | ||
if (null !== $this->count) { | ||
return $this->count; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is initialized, it needs to return the count for the inner collection, as it could have been modified
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
for the LazyCollection, I suggest adding an abstract LazyCollection class in Doctrine Common which would expect a protected method |
@stof , I can do that on Doctrine Collection. (I really think it would be better to have a count method on the Selectable interface... is BC needed on this interface?). It would be nice if we could do : $selectable->count($criteria). The count method could be moved to the EntityRepository, instead of needing this collection. |
BC is needed on the interface. |
Understood. About the LazyCollection on Doctrine common, your main point is reducing the amount of duplicated code, however on PersistentCollection for instance, a lot of methods have custom behavior (https://github.com/doctrine/doctrine2/blob/master/lib/Doctrine/ORM/PersistentCollection.php#L484) so actually I'm not sure you really can remove a lot of code from it. Just tell me, and I'll do that today and write the tests. |
Well, it would remove the code for all non-lazy methods. And each O*M project in Doctrine has its own set of persistent collections, which would benefit from it |
I realized there is a EntityPersister interface that seems to be new in 2.5. May I add the getCountSQL and count methods to the interface or is it already too late? |
This breaks the tests for the new caching support. Please fix it. |
Just updated to master, let's see if Travis passes. @beberlei , what about this: doctrine/collections#19 This implementation on ORM will be much lighter if the lazy collection on Doctrine collections is merged :) |
Tests no longer should break too. |
@bakura10 they still fail for the same reason. Please run the tests locally, you should see the error.
|
/** | ||
* @var bool | ||
*/ | ||
protected $initialized = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You dont need this one, if ($this->collection === null)
is an equally working check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just using the same pattern as in PersistentCollection. Anyway, I think the LazyCollection on Doctrine common should really be needed, it would remove all the duplicated code :).
Hi, There were some issues related to the inheritance. I've fixed that and now all tests pass (at least locally). |
👍 |
How about the limit/offset/orderBy cases? It doesn't make sense for counting queries. |
@@ -877,6 +876,10 @@ public function matching(Criteria $criteria) | |||
|
|||
$persister = $this->em->getUnitOfWork()->getEntityPersister($this->association['targetEntity']); | |||
|
|||
return new ArrayCollection($persister->loadCriteria($criteria)); | |||
if ($this->association['fetch'] === ClassMetadataInfo::FETCH_EXTRA_LAZY) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm honestly not sure about this @beberlei after given it some thoughts. Doesn't make it more sense to do a check against FETCH_LAZY instead of FETCH_EXTRA_LAZY?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EXTRA_LAZY seems to be more correct here. I don't see why we'd need any distinction though - can't it always be a LazyCriteriaCollection
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you. To me there is no reason to check this, and always wrap around a lazy collection. The performance penalty is pretty low I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After giving it a thought, I've changed that to return a lazy collection IF the association fetch is not EAGER. This makes much more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, if we look at it, a PersistentColleciton is already filtered by the owner. When fetch mode is LAZY, we expect it to load it all anyway if we do a $article->getComments() for instance. The only fetch mode where we expect it to be sliced is actually EXTRA_LAZY.
The thing is that if we do it for LAZY, it will usually trigger two queries (the count and the actual requests to fetch the data), which I suppose is not what we want.
@marcospassos , it makes sense to add "orderBy" to the count query, indeed. I'll add this later, however I'm not sure to understand limit/offset. This one does not make sense isn't it? |
@bakura10 why |
Current implementation does not take into account neither limit, group by or offset, so that's why I didn't understand your feedback ^^. Sorry for the ORDER BY, yes it does not make sense. I read "group by" instead of "order by" :o. |
I think it should be clear, at least in the documentation 👍 |
Hi, The last commit reuse the new abstract lazy collection from Doctrine Collections 1.2. This PR should be okey now :). |
@bakura10 I merged a slightly modified version of this PR. |
Awesome :D. Thanks ! However about caching the count, I'm not sure, as the criteria could be changed between two consecutive count calls (as it's passed by reference). It can be problematic of people use the lazycollectionCriteria themselves. Envoyé de mon iPhone
|
Hi @guilhermeblanco , First, thank you for having merged this. I've tried it in my code, it really works well and push the Criteria to the next level. However, I think we should have a discussion about this: https://github.com/doctrine/doctrine2/pull/882/files#diff-108586f774fc8acb02163ed709e05e86R886 Let's imagine the following code: class User
{
/**
* @ORM\OneToMany(targetEntity="Message", mappedBy="conversation")
*/
protected $messages;
} and class Message
{
/**
* @ORM\ManyToOne(targetEntity="User", inversedBy="messages")
*/
protected $user;
} With current configuration, count($user->getMessages()->matching(new Criteria())) will actually trigger a load of the whole collection (no wrap around a LazyCriteriaCollection). For the optimized COUNT to occur, we need to set the fetch mode to EXTRA_LAZY. I fully understand why @beberlei wanted to do this. Semantically, LAZY means "retrieve the whole collection when we first access it", so if we assume this, it does not make any sense to wrap around the LazyCriteriaCollection. However, from a usage point of view, I realized I actually often "forget" to set it to EXTRA_LAZY (because I use paginates that abstract that kind of things), so in all those cases, it does a full load of the table. My point is that maybe we should just allow to wrap around a LazyCriteria in all cases (LAZY and EXTRA_LAZY). After all, the overhead introduced by the LazyCollectionCriteria is quite small. What do you think? ping @Ocramius @macnibblet |
@bakura10 totally for it, but only when it's |
So you agree with the current implementation? Envoyé de mon iPhone
|
@bakura10 yes, I agree with current implementation. |
I've run into an interesting "issue" while using this PR. It's not really a problem but couldn't find a way to solve it. Basically, imagine you have a User => Tweets associations. If you match this association, you'll have a LazyCriteriaCollection. The PersistentCollection will automatically set a filter on the criteria here: https://github.com/doctrine/doctrine2/blob/master/lib/Doctrine/ORM/PersistentCollection.php#L880 However, if you set the same filter on the matching, it will have the same condition twice, and having SQL like "SELECT COUNT(*) FROM blabla WHERE blabla.id = :id AND blabla.id = :id" There is no easy way to iterate through the where expression to have a check, though. It's not really a problem and I'm not sure if performance wise this is a problem, but taht would be interesting to have a look at this :). |
@bakura10 can you open a new issue to discuss that? |
Hi,
This is an attempt to solve this very annoying issue: http://www.doctrine-project.org/jira/browse/DDC-2217
I'm not sure about my solution as I'm not really into Doctrine internals.
Use case
I have a library that is based exclusively on Criteria API. A Paginator adapter uses an empty Criteria to count how many elements are in the collection. However, EntityRepository's matching method always initialize the whole collection into an ArrayCollection. This is unusable in most cases and make the API unusable for my use case.
Solution
I've created a new LazyCollectionCriteria that implements an efficient count. I think the cleanest solution would be to add a method to the Selectable interface: with a count method. But I think this is not possible now, unfortunately.
If the solution seems nice to you, I'll write the tests.
ping @Ocramius, @macnibblet and @Thinkscape (this should make ZfrRest usable :D)