-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NFR] Eager loading support of Phalcon\Mvc\Model #1117
Comments
+1 |
2 similar comments
+1 |
+1 |
Your can use leftJoint at another ORM but in phalcon leftJoin relations isn`t selected to object only run in sql #1111? |
+1 |
1 similar comment
+1 |
What's the status of this one? Or do I have to give up phalcon?.. |
+1 |
2 similar comments
+1 |
+1 |
+1 Can we expect anything to happen with it? It's quite a basic feature I think, it should be handled with high priority if I may categorize its severity. But at least a tiny feedback would be good about it to know if we can expect it to be implented or should learn a different ORM. Because without this feature this ORM handles only separate entities, not entities with relations. Lazy loading is only a tiny help but doesn't care about data integrity and mostly not about performance. But on the other hand, Phalcon is so good, should make its ORM to be usable as well, big +1 vote for this feature. |
+1 |
1 similar comment
+1 |
Is there an alternative to solving this N+1 problem? |
I could think of creating stubs and populating explicitly joined data in those stubs, for other unavailable attributes I'd fall back to lazy loading/ load them once requested. |
+1 |
5 similar comments
+1 |
+1 |
+1 |
+1 |
+1 |
+10 |
+1 This is the only workaround I could find:
|
Many Thanx |
Is there at least some news regarding this functional? It really must have functional. |
+1 |
2.0 is released. Please add this functionality, it is the only thing that spoils the impression from working with the framework. This problem for 2 years (in fact). |
@andresgutierrez, yes, it can also be done, but that would be correct if I do something: $comments = Comments::query()->with(['author', 'author.skin'])->...;
foreach($comments as $comment) {
echo $comment->text;
echo $comment->author->nickname;
echo $comment->author->skin->faceUrl;
} At least in Yii2 it is applied such method than to access relations through StdClass. |
I think the age of using DEEP RELATIONs an also eager loadig is done! We should think different. Redesign UXs an data manipulations... Eager loading has so much disadvantages nowadays! Sorry but that is my POV. |
@erickskrauch But, if you only need $comment->text why do you have to fetch the full |
@andresgutierrez, I just show you example of real using eager loading in real project. Many people are puzzled, using PhalconPHP, how to get one select for model and related data. Perhaps in PhalconPHP a different approach to it (as shown by you through PHQL), but no such examples in the documentation. On the other hand Yii (Yii2), Laralev, etc. provide a method with() and the developers would like to have this in PhalconPHP. Otherwise would not have been much of an issue here on github. |
Eager loading in laravel and yii isn't real eager loading if you look at the database logs. I always see eloquent make two queries. One for the base model, then another for the related models via an IN(). What would really be nice in Phalcon is to have the same syntax, but that syntax be a join on a relationship. We would, of course need to specify columns, but I feel like this grants the best mix of utility and performance. We hydrate the result set as instances of the base model with the related columns attached via properties. I.e. I join do Robot->with("parts",["name"]) I would be able to reference that as a result by $robot->parts->name. That's a bad example because it's a many to one, and that's really the thought behind a one to one. |
@andresgutierrez usually the eager loading is done through multiple queries, with the use of Eager loading
$results = User::with('avatar', 'posts')->find_many();
//will use 3 querys to fetch the users and the relationships:
SELECT * FROM user
SELECT * FROM avatar WHERE user_id IN (....)
SELECT * FROM posts WHERE user_id IN (....)
// so you already have all the data in place, ready to be used
foreach($results as $result){
echo $result->avatar->img;
foreach($result->posts as $post){
echo $post->title;
}
} Working over the example: If your abstraction sets the indexes of the users resultset as the Of course, you can calculate the number of queries if I have 100 users with 100 avatars and 100 post each one. With eager loading there is only 3 queries. And there is no memory wasted, I need that data, I will show it on a Json or a web page. The advantages are what I expect of programming: to go through already implemented abstractions so I don't need to rewrite it every time. In this case I prefer to define wich relationships I want to fetch and I want to fetch them easily, with the less queries to database and having them in memory to use them when I want. Imagine if I want to get users with posts and from each post the tags related.... In phalcon I would need to write 3 foreachs to fetch the relationships. Not to mention deeper relations.
Of course eager loading is to use it as another tool, it does not eliminate the need to think. In the case you expose maybe you don't need to use it. (but imagine that the comments entity does have any kind of domain logic or behaviour needed to get the text... then the use of eager will be an advantage. On Web each time I need data I know exactly wich data I need, let me post a quote from here http://blog.bemycto.com/software-architecture/2015-05-17/doctrine-orm-not-suited-php/
Eloquent does it in a great way: http://laravel.com/docs/5.1/eloquent-relationships#eager-loading and if you read the documentation you can understand the need of any ORM to have it. Nested eager loading is where the abstraction really shines. Of course, there is advantages on lazy loading for other purposes as background process with high memory consumption. @andresgutierrez There is a great implementation here https://github.com/stibiumz/phalcon.eager-loading althought it still needs more developing and would be great to sync with phalcon to solve some drawbacks. @sszdh disadvantages? I don't understand what are the disadvantages of making less database queries through an intelligent management of the database. Let me say that when I work with objects I hate to work with lower abstractions like arrays or database queries. (usually that's my problem with legazy code). The same paradigm I use when I need to work with sql or PHQL.... I don't want data, I want entities with behaviours and domain logic not arrays that I need to fix on each iteration. P.D. sorry for the rush :p I'm really used to eager loading and I would like to see it on phalcon, because when you use it and understand it you know when to make use of it and you need it. It is just another tool and would be great to have it. |
@Surt, +1 |
@Surt, +1 (Clap, clap) |
CakePHP has params called "recursive" for this. Correct me if I'm wrong. does this code produces selects in db? foreach ($this->Robots as $robot) {
foreach ($robot as $parts) { // SELECT * FROM parts WHERE robot_id = $robot->id
foreach ($parts as $part) {
echo $part->name;
}
}
} |
IN queries@Surt That implementation is only suitable for small/medium resultsets (maybe large depending on the values). If you fetch the parent records (in this case users), then you have to traverse this resultset to obtain the "unique" user IDs (found across few, dozens, hundred, thousands or records?). Then these IDs must to be bound to the IN operator to obtain the list. This means keep in memory the list of IDs, then create and send a large SQL statement to the database (ie. using 50 values): SELECT `robots_parts`.`id`, `robots_parts`.`robots_id`, `robots_parts`.`parts_id`
FROM `robots_parts` WHERE `robots_parts`.`robots_id` IN (:APR0, :APR1, :APR2, :APR3, :APR4, :APR5, :APR6, :APR7, :APR8, :APR9, :APR10, :APR11, :APR12, :APR13, :APR14, :APR15, :APR16, :APR17, :APR18, :APR19, :APR20, :APR21, :APR22, :APR23, :APR24, :APR25, :APR26, :APR27, :APR28, :APR29, :APR30, :APR31, :APR32, :APR33, :APR34, :APR35, :APR36, :APR37, :APR38, :APR39, :APR40, :APR41, :APR42, :APR43, :APR44, :APR45, :APR46, :APR47, :APR48, :APR49) While I'm pretty sure this should be faster than individual SELECTs, some database systems start having problems with thousands of values passed in an IN operator:
A database like PostgreSQL always create a temporary table to store the values in the IN list to make the comparison using an inner join (which seems to be a good idea). A database like Oracle transform the values in OR comparisons and also it doesn't allow more than 1000 items in an IN clause (to fix it requires split the values into ORs of 1000 items each one).
This variation in behaviors in which large list of values in an IN operator is handled lead me to think CachingRegarding caching, eager loading always retrieves the values from the database as I see. There's no way to cache related records (unless we think up a way to do that) and retrieve data from the database engine instead of a cache backend (Redis, Memcached, APC, etc) can impact performance (or at least it loses the opportunity to improve it). For instance a customer has a location (id, city, country, etc) this is stored in a second table: $robots = Customers::query()
->with("location") // always hit the database
->execute();
foreach ($customers as $customer) {
// ...
} Locations never change and they could be stored in a fastest cache backend reducing unnecessary access to relational databases. Only the parent record can be cached, if a caching policy is attached to the query it does not necessary apply to the related records. HydrationIn eager loading, related records are fully fetched in memory, this means every request needs to allocate more memory: $users = Users::query()
->with("posts")
->execute();
foreach ($users as $user) { // fetch one user from the database + n posts
foreach ($user->posts as $post) {
// ...
}
} In the other hand using lazy loading we have just one user + one post at the same time in memory: $users = Users::find();
foreach ($users as $user) { // fetch one user from the database
foreach ($user->posts as $post) { // fetch one post from the database
// ...
}
} Of course, this depends on the number of related records fetched in associated relationships. Just added this part to know if you are aware of this. My conclusionI can go ahead and implement this feature using IN queries however it will not be perfect and a solution to everyone's problem. As I mentioned before, IMHO, the best option is always fetch exactly what you need using PHQL or the QueryBuilder (currently available in Phalcon). |
@valVk if I don't remember it wrong CakePHP fetch relationships with eager loading by default, unless you specify lazy loading, so, in your example it does not produce a query for each iteration. In the other hand it uses JOINS instead of IN when it's possible, but I think that it's really difficult to accomplish in terms of complexity. IN queriesUsually on web applications or APIs there is a need for pagination of results. If you make a query with thousands or records you probably are doing something wrong. Hydration
Most of the time, on web pages applications you will take those user + post and stash into a variable to use on the template, or for example, to write a JSON. So in the end your memory consumption is the same => all users with posts on a variable. On the other hand I'm really afraid when I look at the Database queries log and I see thousand of queries. I'm afraid of responsiveness, concurrency... Actually, the https://github.com/stibiumz/phalcon.eager-loading (added to the phalcon incubator) is doing a great job, but it can be merged with phalcon to fully develop its potential. For example, if I don't remember it wrong, the new relationship with conditions on the definition does not work with the eager loading functionality) |
In my case I was writing a API using phalcon and using the "concept" of resources, which is a model on Phalcon. On my approach, I had the following syntax to fetch a resource: http://ENDPOINT/resource_name?parameters As the resources have the same superclass (Phalcon\Model) I can write this to fetch the data dynamically using a generic code, something like that public function handleResource($resourceName)
{
$resource = new $resourceName($someAdditionalParam);
$resource-> ...... (get the query from the parameters in url)
echo json_encode($resource->toList());
} This is just an example and doesn't reflect actual phalcon code! Now, my problem is: lets suppose I need to fetch the Books resource, but I only need the books that was written by George R. R. Martin, so my URL should be something like that http://ENDPOINT/books?q=author.name=George%20R.%20.%20Martin I know I could do something like Author::FindFirst('name=blablalba')->books, but I will have to change a lot of things on my resource handler and, in my humble opinion, looks semantically wrong to fetch Author to find the authors to only then fetch the book, since the straight forward idea is find the books from this author. |
@williamokano You can always use a join to fetch all the data in a single query |
+1 for this. My need for this is only for climbing the parent chain of a record. Eg: $grandChild->$parent->$grandParent results in 3 SQL queries now: (assuming the ids of each is 1)
I have a need to check the entire chain sometimes, not just the grandParent or grandChild, but the parent as well. I would much rather do this once as follows: SELECT grandChild.*, parent.*, grandParent.*
FROM GrandChild as grandChild
JOIN Parent as parent
on parent.id = grandChild.parent_id
JOIN GrandParent as grandParent
on grandParent.id = parent.grandParent_id
WHERE grandChild.id = 1 Obviously all an abstraction, but that's the general idea. If I run that now with PHQL it will return a Row instead of a ResultSet or Model with the relations set. |
+1 for this feature from box |
+1 |
1 similar comment
+1 |
👎 |
I agree. |
Well another eager loading issue closed. Anyway the https://github.com/stibiumz/phalcon.eager-loading works well and we can work with zephir to a lower level implementation. |
You can't just dismiss this out of hand Sergey. I don't disagree that most implementations of eager loading in frameworks are memory hogging abominations because, like stibiumz's implementation, they just pull in the entire model. But I assure you that with careful consideration to performance, an imementation could be done in Phalcon that is no more memory intensive or detrimental to performance that a query builder with joins. |
You could say relational databases in general are a bad idea. However, many of us are using SQL variants with success. We use relationships. We do not get the best performance, but we are not programming in ASM either. For us (probably 90% of phalcon users), eager loading makes sense. Just a quick note. The current implementation of ResultSet makes it VERY difficult to do eager loading manually. Because of the way it does not store all results in memory, it is impossible to add in the eager loaded data to the result set array. I myself use the stibiumz project linked to above but it is buggy and no longer in active development... |
Just whats the problem with joins ? |
So far I have only been able to use joins as conditionals in queries, I have not been able to retrieve data from the joined table. Am I missing something? |
I guess you are missing modelsManager ? In my project i normally retrieve data from some table and joined table in one query(instruction). |
Make sure you select the columns of the joined tables as well, by default they are not selected. |
Aha, I see. This returns the Complex resultset... This is possible but could be a lot of work. I do not understand why eager loading is a bad idea. Hibernate (java) allows you to eager load with a separate SELECT or with a JOIN. I think that would be an excellent model to work off of. Just because it may negatively impact performance does not mean it would not be helpful. |
You can already do that: |
Once we use model, we always define relationships between models.
In most cases, we need to implement a list view, displaying objects with its foreign key field's value.
This will cause a N+1 selects problem (http://use-the-index-luke.com/sql/join/nested-loops-join-n1-problem), slowing down the performance.
So I think it is necessary to implement the eager loading feature of Phalcon\Mvc\Model.
The text was updated successfully, but these errors were encountered: