Skip to content

[5.8] Add aggregates eager loading support #26481

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed

[5.8] Add aggregates eager loading support #26481

wants to merge 6 commits into from

Conversation

storrella
Copy link

This PR adds support to eager load aggregates (sum, min, max, avg and custom ones too).

Some examples:

User::withSum('posts:nlikes')
    ->withMax(['posts:ncomments' => function ($q) {    // Constraints are supported  
        $q->where('nlikes', '>', 50);  
    }])  
    ->withAggregate('stddev_pop', ['posts:nlikes', 'posts:ncomments as comments_stddev'])    // Other aggregate functions are supported too  
    ...

Expected result:

[
...
'posts_nlikes_sum' => ... ,
'posts_ncomments_max' => ... ,
'posts_nlikes_stddev_pop' => ... ,
'comments_stddev' => ... ,
...
]

I also added the ability to automatically eager load an aggregate for every request, like how $withCount works now. Example:

class User extends Model
{
    protected $withAggregates = [
        'sum' => 'posts:nlikes',
        'max' => [
            'posts:nlikes as max_posts_likes',
            'posts:ncomments',
        ],
    ];

    public function posts()
    {
        return $this->hasMany(Post::class);
    }
}

This shouldn't break anything because the only method signature that was changed is protected and has a default value, so only people who overwrote the method \Illuminate\Database\Query\Builder::parseWithRelations (very rare) should have conflicts.

@staudenmeir
Copy link
Contributor

See #25319.

@storrella
Copy link
Author

I think the main difference between that PR and this is the $withAggregates attribute.

The solution should work the same way that withCount does, because most of the code was refactored to use the same methods.

Also, the hack mentioned on the other PR doesn't work well on HasManyThrough relations.

My changes are breaking a test because on the builder is mocked and I added on a call to a function which wrap the column. I have no experience with mocks, so any help with fixing that test would be appreciated. Anyway, I'll try to figure out by myself later today

Copy link
Member

@driesvints driesvints left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the StyleCI failures.

@storrella
Copy link
Author

@driesvints I have fixed the StyleCI failures and the test which was failing. Now it doesn't pass all tests but I checked that 5.7 branch neither pass the same tests

@deleugpn
Copy link
Contributor

deleugpn commented Nov 13, 2018

Changes to method signatures is a breaking change. This should target master.

@driesvints driesvints dismissed their stale review November 13, 2018 11:07

StyleCI failures were fixed.

@driesvints
Copy link
Member

@storrella tests are failing atm because of other reasons. We're having a look at them atm.

@staudenmeir
Copy link
Contributor

Since this is a breaking change anyway: Should we remove the Model::$withCount property? It doesn't really have a purpose anymore (besides being a shortcut).

@deleugpn
Copy link
Contributor

I think it's an unnecessary breakage. I don't think the code will be much better if we deprecate that.

@driesvints
Copy link
Member

@storrella if you'd rebase the branch with the current 5.7 branch the tests will pass again.

@storrella
Copy link
Author

@driesvints I just did it, thanks! Now all tests have passed

@driesvints
Copy link
Member

@storrella seems like something went wrong. Can you please try to fix the commits?

@storrella
Copy link
Author

@driesvints I see everything okay. What's wrong?

@driesvints
Copy link
Member

driesvints commented Nov 20, 2018

@storrella Check the commits. Or hard refresh the page.

screen shot 2018-11-20 at 10 28 59 pm

screen shot 2018-11-20 at 10 29 04 pm

@storrella
Copy link
Author

Sorry @driesvints , it was my bad. It should be okay now

@driesvints
Copy link
Member

@storrella no worries. Glad you figured it out 👍

@staudenmeir
Copy link
Contributor

Did you see #26481 (comment)?

@storrella
Copy link
Author

@staudenmeir The only signature change is on Illuminate\Database\Eloquent:
from:
protected function parseWithRelations(array $relations)
to:
protected function parseWithRelations(array $relations, $selectConstraints = true)

It has a default param with the old behavior.
So, should this PR target master instead of 5.7?

@staudenmeir
Copy link
Contributor

Yes, but if someone overrides the method in their application/package, it breaks their code.

@storrella storrella changed the base branch from 5.7 to master November 20, 2018 22:32
@storrella
Copy link
Author

Done, PR target changed to master

@staudenmeir
Copy link
Contributor

👍 Please also adjust the title: [5.8]

@storrella storrella changed the title [5.7] Add aggregates eager loading support [5.8] Add aggregates eager loading support Nov 20, 2018
Copy link
Contributor

@phroggyy phroggyy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, I like the idea, just think it could do with some minor improvements 🙂

*
* @var array
*/
protected $withAggregates = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option would be to remove withCount – it's a bigger breaking change, but I think might make more sense, so we don't leave two ways of doing the same thing – feels like we run the risk of more edge cases leaving it this way

Copy link
Contributor

@deleugpn deleugpn Dec 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be against removing withCount without proper deprecation to give people at least 1 release cycle to work it out slowly. It's a relevant change that could impact packages, tutorials, documentations, a lot of content overall, not just people's application.
Getting aggregates in is something that I've want for a long time now and there's been several attempts already. I'd say let's focus on getting it to work and once it's in and stable we can consider removing withCount separately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a deprecation notice, that seems like the only sensible way forward.

return $this->getRelationExistenceQuery(
$query, $parentQuery, new Expression('count(*)')
$query, $parentQuery, new Expression($function.'('.$wrappedColumn.')')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think there's any value in keeping a list of valid functions? So that we can tell the user that aggregate is not supported, otherwise I worry users might end up with domain code that works with one database engine and not another (which can lead to obscure bugs). It would be better that users be very explicit when they choose to use an aggregation that their particular database supports.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd guess the database error message would be clear enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, my concern is database compatibility – if you have to switch from one database to another for some reason, and your new database doesn't support the same aggregation (or worse, for some reason has a slightly different syntax), you'd be in a world of pain. I'm not sure this is anything that might happen, but I did want to bring it up just in case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last year I was part of switching a mid-sized Laravel project from MySQL to Postgres.

I don't remember any Eloquent-related code paths had to be touched. Only MySQL specific SQL statements (💥 ).

To sum it up: this kind of thinking is very much appreciated!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @deleugpn. What's the benefit of keeping a list of valid functions? In other words, how it would improve the user experience?

Having a list of valid functions or not, the error only would be triggered when the query is executed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@storrella if developers have a discrepancy between their production and dev environment, where they might run MySQL 8 locally and MySQL 5.6 in prod, they could run into issues. Those issues would be discovered at time of dev if you had a whitelist. Docker helps with this, but I don't think it's necessarily fair to assume a perfect dev setup either. Just feels like one of those things that's likely to go wrong when we let users input literally anything.

Copy link
Contributor

@deleugpn deleugpn Dec 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we end up whitelisting both the MySQL 8 function as well as the MySQL 5.6 and be back to square 1? I guess like @storrella I'm having difficulties understanding the benefit of this as well since it's on Eloquent side. This isn't on a driver-specific Grammar class, so there's no point in whitelisting anything because you would end up whitelisting every function of every supported driver. Whatever error you get here you would get with your database as well. The difference is we would need to whitelist functions and map to a specific error message. Instead, let's not whitelist anything and let the database explain what went wrong.

@@ -212,13 +279,22 @@ public function withCount($relations)
[$name, $alias] = [$segments[0], $segments[2]];
}

// Determine column to do the aggregate function
$segments = explode(':', $name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need explanation for these 4 lines or so... no idea what the purpose is and comment is not clear at all.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I thought the code was self-explanatory. The next lines give us an idea of what they do:

if (count($segments) === 2) {
    [$name, $aggregateColumn] = $segments;
}

The purpose is to extract the relation name and the column name. For example, calling User::withSum('posts:earnings') would mean that we want to retrieve users with a property called post_earnings_sum (that's the default name, it can be changed with AS something) whose value is the earnings column sum of the posts relation

@taylorotwell
Copy link
Member

I would also enjoy a good discussion of: if this is a package already, why should I merge it into the core and take ownership of the maintenance from this point forward?

phroggyy and others added 2 commits December 5, 2018 19:20
….php

Co-Authored-By: storrella <storrella@users.noreply.github.com>
Co-Authored-By: storrella <storrella@users.noreply.github.com>
@deleugpn
Copy link
Contributor

deleugpn commented Dec 5, 2018

My arguments for this in the core are the following:

1- There's no trustworthy package providing this. I don't know how long these packages would keep me from updating.

2- Being used to withCount, its very natural to try and reach for withSum. I have needed it enough to matter.

3- The implementation is not so different from withCount. In fact, its possible to hack withCount to act as anothet aggregator. See #25319 (comment)

It improves the Eloquent experience overall at a seemingly low cost. I have been using the hack since I saw that comment. It's verbose but better than bring in an external dependency.

@storrella
Copy link
Author

3- The implementation is not so different from withCount. In fact, its possible to hack withCount to act as anothet aggregator. See #25319 (comment)

That hack doesn't always work. One reason of this pull request was that it didn't work for me. I'm not 100% sure but I would bet that on hasManyThrough relations it doesn't return the aggregate result because intermediate tables are required and the primary key of these tables are appended to the select.

@taylorotwell
Copy link
Member

Honestly you're mainly just testing the most basic happy path usage. I suggest adding a lot more tests. Do a search on withCount() in the existing tests... there are tests with having conditions, soft deletes, joins, merging with existing wheres, etc.

This is where a lot of PRs introduce bugs - by just testing basic cases.

@storrella
Copy link
Author

Honestly you're mainly just testing the most basic happy path usage. I suggest adding a lot more tests. Do a search on withCount() in the existing tests... there are tests with having conditions, soft deletes, joins, merging with existing wheres, etc.

This is where a lot of PRs introduce bugs - by just testing basic cases.

Thanks for the feedback.
These weeks are very busy for me and I have little time to code. I'll mention you when I add more tests.

@taylorotwell
Copy link
Member

Just re-submit it when you have more tests. I don't like keeping PRs open in a limbo state. Thanks!

@williamcruzme
Copy link
Contributor

Added!

https://laravel.com/docs/8.x/eloquent-relationships#counting-related-models

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants