Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[7.x] Run pagination count as subquery for group by and havings #32624

Merged
merged 4 commits into from
May 1, 2020

Conversation

taylorotwell
Copy link
Member

Paginating queries with groupBy or having statements is a long-standing issue in Laravel going back to the very beginning of the framework with literal dozens of raised issues:

#1892, #2761, #3105, #4306, #6985, #7372, #9567, #10632, #14123, #16320, #17406, #22883, #28931

This solution was suggested @acasar years ago but I wrote it off at the time - but honestly I think it's a lot better than what we have now so I'm bringing it up again for consideration.

👀

@taylorotwell
Copy link
Member Author

/cc @staudenmeir 🤷‍♂️

Comment on lines 2260 to 2261
->from(new Expression('('.$this->toSql().') as '.$this->grammar->wrap('aggregate_table')))
->mergeBindings($this)
Copy link
Contributor

@staudenmeir staudenmeir May 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
->from(new Expression('('.$this->toSql().') as '.$this->grammar->wrap('aggregate_table')))
->mergeBindings($this)
->from($this, 'aggregate_table')

@reinink
Copy link
Contributor

reinink commented May 1, 2020

Dig this solution! 👍

@yateric
Copy link
Contributor

yateric commented May 1, 2020

How about this?

public function getCountForPagination($columns = ['*'])
{
    if ($this->groups || $this->havings) {
        return $this->newQuery()->from($this)->count();
    }
    ...
}

@staudenmeir
Copy link
Contributor

We also need to remove ORDER BY, LIMIT and OFFSET clauses here as we do now. SQL Server doesn't allow ORDER BY clauses in subqueries like this.

In theory, we could just use the subquery for all cases, but I'm not sure if all databases optimize it the same way.

@reinink
Copy link
Contributor

reinink commented May 1, 2020

In theory, we could just use the subquery for all cases, but I'm not sure if all databases optimize it the same way.

I was thinking the same thing. A quick test in MySQL and Postgres suggests there is no performance hit doing it this way, at least in my trivial example.

@driesvints driesvints changed the title Run pagination count as subquery for group by and havings [7.x] Run pagination count as subquery for group by and havings May 1, 2020
return [(object) ['aggregate' => $this->newQuery()
->from(new Expression('('.$this->toSql().') as '.$this->grammar->wrap('aggregate_table')))
->mergeBindings($this)
->count(['*']), ]];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be faster to count the ID if there's one? instead of counting all columns?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't always know which columns are actually available. There is also a difference between count(*) and count(id): https://stackoverflow.com/a/3003533/4848587

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the $columns param is being passed by not used, maybe it should be
->count($columns), ]];

@taylorotwell taylorotwell merged commit 4469c37 into 7.x May 1, 2020
@GrahamCampbell GrahamCampbell deleted the paginate-group-by-having branch May 1, 2020 18:32
@royduin
Copy link
Contributor

royduin commented May 5, 2020

Awesome! I've added a notice of this in the readme of https://github.com/justbetter/laravel-pagination-with-havings

@hotmeteor
Copy link
Contributor

hotmeteor commented May 5, 2020

This solution ended up breaking existing queries I have that utilize sub-selects and a ->groupBy. Ultimately, all of the bindings were offset incorrectly and resulted in my query having one less binding than it should any time I use ->addSelect or ->selectSub.

I had to revert to 7.9.2 to resolve it.

@reinink
Copy link
Contributor

reinink commented May 5, 2020

@hotmeteor Weird, I wonder why the bindings are being offset incorrectly. Really, this should have no bearing on that, at least from what I can tell, since no new additional bindings were added. When you say "all of the bindings were offset incorrectly", what do you mean exactly? Were there new bindings added the beginning?

@hotmeteor
Copy link
Contributor

@reinink I'm not totally sure, TBH. I'm playing around with it locally.

Take for example this set of bindings (dumped from the Builder class):

array:9 [
  "select" => array:6 [
    0 => "Federal Income Tax"
    1 => "Federal Income Tax"
    2 => "Social Security Tax"
    3 => "Social Security Tax"
    4 => "Medicare Tax"
    5 => "Medicare Tax"
  ]
  "from" => []
  "join" => []
  "where" => array:4 [
    0 => "federal"
    1 => "%FederalIncome%"
    2 => "2018-05-01"
    3 => "2018-05-31"
  ]
  "groupBy" => []
  "having" => []
  "order" => []
  "union" => []
  "unionOrder" => []
]

When the query runs it's like all of the select bindings just get nuked, and all the where bindings are inserted in the wrong spot, resulting in an error like:

SQLSTATE[08P01]: <<Unknown error>>: 7 ERROR:  bind message supplies 4 parameters, but prepared statement "pdo_stmt_0000000c" requires 10

@hotmeteor
Copy link
Contributor

@reinink I figured it out.

The existing return on line 2261 is using the ->count() aggregate method, but that method strips out all of the $columns and selects, which are necessary for the bindings and the group by.

The following does restore functionality, but I haven't yet confirmed if it returns the correct counts:

return [(object) ['aggregate' => $this->newQuery()
                        ->from(new Expression('('.$clone->toSql().') as '.$this->grammar->wrap('aggregate_table')))
                        ->mergeBindings($clone)
                        ->setAggregate('count', $this->withoutSelectAliases($columns))
                        ->get()->all(), ]];

@hotmeteor
Copy link
Contributor

Here's a PR with the suggested fix: #32688

@patrickomeara
Copy link
Contributor

patrickomeara commented May 5, 2020

@hotmeteor I have hit this issue as well.

I'm also getting

SQLSTATE[42S21]: Column already exists: 1060 Duplicate column name 'id' (SQL: select count() as aggregate from (select from post inner join post as child on post.id = child.parent_id where child.foo != bar group by post.id) as aggregate_table)
Post::query()
    ->join('post as child', 'post.id', '=', 'child.parent_id')
    ->where('child.foo', '!=', 'bar')
    ->groupBy('post.id');

Adding ->select('post.*') fixes this issue.

@hotmeteor
Copy link
Contributor

@patrickomeara Hey Patrick. Unfortunately, not the same issue. This particular issue was that all select bindings where being stripped by the count method, but the select query portions were not removed, causing a conflict. Selects are actually necessary for groupBy, so the fix was to retain both the selects and the bindings.

@patrickomeara
Copy link
Contributor

@hotmeteor Sorry my comment wasn't clear. I have hit the issue you have created the PR for. Thanks for that.

Aside from that I am also getting the above issue after updating to 7.10. However, this PR doesn't look like the culprit. I will keep digging. :)

@hotmeteor
Copy link
Contributor

@patrickomeara Ah, got it! My bad... I didn't read carefully.

Did your query work before 7.10? That seems like typically behavior because the join introduces another id from a new table, so you need to be specific.

@patrickomeara
Copy link
Contributor

patrickomeara commented May 5, 2020

Yeah it works with 7.9.2 as it isn't wrapped

7.9.2

select count(*) as aggregate 
from `post` inner join `post` as `child` on `post`.`id` = `child`.`parent_id` 
where `post`.`foo` != 'bar' group by `post`.`id`

7.10

select count(*) as aggregate from (
    select * from `post` 
    inner join `post` as `child` on `post`.`id` = `child`.`parent_id` 
    where `post`.`foo` != 'bar'
    group by `post`.`id`
) as `aggregate_table`

So this PR was the issue, it actually happens on any joins where there is duplicate column name without specifying a table in the select, not just joining itself.

I think this needs to be dealt with, perhaps if no selects are set, a default primary_table.* is set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants