Skip to content

Counting groupBy() queries returns wrong result #28931

Closed
@tomlankhorst

Description

@tomlankhorst
  • Laravel Version: 5.8.24
  • PHP Version: 7.3.5

Closed duplicates: #4306 #14123

Description:

Imagine a table bakeries which has an Eloquent model Bakery.

id city_id
1 40
2 50
3 50

A query builder's count() method uses SQL COUNT() to determine the row count.

SELECT COUNT(*) AS aggregate FROM bakeries

That will yield the correct row number: 3

However, when using groupBy('city_id'), the following query will be built:

SELECT COUNT(*) AS aggregate FROM bakeries GROUP BY city_id

As COUNT() is an aggregate function, it will be evaluated for every group. So the result in this case will be: [1, 2]. Then, count() will just pick the first result and return that.

... but this is not the count that users expect.

// assumption: 
assert($query->count() === count($query->get()));

Steps To Reproduce:

Bakery::get(); // returns 3 rows
Bakery::count(); // returns 3

Bakery::groupBy('city_id')->get(); // returns 2 rows
Bakery::groupBy('city_id')->count(); // returns 1! Should return 2. 

Possible fix:

To determine the number of resulting records of a group by query, we could use:

  1. subqueries
  2. window functions
  3. distinct count

of which the last is most wide-spread supported and compatible with the current query builder.

The implementation might look as follows:

--- vendor/laravel/framework/src/Illuminate/Database/Query/Builder.php	(date 1561315905000)
+++ vendor/laravel/framework/src/Illuminate/Database/Query/Builder.php	(date 1561315905000)
@@ -2450,7 +2451,15 @@
      */
     public function count($columns = '*')
     {
-        return (int) $this->aggregate(__FUNCTION__, Arr::wrap($columns));
+        $groups = $this->groups;
+
+        $query = $this->cloneWithout(['groups']);
+
+        if (!empty($groups) && $columns === '*') {
+            $columns = DB::raw('DISTINCT '.$this->implode(', ', $groups));
+        }
+
+        return (int) $query->aggregate(__FUNCTION__, Arr::wrap($columns));
     }
 
     /**

I'm not saying this is the way to go, just want to have a discussion going about the best solution to this situation.

Yes, this breaks. For now, I feel like we should add a note in the documentation. Something along the lines of:

Counting a groupBy()? drop the groupBy() call and use count(DB::raw('distinct group_col'))

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions