Skip to content

Wrong results when grouping on a column with NULLs #782

Closed
@alamb

Description

@alamb

Describe the bug
Given a column with null values, if you group by that column the null values are lost

To Reproduce
Input:

 c1 
----
  0
  3
   
  1
  3
(5 rows)

And run SELECT COUNT(*), c1 FROM test GROUP BY c1

Actual result

        "+-----------------+----+",
        "| COUNT(UInt8(1)) | c1 |",
        "+-----------------+----+",
        "| 2               | 3  |",
        "| 2               | 0  |", <-- NOTE there is only a single actual value of 0 in the input (the other is NULL)
        "| 1               | 1  |",
        "+-----------------+----+",

Expected behavior
Here is the correct answer according to postgres (note the row for c1 is NULL):

alamb=# SELECT COUNT(*), c1 FROM test GROUP BY c1;
 count | c1 
-------+----
     1 |   
     2 |  3
     1 |  0
     1 |  1
(4 rows)

Additional context
Discovered while playing around with #781

Here is the entire reproducer in postgres

alamb=# drop table test;
DROP TABLE
alamb=# create table test (c1 int);
CREATE TABLE
alamb=# insert into test values(0);
INSERT 0 1
alamb=# insert into test values(3);
INSERT 0 1
alamb=# insert into test values(NULL);
INSERT 0 1
alamb=# insert into test values(1);
INSERT 0 1
alamb=# insert into test values(3);
INSERT 0 1
alamb=# SELECT COUNT(*), c1 FROM test GROUP BY c1;
 count | c1 
-------+----
     1 |   
     2 |  3
     1 |  0
     1 |  1
(4 rows)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions