Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

some querys that has DISTINCT and ORDER BY should be invalid #4254

Closed
zz-jason opened this issue Aug 20, 2017 · 4 comments
Closed

some querys that has DISTINCT and ORDER BY should be invalid #4254

zz-jason opened this issue Aug 20, 2017 · 4 comments
Assignees
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. type/compatibility type/enhancement The issue or PR belongs to an enhancement.

Comments

@zz-jason
Copy link
Member

zz-jason commented Aug 20, 2017

1. What did you do?

drop table if exists t;
create table t(a bigint, b bigint, c bigint);
insert into t values(1, 2, 1), (1, 2, 2), (1, 3, 1), (1, 3, 2);
select distinct a, b from t order by c;

To order the result, duplicates must be eliminated first. But to do so, which row should we keep ? This choice influences the retained value of c, which in turn influences ordering and makes it arbitrary as well.

In MySQL, a query that has DISTINCT and ORDER BY is rejected as invalid if any ORDER BY expression does not satisfy at least one of these conditions:

  • The expression is equal to one in the select list
  • All columns referenced by the expression and belonging to the query's selected tables are elements of the select list

2. What did you expect to see?

MySQL > select distinct a, b from t order by c;
ERROR 3065 (HY000): Expression #1 of ORDER BY clause is not in SELECT list, references column 'test.t.c' which is not in SELECT list; this is incompatible with DISTINCT

3. What did you see instead?

TiDB > select distinct a, b from t order by c;
+------+------+
| a    | b    |
+------+------+
|    1 |    2 |
|    1 |    3 |
+------+------+
2 rows in set (0.00 sec)
TiDB > desc select distinct a, b from t order by c;
+---------------+--------------+---------------+------+--------------------------------------------------------------------------------------------------------------+-------+
| id            | parents      | children      | task | operator info                                                                                                | count |
+---------------+--------------+---------------+------+--------------------------------------------------------------------------------------------------------------+-------+
| TableScan_7   | HashAgg_6    |               | cop  | table:t, range:(-inf,+inf), keep order:false                                                                 |     4 |
| HashAgg_6     |              | TableScan_7   | cop  | type:complete, group by:test.t.a, test.t.b, funcs:firstrow(test.t.a), firstrow(test.t.b), firstrow(test.t.c) |     1 |
| TableReader_9 | HashAgg_8    |               | root | data:HashAgg_6                                                                                               |     1 |
| HashAgg_8     | Sort_4       | TableReader_9 | root | type:final, group by:, , funcs:firstrow(col_0), firstrow(col_1), firstrow(col_2)                             |     1 |
| Sort_4        | Projection_5 | HashAgg_8     | root | test.t.c:asc                                                                                                 |     1 |
| Projection_5  |              | Sort_4        | root | test.t.a, test.t.b                                                                                           |     1 |
+---------------+--------------+---------------+------+--------------------------------------------------------------------------------------------------------------+-------+
6 rows in set (0.00 sec)

4. What version of TiDB are you using (tidb-server -V)?

$./bin/tidb-server -V
Git Commit Hash: a0017eda04a1d48e9ec088457afe279a8cd064f4
UTC Build Time:  2017-08-19 01:35:40
@zz-jason zz-jason added type/compatibility type/enhancement The issue or PR belongs to an enhancement. todo labels Aug 20, 2017
@zz-jason zz-jason self-assigned this Aug 20, 2017
@winoros
Copy link
Member

winoros commented Dec 22, 2017

mysql> select distinct a, b from t order by c;
+------+------+
| a    | b    |
+------+------+
|    1 |    2 |
|    1 |    3 |
+------+------+
2 rows in set (0.00 sec)

Mysql version: Server version: 5.7.19 Homebrew

@zimulala
Copy link
Contributor

This specific behavior depends on sql_mode.

mysql> select @@sql_mode;
+--------------------+
| @@sql_mode         |
+--------------------+
| ONLY_FULL_GROUP_BY |
+--------------------+
1 row in set (0.00 sec)
mysql> select distinct a, b from t order by c;
ERROR 3065 (HY000): Expression #1 of ORDER BY clause is not in SELECT list, references column 'test.t.c' which is not in SELECT list; this is incompatible with DISTINCT

mysql> set @@sql_mode="";
Query OK, 0 rows affected (0.00 sec)

mysql> select @@sql_mode;
+------------+
| @@sql_mode |
+------------+
|            |
+------------+
1 row in set (0.00 sec)

mysql> select distinct a, b from t order by c;
+------+------+
| a    | b    |
+------+------+
|    1 |    2 |
|    1 |    3 |
+------+------+
2 rows in set (0.00 sec)

@zimulala zimulala added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Mar 20, 2018
@morgo
Copy link
Contributor

morgo commented Dec 10, 2018

Confirming that this issue is still present:

mysql> drop table if exists t;
Query OK, 0 rows affected (0.01 sec)

mysql> create table t(a bigint, b bigint, c bigint);
Query OK, 0 rows affected (0.02 sec)

mysql> insert into t values(1, 2, 1), (1, 2, 2), (1, 3, 1), (1, 3, 2);
Query OK, 4 rows affected (0.00 sec)

mysql> select distinct a, b from t order by c;
+------+------+
| a    | b    |
+------+------+
|    1 |    2 |
|    1 |    3 |
+------+------+
2 rows in set (0.00 sec)

@morgo
Copy link
Contributor

morgo commented Nov 3, 2021

This is no longer the case, so I think we can close this issue:

tidb>  select distinct a, b from t order by c;
ERROR 3065 (HY000): Expression #1 of ORDER BY clause is not in SELECT list, references column 'test.t.c' which is not in SELECT list; this is incompatible with DISTINCT
tidb> select tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v5.2.1
Edition: Community
Git Commit Hash: cd8fb24c5f7ebd9d479ed228bb41848bd5e97445
Git Branch: heads/refs/tags/v5.2.1
UTC Build Time: 2021-09-08 02:32:56
GoVersion: go1.16.4
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
1 row in set (0.00 sec)

However, the detection is not perfect. See: #29418 -- we can leave this new bug open as it is a different case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. type/compatibility type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

5 participants