Skip to content

"ORDER BY" expressions behaviour compatible with MySQL? #499

@tomwilkie

Description

@tomwilkie

Running the following on MySQL 8.0.25 (on a Mac) works, but fails on the in-memory server:

mysql> CREATE DATABASE test;
Query OK, 1 row affected (0.00 sec)

mysql> USE test;
Database changed

mysql> CREATE TABLE test (
    ->     time TIMESTAMP,
    ->     value DOUBLE
    -> );
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO test VALUES
    ->    ("2021-07-04 10:00:00", 1.0),
    ->    ("2021-07-03 10:00:00", 2.0),
    ->    ("2021-07-02 10:00:00", 3.0),
    ->    ("2021-07-01 10:00:00", 4.0);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT
    ->   UNIX_TIMESTAMP(time) DIV 60 * 60 AS "time",
    ->   avg(value) AS "value"
    -> FROM test
    -> GROUP BY 1
    -> ORDER BY UNIX_TIMESTAMP(time) DIV 60 * 60;
+------------+-------+
| time       | value |
+------------+-------+
| 1625130000 |     4 |
| 1625216400 |     3 |
| 1625302800 |     2 |
| 1625389200 |     1 |
+------------+-------+
4 rows in set (0.01 sec)

Running on 8148809:

mysql> CREATE DATABASE test;
Query OK, 1 row affected (0.00 sec)

mysql> USE test;
Database changed

mysql> CREATE TABLE test (
    ->     time TIMESTAMP,
    ->     value DOUBLE
    -> );
Empty set (0.00 sec)

mysql> INSERT INTO test VALUES
    ->    ("2021-07-04 10:00:00", 1.0),
    ->    ("2021-07-03 10:00:00", 2.0),
    ->    ("2021-07-02 10:00:00", 3.0),
    ->    ("2021-07-01 10:00:00", 4.0);
Query OK, 4 rows affected (0.00 sec)

mysql> SELECT
    ->   UNIX_TIMESTAMP(time) DIV 60 * 60 AS "time",
    ->   avg(value) AS "value"
    -> FROM test
    -> GROUP BY 1
    -> ORDER BY UNIX_TIMESTAMP(time) DIV 60 * 60;
ERROR 1105 (HY000): unable to sort: incompatible conversion to SQL type: DATETIME

(For background this query is generated by the Grafana MySQL datasource).

Now it seems go-mysql-server is trying to apply UNIX_TIMESTAMP(time) DIV 60 * 60 in the ORDER BY clause to the time column aliased out of the SELECT, and not noticing the the definition of the ORDER BY expression is the same. This is failing as UNIX_TIMESTAMP expects a time.Time and not an int.

If you try and use a different column name and alias it to time (as Grafana expects) the problem might be more obvious:

(On MySQL)

mysql> CREATE DATABASE test;
Query OK, 1 row affected (0.00 sec)

mysql> USE test;
Database changed

mysql> CREATE TABLE test (
    ->     timestamp TIMESTAMP,
    ->     value DOUBLE
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO test VALUES
    ->    ("2021-07-04 10:00:00", 1.0),
    ->    ("2021-07-03 10:00:00", 2.0),
    ->    ("2021-07-02 10:00:00", 3.0),
    ->    ("2021-07-01 10:00:00", 4.0);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT
    ->   UNIX_TIMESTAMP(timestamp) DIV 60 * 60 AS "time",
    ->   avg(value) AS "value"
    -> FROM test
    -> GROUP BY 1
    -> ORDER BY UNIX_TIMESTAMP(timestamp) DIV 60 * 60;
+------------+-------+
| time       | value |
+------------+-------+
| 1625130000 |     4 |
| 1625216400 |     3 |
| 1625302800 |     2 |
| 1625389200 |     1 |
+------------+-------+
4 rows in set (0.00 sec)

(On go-sql-server):

mysql> CREATE DATABASE test;
Query OK, 1 row affected (0.00 sec)

mysql> USE test;
Database changed
mysql> CREATE TABLE test (
    ->     timestamp TIMESTAMP,
    ->     value DOUBLE
    -> );
Empty set (0.00 sec)

mysql> INSERT INTO test VALUES
    ->    ("2021-07-04 10:00:00", 1.0),
    ->    ("2021-07-03 10:00:00", 2.0),
    ->    ("2021-07-02 10:00:00", 3.0),
    ->    ("2021-07-01 10:00:00", 4.0);
Query OK, 4 rows affected (0.00 sec)

mysql> SELECT
    ->   UNIX_TIMESTAMP(timestamp) DIV 60 * 60 AS "time",
    ->   avg(value) AS "value"
    -> FROM test
    -> GROUP BY 1
    -> ORDER BY UNIX_TIMESTAMP(timestamp) DIV 60 * 60
    -> ;
ERROR 1105 (HY000): column "timestamp" could not be found in any table in scope

Its failing as the output of the SELECT doesn't have a timestamp column (its been renamed to time).

I'm not sure what the "correct" semantics should be - the reference docs seem to imply if should only column names, aliases and positions:

Columns selected for output can be referred to in ORDER BY and GROUP BY clauses using column names, column aliases, or column positions. Column positions are integers and begin with 1:

But then go on to say expressions are allowed:

MySQL resolves unqualified column or alias references in ORDER BY clauses by searching in the select_expr values, then in the columns of the tables in the FROM clause. For GROUP BY or HAVING clauses, it searches the FROM clause before searching in the select_expr values. (For GROUP BY and HAVING, this differs from the pre-MySQL 5.0 behavior that used the same rules as for ORDER BY.)

I wonder if, for Sort nodes, the order the columns are indexes in the resolver needs to be reversed?

https://github.com/dolthub/go-mysql-server/blob/master/sql/analyzer/resolve_columns.go#L482

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions