Optimizer performance of APPROX_TOP_K function

## Enhancement

It can be used to calculate top-k from a large dataset quickly, which is expected to be much faster than plain TOP-K.

But actually it's not, and even slower than the GROUP-BY. Calculate top-k from ssb.lineorder:
- GROUP-BY: 1.413s
- APPROX_TOP_K: 54.462s

```sql
MySQL root@127.1:ssb> select lo_orderkey, count(*) from lineorder group by lo_orderkey order by count(*) limit 5;
+-------------+----------+
| lo_orderkey | count(*) |
+-------------+----------+
| 30182       | 4        |
| 25441       | 4        |
| 13888       | 4        |
| 27172       | 4        |
| 1990        | 4        |
+-------------+----------+
5 rows in set
Time: 1.413s
MySQL root@127.1:ssb> select APPROX_TOP_K(lo_orderkey) from lineorder;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| approx_top_k(lo_orderkey)                                                                                                                                    |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| [{"item":5998592,"count":32568},{"item":5988930,"count":32568},{"item":5977440,"count":32568},{"item":5995557,"count":32564},{"item":5991074,"count":32564}] |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
Time: 54.462s

```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimizer performance of APPROX_TOP_K function #51834

Enhancement

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development