Open
Description
What happens?
I tested duckdb v0.9.2 3c695d7ba9 with latest
linux_amd64_gcc4-extensions (https://github.com/duckdb/duckdb_mysql/actions/runs/6863109179
linux_amd64_gcc4-extensions) and mysql 8.0.35.
Currently performance in duckdb is lower compared to mysql client. Query results are correct, utf8mb4 works!
I tested:
select * from mysql.t2 where c1 = 1000000;
select count(*) from mysql.t2;
select count(c1) from mysql.t2;
LOAD './mysql_scanner.duckdb_extension';
ATTACH 'host=127.0.0.1 user=root password=root port=3306 database=test' AS mysql (TYPE mysql_scanner);
D describe mysql.t2;
┌─────────────┬─────────────┬─────────┬─────────┬─────────┬───────┐
│ column_name │ column_type │ null │ key │ default │ extra │
│ varchar │ varchar │ varchar │ varchar │ varchar │ int32 │
├─────────────┼─────────────┼─────────┼─────────┼─────────┼───────┤
│ c1 │ INTEGER │ NO │ │ │ │
│ c2 │ VARCHAR │ YES │ │ │ │
│ c3 │ TIMESTAMP │ YES │ │ │ │
└─────────────┴─────────────┴─────────┴─────────┴─────────┴───────┘
Run Time (s): real 0.111 user 0.077912 sys 0.003893
D select * from mysql.t2 where c1 = 1000000;
100% ▕████████████████████████████████████████████████████████████▏
┌─────────┬──────────┬─────────────────────┐
│ c1 │ c2 │ c3 │
│ int32 │ varchar │ timestamp │
├─────────┼──────────┼─────────────────────┤
│ 1000000 │ öäüßвгдж │ 2023-11-15 02:47:57 │
└─────────┴──────────┴─────────────────────┘
Run Time (s): real 3.149 user 0.901230 sys 0.061170
D select * from mysql.t2 where c1 = 0;
100% ▕████████████████████████████████████████████████████████████▏
┌───────┬──────────────────────────────┬─────────────────────┐
│ c1 │ c2 │ c3 │
│ int32 │ varchar │ timestamp │
├───────┼──────────────────────────────┼─────────────────────┤
│ 0 │ 😃🌀🌁🌂🌃🌄🌅🌆🌇🌈🌉🌊🌋🌌 │ 2023-11-15 03:12:51 │
└───────┴──────────────────────────────┴─────────────────────┘
Run Time (s): real 3.136 user 0.858150 sys 0.109318
select count(*) from mysql.t2;
┌──────────────┐
│ count_star() │
│ int64 │
├──────────────┤
│ 1048569 │
└──────────────┘
Run Time (s): real 0.431 user 0.094944 sys 0.011416
D select count(c1) from mysql.t2;
┌───────────┐
│ count(c1) │
│ int64 │
├───────────┤
│ 1048569 │
└───────────┘
Run Time (s): real 1.982 user 0.482755 sys 0.002354
D explain select * from mysql.t2 where c1 = 1000000;
┌─────────────────────────────┐
│┌───────────────────────────┐│
││ Physical Plan ││
│└───────────────────────────┘│
└─────────────────────────────┘
┌───────────────────────────┐
│ FILTER │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ (c1 = 1000000) │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ EC: 1 │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ MYSQL_SCAN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ t2 │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ c1 │
│ c2 │
│ c3 │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ EC: 1 │
└───────────────────────────┘
Run Time (s): real 0.007 user 0.003512 sys 0.000372
with mysql client:
mysql> show create table t2\G
*************************** 1. row ***************************
Table: t2
Create Table: CREATE TABLE `t2` (
`c1` int NOT NULL,
`c2` varchar(255) DEFAULT NULL,
`c3` datetime DEFAULT NULL,
PRIMARY KEY (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)
mysql> select * from t2 where c1=1000000;
+---------+------------------+---------------------+
| c1 | c2 | c3 |
+---------+------------------+---------------------+
| 1000000 | öäüßвгдж | 2023-11-15 02:47:57 |
+---------+------------------+---------------------+
1 row in set (0.00 sec)
mysql> select count(*) from t2;
+----------+
| count(*) |
+----------+
| 1048569 |
+----------+
1 row in set (0.07 sec)
mysql> select count(c1) from t2;
+-----------+
| count(c1) |
+-----------+
| 1048569 |
+-----------+
1 row in set (0.07 sec)
select c1 from t2 USE INDEX () where c1 = 1000000;
+---------+
| c1 |
+---------+
| 1000000 |
+---------+
1 row in set (0.25 sec)
mysql> select * from t2 where c1 = 0;
+----+----------------------------------------------------------+---------------------+
| c1 | c2 | c3 |
+----+----------------------------------------------------------+---------------------+
| 0 | 😃🌀🌁🌂🌃🌄🌅🌆🌇🌈🌉🌊🌋🌌 | 2023-11-15 03:12:51 |
+----+----------------------------------------------------------+---------------------+
1 row in set (0.00 sec)
mysql> explain select * from t2 where c1=1000000;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | t2 | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
It would be great if performance could be optimized a bit for primary key selects!
Copying data from mysql to duckdb was very fast, much faster than copying inside of mysql!
D create table t4 (c1 int, c2 varchar(255), c3 datetime);
Run Time (s): real 0.009 user 0.000372 sys 0.000038
D insert into t4 select * from mysql.t2;
Run Time (s): real 3.933 user 2.549588 sys 0.162101
D select count(*) from t4;
┌──────────────┐
│ count_star() │
│ int64 │
├──────────────┤
│ 1048570 │
└──────────────┘
Run Time (s): real 0.001 user 0.001245 sys 0.000123
mysql> create table t3 like t2;
Query OK, 0 rows affected (0.07 sec)
mysql> insert into t3 select * from t2;
Query OK, 1048570 rows affected (11.52 sec)
Records: 1048570 Duplicates: 0 Warnings: 0
To Reproduce
Create a table with 3 columns (int, varchar, datetime) and fill it with 1m rows.
Execute queries to select from primary key and run a count(*).
OS:
Ubuntu 22.04.3
MySQL Version:
8.0.35
DuckDB Version:
0.9.2
DuckDB Client:
cli
Full Name:
Thomas Bley
Affiliation:
myself
Have you tried this on the latest main
branch?
- I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- I agree
Metadata
Metadata
Assignees
Labels
No labels