-
Notifications
You must be signed in to change notification settings - Fork 319
Closed
Labels
Improvementcomplex-queryresolveproblem has been fixed by developerproblem has been fixed by developer
Milestone
Description
Sometimes, a complex query are not a real complex query for query plan:
A. For exmaple :
schema.xml
<schema name="testdb">
<table name="sharding_two_node" dataNode="dn1,dn2" rule="two_node_hash"/>
<table name="sharding_two_node2" dataNode="dn1,dn2" rule="two_node_hash"/>
</schema>
rule.xml
<tableRule name="two_node_hash">
<rule>
<columns>id</columns>
<algorithm>two_node_hash</algorithm>
</rule>
</tableRule>
<function name="two_node_hash" class="Hash">
<property name="partitionCount">2</property>
<property name="partitionLength">512</property>
</function>
- steps:
a query is that
explain select * from sharding_two_node a join sharding_two_node2 b on a.c_flag=b.c_flag where a.id =1 and b.id=2;
- expect result:
mysql> explain select * from sharding_two_node a join sharding_two_node2 b on a.c_flag=b.c_flag where a.id =1 and b.id=2;
+-----------+----------+-----------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+-----------+----------+-----------------------------------------------------------------------------------------------------------+
| dn1 | BASE SQL | select * from sharding_two_node a join sharding_two_node2 b on a.c_flag=b.c_flag where a.id =1 and b.id=2 |
+-----------+----------+-----------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
- real result:
mysql> explain select * from sharding_two_node a join sharding_two_node2 b on a.c_flag=b.c_flag where a.id =1 and b.id=2;
+-----------------+---------------+----------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | TYPE | SQL/REF |
+-----------------+---------------+----------------------------------------------------------------------------------------------------------------------+
| dn1_0 | BASE SQL | select `a`.`id`,`a`.`c_flag`,`a`.`c_decimal` from `sharding_two_node` `a` where a.id = 1 ORDER BY `a`.`c_flag` ASC |
| merge_1 | MERGE | dn1_0 |
| shuffle_field_1 | SHUFFLE_FIELD | merge_1 |
| dn1_1 | BASE SQL | select `b`.`id`,`b`.`c_flag`,`b`.`c_decimal` from `sharding_two_node2` `b` where b.id = 2 ORDER BY `b`.`c_flag` ASC |
| merge_2 | MERGE | dn1_1 |
| shuffle_field_3 | SHUFFLE_FIELD | merge_2 |
| join_1 | JOIN | shuffle_field_1; shuffle_field_3 |
| shuffle_field_2 | SHUFFLE_FIELD | join_1 |
+-----------------+---------------+----------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.01 sec)
B. Now,we will try to optimize it.
These "Complex Queries" can be sent to one datanode :
- explain select count(*) from sharding_two_node a where a.id =1;
- explain select * from sharding_two_node a join sharding_two_node2 b on a.id = b.id where a.id =1 and b.id=1;
- explain select * from sharding_two_node a join testdb.sharding_two_node2 b on a.c_flag=b.c_flag where a.id =1 and b.id=2;
- explain select * from sharding_two_node a join sharding_two_node2 b on a.c_flag=b.c_flag where (a.id =1 and b.id=1) or (a.id =2 and b.id=2);
- explain select * from sharding_two_node a join sharding_two_node2 b on a.id = b.id where a.id =1;
- explain select * from sharding_two_node a join sharding_two_node2 b where a.id = b.id and a.id =1;
- explain select * from sharding_two_node a join sharding_two_node2 b where a.id =b.id and (( a.id =1 and b.id=1) or ( a.c_flag=b.c_flag and a.id =2 ));
- explain select * from sharding_two_node a join sharding_two_node2 b where a.id =b.id and a.c_decimal=1 and (( a.id =1 and b.id=1) or ( a.c_flag=b.c_flag and a.id =2 ));
- explain select * from sharding_two_node a join sharding_two_node2 b where a.id =b.id and (a.c_decimal=1 and (( a.id =1 and b.id=1) or ( a.c_flag=b.c_flag and a.id =2 )));
- explain select * from sharding_two_node where id =1 and c_flag = (select c_flag from sharding_two_node2 where id =1 );
- explain select * from sharding_two_node a where a.id =1 and exists(select * from sharding_two_node2 b where a.c_flag=b.c_flag and b.id =1);
- explain select * from sharding_two_node where id =1 union select * from sharding_two_node2 where id =1 ;
C. These queries are not able to send one datanode:
- explain select * from sharding_two_node a join sharding_two_node2 b on a.c_flag=b.c_flag where a.id =1 or b.id=1;
- explain select * from sharding_two_node a join sharding_two_node2 b on a.c_flag=b.c_flag where (a.id =1 and b.id=1) or (a.id =513 and b.id=513);
- explain select * from sharding_two_node a join sharding_two_node2 b where (a.id = b.id and a.id =1 and b.id=1) or ( a.c_flag=b.c_flag and a.id =2 );
- explain select * from sharding_two_node a join sharding_two_node2 b where a.c_flag=b.c_flag and a.id =2 ;
- explain select * from sharding_two_node a join sharding_two_node2 b where a.id =b.id and (a.c_decimal=1 or (( a.id =1 and b.id=1) or ( a.c_flag=b.c_flag and a.id =2 )));
- explain select * from sharding_two_node where c_flag = (select c_flag from sharding_two_node2 where id =1 );
- explain select * from sharding_two_node where id =1 union select * from sharding_two_node2;
D. if the schema.xml is like this:
<schema name="testdb">
<table name="sharding_two_node" dataNode="dn1,dn2" rule="two_node_hash"/>
</schema>
<schema name="testdb2">
<table name="sharding_two_node2" dataNode="dn1,dn2" rule="two_node_hash"/>
</schema>
These queries should be optimized:
- explain select * from
testdb
.sharding_two_node a join testdb2.sharding_two_node2
b on a.id = b.id where a.id =1 and b.id=1; - explain select * from testdb.sharding_two_node a join testdb2.sharding_two_node2 b on a.id = b.id where a.id =1 and b.id=1;
- use testdb2;explain select * from testdb.sharding_two_node a join sharding_two_node2 b on a.id = b.id where a.id =1 and b.id=1;
- use testdb;explain select * from sharding_two_node a join testdb2.sharding_two_node2 b on a.id = b.id where a.id =1 and b.id=1;
Metadata
Metadata
Assignees
Labels
Improvementcomplex-queryresolveproblem has been fixed by developerproblem has been fixed by developer