process cop request by tikv instance instead of region #19381
Description
Feature Request
Is your feature request related to a problem? Please describe:
Let's say we have an `order` table whose schema is:
create table order (
id bigint primary key auto_random, -- to avoid write hotspot
customer_id bigint,
...,
index idx_customer_id(customer_id)
);
A query like this:
select * from order where customer_id = ?;
As you can imagine, the optimal execution plan is an index lookup: scan all the matching `id` values (note the integer primary key in this table) through the index `idx_customer_id`, then read the table rows by the obtained `id`s. Since `id` is randomly generated, the obtained `id`s may be spread across many TiKV regions.
I've seen a user case with ~800 matching rows placed in ~800 regions. To gather all the data, we need ~800 cop requests in total, one for each involved region.
Describe the feature you'd like:
Typically, the number of regions is much larger than the number of TiKV instances. If we process cop requests by TiKV instance instead of by region, the number of cop requests can be greatly reduced and performance can also improve, because:
- the total network time between TiDB and TiKV can be reduced.
- the reduced cop task number can result in reduced cop wait time in a high QPS application.
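
To make the reduction concrete, here is a minimal sketch of the grouping step, written in Python for brevity. `RegionTask`, `store_id`, and `group_by_store` are hypothetical names for illustration, not TiDB APIs, and the region-to-store mapping is assumed to be already known to the client. The numbers (800 regions, 3 TiKV instances) are illustrative only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionTask:
    """One per-region cop task (hypothetical shape, not TiDB's real struct)."""
    region_id: int
    store_id: int  # TiKV instance currently believed to host the region leader


def group_by_store(tasks):
    """Batch per-region tasks so that each TiKV instance receives one request."""
    batches = defaultdict(list)
    for task in tasks:
        batches[task.store_id].append(task)
    return dict(batches)


# ~800 regions spread over 3 TiKV instances (illustrative numbers only).
tasks = [RegionTask(region_id=r, store_id=r % 3) for r in range(800)]
batches = group_by_store(tasks)
print(len(tasks), "per-region requests ->", len(batches), "per-store requests")
```

With this toy distribution, 800 per-region requests collapse into 3 per-store requests; the actual ratio depends on how the region leaders are spread across instances.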
The query optimizer may also need to be modified to adapt to this runtime optimization. For example, suppose the user queries the TopN on a table. The `TopN` operator may not be pushed down to the TiKV coprocessor if the query optimizer thinks that a single TiKV region cannot return more than `N` rows. Under the new cop task processing mechanism, the query optimizer must instead estimate whether a single TiKV instance, not a single TiKV region, can return more than `N` rows.
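
The change in the pushdown decision can be sketched as follows. This is not TiDB's cost model; `should_push_down_topn` and all the numbers are assumptions chosen only to show how the same rule gives a different answer once the unit of estimation changes from a region to a TiKV instance.

```python
def should_push_down_topn(estimated_rows_per_unit: float, n: int) -> bool:
    """Push TopN down only if one request unit may return more than n rows."""
    return estimated_rows_per_unit > n


# Hypothetical estimates: ~1 matching row per region, ~267 regions per instance.
rows_per_region = 1.0
regions_per_store = 267
n = 10

push_per_region = should_push_down_topn(rows_per_region, n)
push_per_store = should_push_down_topn(rows_per_region * regions_per_store, n)
print("per region:", push_per_region, "| per TiKV instance:", push_per_store)
```

Under these assumed estimates, pushdown is not worthwhile per region but becomes worthwhile per TiKV instance.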
As for the optimization itself, since the original region may be split or its leader may be transferred to another TiKV instance, the region-miss error and its retry strategy need to be carefully designed. We may push down pipeline breakers, which need to consume all the results of their child operators to complete their execution, like `TopN` and `Hash Aggregate`. So it's better to handle the region-miss error and retry on other TiKV instances on the TiKV side. The TiKV instance should guarantee to return the result if no error happens.
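
A toy model of why the retry has to complete before a pipeline breaker runs is sketched below. Everything here is hypothetical: `Store`, `RegionMiss`, `batch_topn`, and the `find_leader` callback are illustration-only names, the callback stands in for whatever leader lookup a real design would use, and region splits are not modeled.

```python
import heapq


class RegionMiss(Exception):
    """Raised when a store is asked for a region it does not lead (toy model)."""


class Store:
    def __init__(self, store_id, regions):
        self.store_id = store_id
        self.regions = regions  # region_id -> rows led by this store

    def scan(self, region_id):
        if region_id not in self.regions:
            raise RegionMiss(region_id)
        return self.regions[region_id]


def batch_topn(store, region_ids, n, find_leader):
    """Return the n smallest values over all requested regions.

    find_leader is a hypothetical callback returning the store that currently
    leads a region.
    """
    rows = []
    for region_id in region_ids:
        try:
            rows.extend(store.scan(region_id))
        except RegionMiss:
            # Retry on the current leader instead of returning a partial result.
            rows.extend(find_leader(region_id).scan(region_id))
    # TopN is applied only after every region's rows have been collected.
    return heapq.nsmallest(n, rows)


# Region 3's leader moved from store 1 to store 2 after the request was planned.
s1 = Store(1, {1: [50, 20], 2: [40]})
s2 = Store(2, {3: [10, 30]})
leaders = {1: s1, 2: s1, 3: s2}
result = batch_topn(s1, [1, 2, 3], 2, lambda region_id: leaders[region_id])
print(result)  # [10, 20]
```

If store 1 had applied `TopN` to only the regions it still leads, it would have returned `[20, 40]`, which is why the missing region has to be fetched before the operator produces its output.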
Describe alternatives you've considered:
N/A
Teachability, Documentation, Adoption, Migration Strategy:
N/A
SIG slack channel
Score
- 300