This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

Too many empty regions after restoration of many small tables #1374

Closed
YuJuncen opened this issue Jul 20, 2021 · 2 comments · Fixed by pingcap/tidb#27240

Comments

@YuJuncen
Collaborator

YuJuncen commented Jul 20, 2021

  1. What did you do?
    Restored a 190 GB workload with 6000 tables. According to the backup meta, about 42K files were backed up.

  2. What did you expect to see?
    The restore completes with fewer than 42K regions (otherwise we almost certainly get empty regions), and the number of empty regions stays at a reasonable value.

  3. What did you see instead?
    See figures below.

Figure 1. too many unhealthy regions
image
Figure 2. too many regions
image

  4. What version of BR and TiDB/TiKV/PD are you using?

Nightly (2020-7-20)

  5. Operation logs

(The log is too large to be uploaded... Too many write conflicts when creating tables concurrently...)

@YuJuncen YuJuncen added type/bug Something isn't working component/restore labels Jul 20, 2021
@YuJuncen
Collaborator Author

A region whose size is less than or equal to 1M bytes is treated as an empty region.

According to the TiKV log, with the many-small-tables workload, the reported sizes are very small (only 2 of 10 regions are larger than 1M):

image
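
The rule above can be sketched as a simple threshold check. This is only an illustration: the name `countEmpty`, the constant `emptyRegionThreshold`, the reading of "1M" as 1 MiB, and the sample sizes are all assumptions, not values taken from PD or from the log.

```go
package main

import "fmt"

// emptyRegionThreshold is an assumed reading of the "1M bytes" rule as 1 MiB.
const emptyRegionThreshold uint64 = 1 << 20

// countEmpty returns how many of the given approximate region sizes (in bytes)
// are less than or equal to the threshold and would therefore count as empty.
func countEmpty(sizes []uint64) int {
	n := 0
	for _, s := range sizes {
		if s <= emptyRegionThreshold {
			n++
		}
	}
	return n
}

func main() {
	// Hypothetical sizes: 10 regions, only 2 of them larger than 1 MiB.
	sizes := []uint64{
		12 << 10, 300 << 10, 3 << 20, 64 << 10, 900 << 10,
		5 << 20, 1 << 10, 512 << 10, 700 << 10, 40 << 10,
	}
	fmt.Printf("%d of %d regions count as empty\n", countEmpty(sizes), len(sizes))
}
```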

@YuJuncen
Collaborator Author

YuJuncen commented Jul 21, 2021

Currently, we split regions at two classes of keys: the new key of each rewrite rule (t{new_table_id}) and the end key of each file. In general, this does two things:

  1. Split at the start key and the last backed-up key of the table, so that no records from other tables share a region with this table's records.
0|-----|----------------------|---------|∞
       ^                      ^(t{new_table_id}_r{last_record_key_backed_up})
       +(t{new_table_id})
  2. Then split internally, according to the file sizes of this table.
0|-[1]-|-96M-|-96M-|-96M-|-35M|---[2]---|∞
       ^                      ^(t{new_table_id}_r{last_record_key_backed_up})
       +(t{new_table_id})

This method is the safest: for performance, each download RPC can take only one rewrite rule, and since tables never overlap within a region, each download RPC can always choose the rewrite rule of the table (or the index) being restored.

However, this method creates more empty regions. In the example, the two regions [0, t{new_table_id}) ([1]) and [t{new_table_id}_r{last_record_key_backed_up}, ∞) ([2]) end up empty.
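
A minimal sketch of how the split keys described above could be collected, for illustration only. It is not BR's actual code: the `Table` type, `tablePrefix`, `splitKeys`, and the zero-padded string keys are hypothetical simplifications (real TiDB keys are binary-encoded).

```go
package main

import (
	"fmt"
	"sort"
)

// Table is a hypothetical, simplified view of one restored table:
// its new table ID and the end key of every backed-up file.
type Table struct {
	NewID       int
	FileEndKeys []string
}

// tablePrefix builds the simplified key t{new_table_id}.
// IDs are zero-padded so that string order matches numeric order.
func tablePrefix(id int) string { return fmt.Sprintf("t%04d", id) }

// splitKeys returns the sorted, de-duplicated split keys:
// each table's prefix key plus the end key of each of its files.
func splitKeys(tables []Table) []string {
	seen := make(map[string]bool)
	var keys []string
	add := func(k string) {
		if !seen[k] {
			seen[k] = true
			keys = append(keys, k)
		}
	}
	for _, t := range tables {
		add(tablePrefix(t.NewID))
		for _, k := range t.FileEndKeys {
			add(k)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// One table with four files, mirroring the 96M/96M/96M/35M figure.
	tables := []Table{{
		NewID:       1,
		FileEndKeys: []string{"t0001_r0100", "t0001_r0200", "t0001_r0300", "t0001_r0350"},
	}}
	keys := splitKeys(tables)
	fmt.Println("split keys:", keys)
	// n split keys cut a previously unsplit key space into n+1 regions.
	fmt.Println("regions:", len(keys)+1)
}
```

In this sketch the single table yields five split keys and therefore six regions, of which only four hold data; the first and the last correspond to [1] and [2] in the figure.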

Maybe the split at either the last key or the first key of each table can be omitted. In the many-table workload, if the former table has already split at its end, then the latter table can reuse the region at [2], and vice versa.

0|-----|-T1--|-T1--|-T1--|-T1----|-T2-|∞
       ^                      ^  ^(t2)
       |                      +(t1_r{last_record_key_backed_up})
       +(t1)

This figure shows that, after omitting the split at the last key of each table, the property that each table owns its regions exclusively can probably still be kept.
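
To make the effect concrete, the following rough sketch counts regions under both strategies. It is a back-of-the-envelope model, not BR code: `regionCount` is a hypothetical helper, and it assumes the key space starts as a single region and that all split keys are distinct.

```go
package main

import "fmt"

// regionCount estimates how many regions exist after splitting, given the
// number of backed-up files per table. Each table contributes one split at
// its prefix key and one split per file end key. With omitLast set, the end
// key of the last file of each table is not used as a split key.
func regionCount(filesPerTable []int, omitLast bool) int {
	splits := 0
	for _, files := range filesPerTable {
		splits++ // split at the table prefix key
		splits += files
		if omitLast && files > 0 {
			splits-- // skip the split at the last backed-up key
		}
	}
	// n distinct split keys cut one region into n+1 regions.
	return splits + 1
}

func main() {
	// T1 with four files and T2 with one file, as in the figure above.
	files := []int{4, 1}
	fmt.Println("split at last key:", regionCount(files, false))
	fmt.Println("last key omitted: ", regionCount(files, true))
}
```

For the two tables in the figure, this model gives 8 regions when every table also splits at its last key and 6 when that split is omitted, that is, one region fewer per table.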
