This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

restore: less split during restoring #1377

Closed
wants to merge 7 commits into from


YuJuncen
Collaborator

What problem does this PR solve?

Partially fixes #1374

What is changed and how it works?

before:
--|-------t1 data-------|-----|---t2 data-------|
after: 
----------t1 data-------|---------t2 data-------|

Legend:
'|' the split point
'-' the key space
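
To make the diagram concrete, here is a minimal sketch of the two ways of choosing split points. It is only an illustration of the picture above, not BR's actual code: `KeyRange`, `splitKeysBothEnds` and `splitKeysEndOnly` are hypothetical names, and real keys are encoded differently from the readable strings used in the usage note.

```go
package main

import (
	"bytes"
	"sort"
)

// KeyRange is a hypothetical half-open range [Start, End) covering one table's data.
type KeyRange struct {
	Start, End []byte
}

// sortedUnique sorts keys bytewise and drops duplicates.
func sortedUnique(keys [][]byte) [][]byte {
	sort.Slice(keys, func(i, j int) bool { return bytes.Compare(keys[i], keys[j]) < 0 })
	out := make([][]byte, 0, len(keys))
	for _, k := range keys {
		if len(out) == 0 || !bytes.Equal(k, out[len(out)-1]) {
			out = append(out, k)
		}
	}
	return out
}

// splitKeysBothEnds models the "before" row: a split point at both the start
// and the end of every range, so each gap between tables becomes its own region.
func splitKeysBothEnds(ranges []KeyRange) [][]byte {
	keys := make([][]byte, 0, 2*len(ranges))
	for _, r := range ranges {
		keys = append(keys, r.Start, r.End)
	}
	return sortedUnique(keys)
}

// splitKeysEndOnly models the "after" row: a split point only at the end of
// every range, so the gap before a table is merged into that table's region.
func splitKeysEndOnly(ranges []KeyRange) [][]byte {
	keys := make([][]byte, 0, len(ranges))
	for _, r := range ranges {
		keys = append(keys, r.End)
	}
	return sortedUnique(keys)
}
```

For two disjoint ranges such as [t1_a, t1_z) and [t2_a, t2_z), `splitKeysBothEnds` returns four split points (five regions, two of them holding no table data), while `splitKeysEndOnly` returns two split points (three regions).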

Also, downloadSST now looks up rewrite rules by file instead of by region start key.
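
A rough sketch of what looking up a rewrite rule by file could look like follows. The motivation is presumably that, with fewer splits, a region may start in the gap before a table's data, so its start key no longer has to carry that table's prefix. `RewriteRule`, `ruleForFile` and `rewriteKey` are hypothetical names for illustration and are not taken from BR's code.

```go
package main

import "bytes"

// RewriteRule is a hypothetical rule that maps keys under OldPrefix to NewPrefix.
type RewriteRule struct {
	OldPrefix, NewPrefix []byte
}

// ruleForFile picks the rule whose old prefix covers both ends of the file's
// key range. It ignores the region the file is imported into.
func ruleForFile(fileStart, fileEnd []byte, rules []RewriteRule) (RewriteRule, bool) {
	for _, r := range rules {
		if bytes.HasPrefix(fileStart, r.OldPrefix) && bytes.HasPrefix(fileEnd, r.OldPrefix) {
			return r, true
		}
	}
	return RewriteRule{}, false
}

// rewriteKey replaces the rule's old prefix on key with the new prefix.
// The caller must ensure key actually starts with r.OldPrefix.
func rewriteKey(key []byte, r RewriteRule) []byte {
	out := make([]byte, 0, len(r.NewPrefix)+len(key)-len(r.OldPrefix))
	out = append(out, r.NewPrefix...)
	return append(out, key[len(r.OldPrefix):]...)
}
```

With rules `t1_` → `t10_` and `t2_` → `t20_`, a file spanning `t2_r1` to `t2_r9` resolves to the second rule whatever the start key of the target region is. A file whose range crosses from `t1_` into `t2_` matches no rule.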

Check List

Tests

image

Release note

  • BR now splits fewer regions during restore, reducing the number of empty regions left after restoration.

@ti-chi-bot
Member

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@3pointer
Collaborator

LGTM, but I think we need more tests on this fix. @fubinzh

@YuJuncen
Collaborator Author

YuJuncen commented Aug 4, 2021

Note: This optimization assumes that every file contains data from only one table or one index. If this precondition is violated, data loss may occur.

Suppose data keys and index keys end up in the same file (for example, t1_i1_d and t1_r1). In the past, we would split at t1_i and t1_r and import this file twice, once into [t1_i1, t1_r) and once into [t1_r, ...), which sounds inefficient but did the right thing. With this change, things go wrong: only the [t1_i1, t1_r) part would be imported and the t1_r1 part wouldn't be rewritten, or even worse, an unordered SST (sorted string table) file would be ingested.

This defect doesn't affect the current BR, because we create separate backup files for each index and for the row data. But we do need to change the way we validate files.
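
One possible shape for such a check is sketched below. It is an assumption about what the validation could look like, not the check BR ships: `objectPrefix` and `validateFileRange` are hypothetical helpers, the file's end key is assumed to be the largest key actually stored in it, and keys are assumed to follow the usual TiDB layout of `t` + 8-byte table ID + `_r` for rows, or `t` + 8-byte table ID + `_i` + 8-byte index ID for indexes.

```go
package main

import (
	"bytes"
	"errors"
)

const (
	idLen          = 8                 // encoded table ID or index ID
	rowPrefixLen   = 1 + idLen + 2     // 't' + table ID + "_r"
	indexPrefixLen = rowPrefixLen + idLen // 't' + table ID + "_i" + index ID
)

var errMixedFile = errors.New("file spans more than one table or index")

// objectPrefix returns the part of key that identifies the single table
// (for row keys) or the single index (for index keys) it belongs to.
func objectPrefix(key []byte) ([]byte, error) {
	if len(key) < rowPrefixLen || key[0] != 't' {
		return nil, errors.New("not a table key")
	}
	switch string(key[1+idLen : rowPrefixLen]) {
	case "_r":
		return key[:rowPrefixLen], nil
	case "_i":
		if len(key) < indexPrefixLen {
			return nil, errors.New("index key too short")
		}
		return key[:indexPrefixLen], nil
	default:
		return nil, errors.New("unknown key kind")
	}
}

// validateFileRange rejects a file whose first and last keys belong to
// different tables, different indexes, or a mix of index and row data.
func validateFileRange(startKey, endKey []byte) error {
	sp, err := objectPrefix(startKey)
	if err != nil {
		return err
	}
	ep, err := objectPrefix(endKey)
	if err != nil {
		return err
	}
	if !bytes.Equal(sp, ep) {
		return errMixedFile
	}
	return nil
}
```

Under these assumptions, the t1_i1_d / t1_r1 file from the example above would be rejected with `errMixedFile` before any split or download happens, instead of being partially imported.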

@ti-chi-bot
Member

@YuJuncen: PR needs rebase.


@YuJuncen
Collaborator Author

Closing this due to pingcap/tidb#27240.

Development

Successfully merging this pull request may close these issues.

Too many empty region after restoration of many small tables