BR: reorganize content about BR tool (pingcap#4810) (pingcap#4848)
* cherry pick pingcap#4810 to release-5.0

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

* Update backup-and-restore-tool.md

Co-authored-by: Keke Yi <40977455+yikeke@users.noreply.github.com>
Co-authored-by: yikeke <yikeke@pingcap.com>
3 people authored Feb 20, 2021
1 parent 96ff13f commit 58ec534
Showing 9 changed files with 593 additions and 622 deletions.
9 changes: 5 additions & 4 deletions TOC.md
@@ -55,10 +55,9 @@
+ [Use TiDB Operator](https://docs.pingcap.com/tidb-in-kubernetes/v1.1/scale-a-tidb-cluster)
+ Backup and Restore
+ Use BR Tool (Recommended)
+ [Use BR Tool](/br/backup-and-restore-tool.md)
+ [BR Tool Overview](/br/backup-and-restore-tool.md)
+ [Use BR Command-line](/br/use-br-command-line-tool.md)
+ [BR Use Cases](/br/backup-and-restore-use-cases.md)
+ [BR Storages](/br/backup-and-restore-storages.md)
+ [Use Dumpling and TiDB Lightning](/backup-and-restore-using-dumpling-lightning.md)
+ [Read Historical Data](/read-historical-data.md)
+ [Configure Time Zone](/configure-time-zone.md)
+ [Daily Checklist](/daily-check.md)
@@ -152,8 +151,10 @@
+ [Use Cases](/ecosystem-tool-user-case.md)
+ [Download](/download-ecosystem-tools.md)
+ Backup & Restore (BR)
+ [Use BR Tool](/br/backup-and-restore-tool.md)
+ [BR Tool Overview](/br/backup-and-restore-tool.md)
+ [Use BR Command-line for Backup and Restoration](/br/use-br-command-line-tool.md)
+ [BR Use Cases](/br/backup-and-restore-use-cases.md)
+ [BR Storages](/br/backup-and-restore-storages.md)
+ [BR FAQ](/br/backup-and-restore-faq.md)
+ TiDB Binlog
+ [Overview](/tidb-binlog/tidb-binlog-overview.md)
6 changes: 2 additions & 4 deletions backup-and-restore-using-dumpling-lightning.md
@@ -5,11 +5,9 @@ summary: Introduce how to use Dumpling and TiDB Lightning to back up and restore full data of TiDB.

# Use Dumpling and TiDB Lightning for Data Backup and Restoration

> **Note:**
> **Warning:**
>
> PingCAP previously maintained a fork of the [mydumper project](https://github.com/maxbube/mydumper) with enhancements specific to TiDB. This fork has since been replaced by [Dumpling](/dumpling-overview.md), which has been rewritten in Go and supports more optimizations specific to TiDB. It is strongly recommended that you use Dumpling instead of mydumper.
>
> For how to perform backup and restore using Mydumper/TiDB Lightning, refer to [v4.0 documentation](https://docs.pingcap.com/tidb/v4.0/backup-and-restore-using-mydumper-lightning).
> It is no longer recommended to use Dumpling and TiDB Lightning for data backup and restoration. It is strongly recommended that you use the [BR tool](/br/backup-and-restore-tool.md) instead, which provides a better tool experience.

This document introduces in detail how to use Dumpling and TiDB Lightning to back up and restore full data of TiDB. For incremental backup and replication to downstream, refer to [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md).
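
For reference, a minimal Dumpling export might look like the following sketch. The host, port, user, output directory, and file-size values are illustrative assumptions, not values from this commit:

{{< copyable "shell-regular" >}}

```shell
# Export the full data of a TiDB cluster to SQL files.
# 127.0.0.1:4000 and /tmp/backup are placeholders; adjust them to your cluster.
dumpling -h 127.0.0.1 -P 4000 -u root -o /tmp/backup -F 256MiB
```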

2 changes: 1 addition & 1 deletion br/backup-and-restore-faq.md
@@ -11,7 +11,7 @@ This document lists the frequently asked questions (FAQs) and the solutions about BR.

When you restore data, each node must have access to **all** backup files (SST files). By default, if `local` storage is used, you cannot restore data because the backup files are scattered among different nodes. Therefore, you have to copy the backup file of each TiKV node to the other TiKV nodes.

It is recommended to mount an NFS disk as a backup disk during backup. For details, see [Back up a single table to a network disk](/br/backup-and-restore-use-cases.md#back-up-a-single-table-to-a-network-disk-recommended).
It is recommended to mount an NFS disk as a backup disk during backup. For details, see [Back up a single table to a network disk](/br/backup-and-restore-use-cases.md#back-up-a-single-table-to-a-network-disk-recommended-in-production-environment).
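
As a rough illustration, mounting such an NFS path might look like the following sketch. The server address and the `/br_data` path are assumptions for illustration only:

{{< copyable "shell-regular" >}}

```shell
# Run on every TiKV node and on the node that runs BR,
# so that all of them see the same backup directory.
sudo mkdir -p /br_data
sudo mount -t nfs ${nfs_server_ip}:/br_data /br_data
```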

## How much does it affect the cluster during backup using BR?

632 changes: 59 additions & 573 deletions br/backup-and-restore-tool.md

Large diffs are not rendered by default.

77 changes: 42 additions & 35 deletions br/backup-and-restore-use-cases.md
@@ -5,24 +5,33 @@ summary: Learn the use cases of backing up and restoring data using BR.

# BR Use Cases

[Backup & Restore](/br/backup-and-restore-tool.md) (BR) is a command-line tool for distributed backup and restoration of the TiDB cluster data. This document describes the processes of operating BR in [four use cases](#use-cases) that aim to help you achieve the following goals:
[BR](/br/backup-and-restore-tool.md) is a tool for distributed backup and restoration of the TiDB cluster data.

This document describes how to run BR in the following use cases:

- Back up a single table to a network disk (recommended in production environment)
- Restore data from a network disk (recommended in production environment)
- Back up a single table to a local disk (recommended in testing environment)
- Restore data from a local disk (recommended in testing environment)

This document aims to help you achieve the following goals:

* Back up and restore data using a network disk or local disk correctly.
* Get the status of a backup or restoration operation through monitoring metrics.
* Learn how to tune performance during the operation.
* Troubleshoot the possible anomalies during the backup operation.

> **Note:**
>
> Pay attention to the [usage restrictions](/br/backup-and-restore-tool.md#usage-restrictions) before using BR.
## Audience

You are expected to have a basic understanding of [TiDB](https://docs.pingcap.com/tidb/v4.0) and [TiKV](https://tikv.org/). Before reading this document, it is recommended that you read [Use BR to Back up and Restore Data](/br/backup-and-restore-tool.md) first.
You are expected to have a basic understanding of [TiDB](https://docs.pingcap.com/tidb/v4.0) and [TiKV](https://tikv.org/).

Before reading on, make sure you have read [BR Tool Overview](/br/backup-and-restore-tool.md), especially [Usage Restrictions](/br/backup-and-restore-tool.md#usage-restrictions) and [Best Practices](/br/backup-and-restore-tool.md#best-practices).

## Prerequisites

This section introduces the recommended method of deploying TiDB, cluster versions, the hardware information of the TiKV cluster, and the cluster configuration for the use case demonstrations. You can estimate the performance of your backup or restoration operation based on your own hardware and configuration.
This section introduces the recommended method of deploying TiDB, cluster versions, the hardware information of the TiKV cluster, and the cluster configuration for the use case demonstrations.

You can estimate the performance of your backup or restoration operation based on your own hardware and configuration.

### Deployment method

@@ -56,24 +65,27 @@ BR directly sends commands to the TiKV cluster and is not dependent on the TiDB server.

## Use cases

This document describes the following four use cases:
This document describes the following use cases:

* [Back up a single table to a network disk (recommended)](#back-up-a-single-table-to-a-network-disk-recommended)
* [Restore data from a network disk (recommended)](#restore-data-from-a-network-disk-recommended)
* [Back up a single table to a local disk](#back-up-a-single-table-to-a-local-disk)
* [Restore data from a local disk](#restore-data-from-a-local-disk)
* [Back up a single table to a network disk (recommended in production environment)](#back-up-a-single-table-to-a-network-disk-recommended-in-production-environment)
* [Restore data from a network disk (recommended in production environment)](#restore-data-from-a-network-disk-recommended-in-production-environment)
* [Back up a single table to a local disk (recommended in testing environment)](#back-up-a-single-table-to-a-local-disk-recommended-in-testing-environment)
* [Restore data from a local disk (recommended in testing environment)](#restore-data-from-a-local-disk-recommended-in-testing-environment)

It is recommended that you use a network disk to back up and restore data. This spares you from collecting backup files and greatly improves backup efficiency, especially when the TiKV cluster is large in scale.

> **Note:**
>
> Before the backup or restoration operation, you need to do some preparations. See [Preparation for backup](#preparation-for-backup) and [Preparation for restoration](#preparation-for-restoration) for details.
Before the backup or restoration operations, you need to do some preparations:

- [Preparation for backup](#preparation-for-backup)
- [Preparation for restoration](#preparation-for-restoration)

### Preparation for backup

For the detailed usage of the `br backup` command, refer to [BR command-line description](/br/backup-and-restore-tool.md#command-line-description).
In TiDB v4.0.8 and later versions, BR supports the self-adaptive Garbage Collection (GC). To avoid configuring GC manually, you only need to register `backupTS` in `safePoint` in PD and make sure that `safePoint` does not move forward during the backup process.

In TiDB v4.0.7 and earlier versions, you need to manually configure GC before and after the BR backup through the following steps:

1. Before executing the `br backup` command, check the value of the [`tikv_gc_life_time`](/garbage-collection-configuration.md#tikv_gc_life_time) configuration item, and adjust the value appropriately in the MySQL client to make sure that [Garbage Collection](/garbage-collection-overview.md) (GC) does not run during the backup operation.
1. Before executing the [`br backup` command](/br/use-br-command-line-tool.md#br-command-line-description), check the value of the [`tikv_gc_life_time`](/garbage-collection-configuration.md#tikv_gc_life_time) configuration item, and adjust the value appropriately in the MySQL client to make sure that GC does not run during the backup operation.

{{< copyable "sql" >}}

@@ -90,24 +102,17 @@ For the detailed usage of the `br backup` command, refer to [BR command-line description](/br/backup-and-restore-tool.md#command-line-description).
UPDATE mysql.tidb SET VARIABLE_VALUE = '10m' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
```

> **Note:**
>
> Since v4.0.8, BR supports the self-adaptive GC. To avoid manually adjusting GC, register `backupTS` in `safePoint` in PD and make sure that `safePoint` does not move forward during the backup process.

### Preparation for restoration

For the detailed usage of the `br restore` command, refer to [BR command-line description](/br/backup-and-restore-tool.md#command-line-description).

> **Note:**
>
> Before executing the `br restore` command, check the new cluster to make sure that the table in the cluster does not have a duplicate name.
Before executing the [`br restore` command](/br/use-br-command-line-tool.md#br-command-line-description), check the new cluster to make sure that the table in the cluster does not have a duplicate name.

### Back up a single table to a network disk (recommended)
### Back up a single table to a network disk (recommended in production environment)

Use the `br backup` command to back up the data of a single table (`--db batchmark --table order_line`) to the specified path `local:///br_data` on the network disk.
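
A minimal sketch of such a backup command is shown below. The PD endpoint placeholder and the log-file name are illustrative assumptions, not values from this commit:

{{< copyable "shell-regular" >}}

```shell
# Back up one table to the NFS mount point shared by all TiKV nodes.
# ${PD_ADDR} is a placeholder for a PD endpoint, for example 172.16.5.198:2379.
br backup table \
    --db batchmark \
    --table order_line \
    -s local:///br_data \
    --pd "${PD_ADDR}" \
    --log-file backup-nfs.log
```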

#### Backup prerequisites

* [Preparation for backup](#preparation-for-backup)
* Configure a high-performance SSD hard disk host as the NFS server to store data, and all BR nodes and TiKV nodes as NFS clients. Mount the same path (for example, `/br_data`) to the NFS server for NFS clients to access the server.
* The total transfer rate between the NFS server and all NFS clients must reach at least `the number of TiKV instances * 150MB/s`. Otherwise, the network I/O might become the performance bottleneck.

@@ -226,13 +231,13 @@ The tuned performance results are as follows (with the same data size):
* Backup throughput: `avg speed(MB/s)` increased from `358.09` to `659.59`
* Throughput of a single TiKV instance: `avg speed(MB/s)/tikv_count` increased from `89` to `164.89`

### Restore data from a network disk (recommended)
### Restore data from a network disk (recommended in production environment)

Use the `br restore` command to restore the complete backup data to an offline cluster. Currently, BR does not support restoring data to an online cluster.
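
A minimal sketch of the corresponding restoration command, under the same illustrative assumptions about the PD endpoint and log-file name:

{{< copyable "shell-regular" >}}

```shell
# Restore the full backup from the NFS mount point.
br restore full \
    -s local:///br_data \
    --pd "${PD_ADDR}" \
    --log-file restore-nfs.log
```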

#### Restoration prerequisites

None
* [Preparation for restoration](#preparation-for-restoration)

#### Topology

@@ -333,12 +338,13 @@
+ Throughput of a single TiKV instance: `avg speed(MB/s)`/`tikv_count` increased from `91.8` to `199.1`
+ Average restoration speed of a single TiKV instance: `total size(MB)`/(`split time` + `restore time`)/`tikv_count` increased from `87.4` to `162.3`

### Back up a single table to a local disk
### Back up a single table to a local disk (recommended in testing environment)

Use the `br backup` command to back up the data of a single table (`--db batchmark --table order_line`) to the specified path `local:///home/tidb/backup_local` on the local disk.
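
The command shape is the same as in the network-disk case; only the `--storage` target changes. A sketch under the same placeholder assumptions:

{{< copyable "shell-regular" >}}

```shell
# Back up one table to a directory that exists on every TiKV node.
br backup table \
    --db batchmark \
    --table order_line \
    -s local:///home/tidb/backup_local \
    --pd "${PD_ADDR}" \
    --log-file backup-local.log
```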

#### Backup prerequisites

* [Preparation for backup](#preparation-for-backup)
* Each TiKV node has a separate disk to store the backupSST file.
* The `backup_endpoint` node has a separate disk to store the `backupmeta` file.
* TiKV and the `backup_endpoint` node must have the same directory for the backup (for example, `/home/tidb/backup_local`).
@@ -387,12 +393,13 @@ The information from the above log includes:

From the above information, the throughput of a single TiKV instance can be calculated: `avg speed(MB/s)`/`tikv_count` = `160`.

### Restore data from a local disk
### Restore data from a local disk (recommended in testing environment)

Use the `br restore` command to restore the complete backup data to an offline cluster. Currently, BR does not support restoring data to an online cluster.

#### Restoration prerequisites

* [Preparation for restoration](#preparation-for-restoration)
* The TiKV cluster and the backup data do not have a duplicate database or table. Currently, BR does not support table route.
* Each TiKV node has a separate disk to store the backupSST file.
* The `restore_endpoint` node has a separate disk to store the `backupmeta` file.
Expand Down Expand Up @@ -443,17 +450,17 @@ From the above information, the following items can be calculated:
* The throughput of a single TiKV instance: `avg speed(MB/s)`/`tikv_count` = `97.2`
* The average restoration speed of a single TiKV instance: `total size(MB)`/(`split time` + `restore time`)/`tikv_count` = `92.4`

### Error handling
## Error handling during backup

This section introduces the common errors that occur during the backup process.

#### `key locked Error` in the backup log
### `key locked Error` in the backup log

Error message in the log: `log - ["backup occur kv error"][error="{\"KvError\":{\"locked\":`

If a key is locked during the backup process, BR tries to resolve the lock. A small number of this error does not affect the correctness of the backup.
If a key is locked during the backup process, BR tries to resolve the lock. A small number of these errors do not affect the correctness of the backup.

#### Backup failure
### Backup failure

Error message in the log: `log - Error: msg:"Io(Custom { kind: AlreadyExists, error: \"[5_5359_42_123_default.sst] is already exists in /dir/backup_local/\" })"`
