add import data best practice (#142)
* add import data best practice
IANTHEREAL authored Jul 25, 2017
1 parent fe91c1f commit 1fe3d2d
Showing 1 changed file with 26 additions and 6 deletions.
32 changes: 26 additions & 6 deletions op-guide/migration.md
You can use `mydumper` to export data from MySQL and `loader` to import the data into TiDB.
Use the `mydumper` tool to export data from MySQL by using the following command:

```bash
./bin/mydumper -h 127.0.0.1 -P 3306 -u root -t 16 -F 64 -B test -T t1,t2 --skip-tz-utc -o ./var/test
```
In this command,
+ `-B test`: means the data is exported from the `test` database.
+ `-T t1,t2`: means only the `t1` and `t2` tables are exported.
+ `-t 16`: means 16 threads are used to export the data.
+ `-F 64`: means a table is partitioned into chunks and one chunk is 64MB.
+ `--skip-tz-utc`: means any inconsistency in the time zone settings between MySQL and the data exporting machine is ignored, and automatic time zone conversion is disabled.
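
As a quick sanity check after the export finishes (an illustrative sketch, not part of the original guide), you can confirm that no exported chunk greatly exceeds the `-F` size:

```bash
# List the exported files by size; with -F 64, each data chunk in the
# -o output directory should be at most roughly 64MB.
ls -lh ./var/test | sort -k 5 -h | tail
```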

**Note:**
On the Cloud platforms which require the `super privilege`, such as on the Aliyun
Use the `loader` tool to import the data from MySQL to TiDB. See [Loader instructions](./tools/loader.md) for more information.
```bash
./bin/loader -h 127.0.0.1 -u root -P 4000 -t 32 -d ./var/test
```
After the data is imported, you can view the data in TiDB using the MySQL client:
```bash
mysql> select * from t2;
+----+------+
```
### 3. Best practice
To migrate data quickly, especially for a huge amount of data, you can refer to the following recommendations.

- Make each exported data file as small as possible; it is recommended to keep it within 64MB. You can use the `-F` parameter to set the value.
- The value of the `-t` parameter of `loader` can be evaluated according to the number and the load of the TiKV instances. For example, if there are three TiKV instances, `-t` can be set to 3 * (1 ~ n). If the load of TiKV is too high and the log `backoffer.maxSleep 15000ms is exceeded` appears many times, decrease the value of `-t`; otherwise, increase it. A combined example follows this list.
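
For example, for a hypothetical cluster with three TiKV instances, the two recommendations might be combined as follows. The addresses, database name, and the choice of 3 * 8 = 24 threads are illustrative assumptions, not fixed values:

```bash
# Export with chunks capped at 64MB (-F 64), as recommended above.
./bin/mydumper -h 127.0.0.1 -P 3306 -u root -t 16 -F 64 -B test --skip-tz-utc -o ./var/test

# Import with -t sized as 3 TiKV instances * 8 = 24. Decrease -t if
# "backoffer.maxSleep 15000ms is exceeded" appears repeatedly in the logs.
./bin/loader -h 127.0.0.1 -u root -P 4000 -t 24 -d ./var/test
```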

### A sample and its configuration

- The total size of the exported files is 214G. A single table has 8 columns and 2 billion rows.
- The cluster topology:
  - 12 TiKV instances: 4 nodes, 3 TiKV instances per node
  - 4 TiDB instances
  - 3 PD instances
- The configuration of each node:
  - CPU: Intel Xeon E5-2670 v3 @ 2.30GHz, 48 vCPU [2 x 12 physical cores]
  - Memory: 128G
  - Disk: sda [RAID 10, 300G], sdb [RAID 5, 2T]
  - OS: CentOS 7.2
- The `-F` parameter of `mydumper` is set to 16 and the `-t` parameter of `loader` is set to 64.
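
Under these settings, the export and import commands would look roughly like the following sketch; the host addresses, database name, and paths are assumptions carried over from the earlier examples, not details recorded in the test:

```bash
# Sample run: mydumper with -F 16 and loader with -t 64, matching the
# configuration above. Hosts, database, and paths are illustrative.
./bin/mydumper -h 127.0.0.1 -P 3306 -u root -t 16 -F 16 -B test --skip-tz-utc -o ./var/test
./bin/loader -h 127.0.0.1 -u root -P 4000 -t 64 -d ./var/test
```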

**Results**: It takes 11 hours to import all the data, which is 19.4G per hour.

## Step 3. (Optional) Using the `syncer` tool to import data incrementally

The previous section introduces how to import all the historical data from MySQL into TiDB using `mydumper`/`loader`. But this approach is not applicable if the data in MySQL continues to be updated after the migration and you need to import the updates quickly.
