add import data best practice (#142)
* add import data best practice
IANTHEREAL authored Jul 25, 2017
1 parent fe91c1f commit 1fe3d2d
Showing 1 changed file with 26 additions and 6 deletions.
32 changes: 26 additions & 6 deletions op-guide/migration.md
You can use `mydumper` to export data from MySQL and `loader` to import the data into TiDB.
Use the `mydumper` tool to export data from MySQL by using the following command:

```bash
./bin/mydumper -h 127.0.0.1 -P 3306 -u root -t 16 -F 64 -B test -T t1,t2 --skip-tz-utc -o ./var/test
```
In this command,
+ `-B test`: means the data is exported from the `test` database.
+ `-T t1,t2`: means only the `t1` and `t2` tables are exported.
+ `-t 16`: means 16 threads are used to export the data.
+ `-F 64`: means a table is partitioned into chunks and one chunk is 64MB.
+ `--skip-tz-utc`: means any inconsistency in the time zone settings between MySQL and the data exporting machine is ignored, and automatic time zone conversion is disabled.
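
As a quick sanity check after the export finishes (an illustrative sketch, not part of the original guide), you can confirm that no exported chunk greatly exceeds the `-F` size:

```bash
# List the exported files by size; with -F 64, each data chunk in the
# -o output directory should be at most roughly 64MB.
ls -lh ./var/test | sort -k 5 -h | tail
```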

**Note:**
On the Cloud platforms which require the `super privilege`, such as on the Aliyun
Use the `loader` tool to import the data from MySQL to TiDB. See [Loader instructions](./tools/loader.md) for more information.
```bash
./bin/loader -h 127.0.0.1 -u root -P 4000 -t 32 -d ./var/test
```
After the data is imported, you can view the data in TiDB using the MySQL client:
```bash
mysql> select * from t2;
+----+------+
```
### 3. Best practice
To migrate data quickly, especially for a huge amount of data, you can refer to the following recommendations.

- Make each exported data file as small as possible; it is recommended to keep it within 64MB. You can use the `-F` parameter to set the value.
- The value of the `-t` parameter of `loader` can be evaluated according to the number and the load of the TiKV instances. For example, if there are three TiKV instances, `-t` can be set to 3 * (1 ~ n). If the load of TiKV is too high and the log `backoffer.maxSleep 15000ms is exceeded` appears many times, decrease the value of `-t`; otherwise, increase it. A combined example follows this list.
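
For example, for a hypothetical cluster with three TiKV instances, the two recommendations might be combined as follows. The addresses, database name, and the choice of 3 * 8 = 24 threads are illustrative assumptions, not fixed values:

```bash
# Export with chunks capped at 64MB (-F 64), as recommended above.
./bin/mydumper -h 127.0.0.1 -P 3306 -u root -t 16 -F 64 -B test --skip-tz-utc -o ./var/test

# Import with -t sized as 3 TiKV instances * 8 = 24. Decrease -t if
# "backoffer.maxSleep 15000ms is exceeded" appears repeatedly in the logs.
./bin/loader -h 127.0.0.1 -u root -P 4000 -t 24 -d ./var/test
```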

### A sample and its configuration

- The total size of the exported files is 214G. A single table has 8 columns and 2 billion rows.
- The cluster topology:
  - 12 TiKV instances: 4 nodes, 3 TiKV instances per node
  - 4 TiDB instances
  - 3 PD instances
- The configuration of each node:
  - CPU: Intel Xeon E5-2670 v3 @ 2.30GHz, 48 vCPU [2 x 12 physical cores]
  - Memory: 128G
  - Disk: sda [RAID 10, 300G], sdb [RAID 5, 2T]
  - OS: CentOS 7.2
- The `-F` parameter of `mydumper` is set to 16 and the `-t` parameter of `loader` is set to 64.
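
Under these settings, the export and import commands would look roughly like the following sketch; the host addresses, database name, and paths are assumptions carried over from the earlier examples, not details recorded in the test:

```bash
# Sample run: mydumper with -F 16 and loader with -t 64, matching the
# configuration above. Hosts, database, and paths are illustrative.
./bin/mydumper -h 127.0.0.1 -P 3306 -u root -t 16 -F 16 -B test --skip-tz-utc -o ./var/test
./bin/loader -h 127.0.0.1 -u root -P 4000 -t 64 -d ./var/test
```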

**Results**: It takes 11 hours to import all the data, which is 19.4G per hour.

## Step 3. (Optional) Using the `syncer` tool to import data incrementally

The previous section introduces how to import all the historical data from MySQL into TiDB using `mydumper`/`loader`. But this approach is not applicable if the data in MySQL continues to be updated after the migration and you need to import the updates quickly.
