From 66fce1409bdb1a976e09f755e4d75453b5f8be35 Mon Sep 17 00:00:00 2001
From: ti-srebot <66930949+ti-srebot@users.noreply.github.com>
Date: Wed, 17 Jun 2020 20:32:24 +0800
Subject: [PATCH] cherry pick #3655 to release-4.0 (#3698)

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

Co-authored-by: WangXiangUSTC <wx347249478@gmail.com>
---
 TOC.md                                        |  1 +
 loader-overview.md                            | 34 +++++++++++++++++++
 .../get-started-with-tidb-binlog.md           |  1 -
 3 files changed, 35 insertions(+), 1 deletion(-)
 rename get-started-with-tidb-binlog.md => tidb-binlog/get-started-with-tidb-binlog.md (99%)

diff --git a/TOC.md b/TOC.md
index e9ad011e86d3..1b79f2df1f7f 100644
--- a/TOC.md
+++ b/TOC.md
@@ -153,6 +153,7 @@
     + [BR 备份与恢复场景示例](/br/backup-and-restore-use-cases.md)
   + TiDB Binlog
     + [概述](/tidb-binlog/tidb-binlog-overview.md)
+    + [Quick Start](/tidb-binlog/get-started-with-tidb-binlog.md)
     + [部署使用](/tidb-binlog/deploy-tidb-binlog.md)
     + [运维管理](/tidb-binlog/maintain-tidb-binlog-cluster.md)
     + [配置说明](/tidb-binlog/tidb-binlog-configuration-file.md)
diff --git a/loader-overview.md b/loader-overview.md
index 236d3ce92e19..889b48b836dc 100644
--- a/loader-overview.md
+++ b/loader-overview.md
@@ -157,3 +157,37 @@ pattern-table = "table_*"
 target-schema = "example_db"
 target-table = "table"
 ```
+
+### Error during full import: `packet for query is too large. Try adjusting the 'max_allowed_packet' variable`
+
+#### Causes
+
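+* Both the MySQL client and the MySQL/TiDB server enforce a `max_allowed_packet` limit. If either limit is exceeded, the client program receives this error. In the latest versions of Loader and TiDB Server, the default `max_allowed_packet` is `64M`.
+
+    * Use the latest version, or the latest stable version, of the tools. See the [download page](/download-ecosystem-tools.md).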
+
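+* Loader's full data import module does not support splitting dump SQL files, because Mydumper uses the simplest possible encoding implementation, as its code comment `/* Poor man's data dump code */` puts it. Splitting files in Loader would require a complete parser built on top of `TiDB parser` to split the data correctly, which would bring the following problems:
+
+    * A large amount of work
+
+    * High complexity, which makes correctness hard to guarantee
+
+    * A significant drop in performance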
+
+#### Solutions
+
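+* For the reasons above, this problem cannot be solved easily in code. The recommended approach is to use Mydumper's option for controlling the size of each `Insert Statement`, `-s, --statement-size`: `Attempted size of INSERT statement in bytes, default 1000000`.
+
+    With the default `--statement-size`, Mydumper tries to keep each generated `Insert Statement` close to `1M`, so the default value avoids this problem in the vast majority of cases.
+
+    The following `WARN` log sometimes appears during the dump. This warning does not affect the dump process; it only indicates that the table being dumped may be a wide table.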
+
+    ```
+    Row bigger than statement_size for xxx
+    ```
+
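+* If a single row of a wide table exceeds `64M`, modify the following two settings and make sure they take effect.
+
+    * On the TiDB server, execute `set @@global.max_allowed_packet=134217728` (`134217728 = 128M`).
+
+    * Depending on your situation, add `max-allowed-packet=128M` to the db configuration in Loader's configuration file, and then restart the process or task.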
diff --git a/get-started-with-tidb-binlog.md b/tidb-binlog/get-started-with-tidb-binlog.md
similarity index 99%
rename from get-started-with-tidb-binlog.md
rename to tidb-binlog/get-started-with-tidb-binlog.md
index aa96c9e0b764..ee1571ac97b1 100644
--- a/get-started-with-tidb-binlog.md
+++ b/tidb-binlog/get-started-with-tidb-binlog.md
@@ -519,4 +519,3 @@ sleep 3
 本文档介绍了如何通过设置 TiDB Binlog，使用单个 Pump 和 Drainer 组成的集群同步 TiDB 集群数据到下游的 MariaDB。可以发现，TiDB Binlog 是用于获取处理 TiDB 集群中更新数据的综合性平台工具。
 
 在更稳健的开发、测试或生产部署环境中，可以使用多个 TiDB 服务以实现高可用性和扩展性。使用多个 Pump 实例可以避免 Pump 集群中的问题影响发送到 TiDB 实例的应用流量。或者可以使用增加的 Drainer 实例同步数据到不同的下游或实现数据增量备份。
-binlog
\ No newline at end of file