From e65596063d6bc8c230b9cddb1e2c5aee0ff6f0ec Mon Sep 17 00:00:00 2001
From: JoyinQ <56883733+Joyinqin@users.noreply.github.com>
Date: Thu, 25 Feb 2021 15:09:08 +0800
Subject: [PATCH] tools: update tools faq (#4828)

* tools: update tools faq

* refine the docs

* Update tidb-binlog-faq.md

* Update tidb-binlog-faq.md

* Apply suggestions from code review

Co-authored-by: glorv

* Update tidb-lightning-faq.md

* Apply suggestions from code review

Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>

Co-authored-by: glorv
Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>
---
 tidb-binlog/tidb-binlog-faq.md       | 24 +++++++++++++++++++++++-
 tidb-lightning/tidb-lightning-faq.md | 15 +++++++++++++++
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/tidb-binlog/tidb-binlog-faq.md b/tidb-binlog/tidb-binlog-faq.md
index e66ca4731f029..a8fbdcdb015e0 100644
--- a/tidb-binlog/tidb-binlog-faq.md
+++ b/tidb-binlog/tidb-binlog-faq.md
@@ -128,7 +128,7 @@ If the data in the downstream is not affected, you can redeploy Drainer on the n

 ## How to redeploy Drainer when enabling `ignore-error` in Primary-Secondary replication triggers a critical error?

-If a critical error is trigged when TiDB fails to write binlog after enabling `ignore-error`, TiDB stops writing binlog and binlog data loss occurs. To resume replication, perform the following steps:
+If a critical error is triggered when TiDB fails to write binlog after enabling `ignore-error`, TiDB stops writing binlog and binlog data loss occurs. To resume replication, perform the following steps:

 1. Stop the Drainer instance.

@@ -248,3 +248,25 @@ To solve the problem, follow these steps:
     ```

 4. Modify the `drainer.toml` configuration file. Add the `commit-ts` in the `ignore-txn-commit-ts` item and restart the Drainer node.
+
+## TiDB fails to write binlog and gets stuck, and `listener stopped, waiting for manual stop` appears in the log
+
+In TiDB v3.0.12 and earlier versions, a binlog write failure causes TiDB to report a fatal error. TiDB does not exit automatically but only stops the service, which makes TiDB appear stuck. You can see the `listener stopped, waiting for manual stop` error in the log.
+
+You need to determine the specific cause of the binlog write failure. If the failure occurs because binlog is written to the downstream slowly, you can consider scaling out Pump or increasing the timeout for writing binlog.
+
+Since v3.0.13, the error-reporting logic is optimized: a binlog write failure causes the transaction execution to fail, and TiDB Binlog returns an error but does not get TiDB stuck.
+
+## TiDB writes duplicate binlogs to Pump
+
+This issue does not affect the downstream or the replication logic.
+
+When a binlog write fails or times out, TiDB retries writing the binlog to the next available Pump node until the write succeeds. Therefore, if the binlog write to a Pump node is slow and causes a TiDB timeout (default: 15s), TiDB determines that the write has failed and tries the next Pump node. If the binlog is actually written to the timed-out Pump node successfully, the same binlog is written to multiple Pump nodes. When Drainer processes the binlog, it automatically de-duplicates binlogs with the same TSO, so this duplicate write affects neither the downstream nor the replication logic.
+
+## Reparo is interrupted during the full and incremental restore process. Can I use the last TSO in the log to resume replication?
+
+Yes. Reparo does not automatically enable the safe mode when you start it. You need to perform the following steps manually:
+
+1. After Reparo is interrupted, record the last TSO in the log as `checkpoint-tso`.
+2. Modify the Reparo configuration file: set the configuration item `start-tso` to `checkpoint-tso + 1`, set `stop-tso` to `checkpoint-tso + 80,000,000,000` (approximately five minutes after `checkpoint-tso`), and set `safe-mode` to `true`. Start Reparo; Reparo replicates data up to `stop-tso` and then stops automatically.
+3. After Reparo stops automatically, set `start-tso` to `checkpoint-tso + 80,000,000,001`, set `stop-tso` to `0`, and set `safe-mode` to `false`. Start Reparo again to resume replication.
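For illustration only (not part of the patch above), the following is a minimal sketch of the two Reparo configuration phases described in the steps above. It assumes the recorded `checkpoint-tso` is `400000000000000000`; that value is a placeholder, and only the three items named in the steps are shown, so keep the rest of your existing Reparo configuration unchanged.

```toml
# Phase 1 (step 2): replay from the interruption point with the safe mode enabled.
# Assumes checkpoint-tso = 400000000000000000 (illustrative value only).
start-tso = 400000000000000001   # checkpoint-tso + 1
stop-tso  = 400000080000000000   # checkpoint-tso + 80,000,000,000 (about five minutes later)
safe-mode = true

# Phase 2 (step 3): after Reparo stops automatically, switch to these values and restart it.
# start-tso = 400000080000000001  # checkpoint-tso + 80,000,000,001
# stop-tso  = 0                   # no stop point, as described in step 3
# safe-mode = false
```

Keeping phase 2 as comments in the same file avoids duplicate keys; uncomment and remove the phase 1 values when you switch over.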
diff --git a/tidb-lightning/tidb-lightning-faq.md b/tidb-lightning/tidb-lightning-faq.md
index 9b8c94525ca29..3deb31a2246a6 100644
--- a/tidb-lightning/tidb-lightning-faq.md
+++ b/tidb-lightning/tidb-lightning-faq.md
@@ -333,3 +333,18 @@ Currently, the limitation of TiDB cannot be bypassed. You can only ignore this t

 - If there are TiFlash nodes in the cluster, you can update the cluster to `v4.0.0-rc.2` or higher versions.
 - Temporarily disable TiFlash if you do not want to upgrade the cluster.
+
+## `tidb lightning encountered error: TiDB version too old, expected '>=4.0.0', found '3.0.18'`
+
+The Local-backend of TiDB Lightning only supports importing data into TiDB clusters of v4.0.0 and later versions. If you try to use the Local-backend to import data into a v2.x or v3.x cluster, the above error is reported. In this case, you can modify the configuration to use the Importer-backend or the TiDB-backend for the import.
+
+The version numbers of some `nightly` builds might look similar to v4.0.0-beta.2, but these `nightly` versions of TiDB Lightning actually support the Local-backend. If you encounter this error when using a `nightly` version, you can skip the version check by setting `check-requirements = false` in the configuration. Before setting this parameter, make sure that the TiDB Lightning configuration supports the corresponding version; otherwise, the import might fail.
+
+## `restore table test.district failed: unknown columns in header [...]`
+
+This error usually occurs because the CSV data file does not contain a header (the first row contains data rather than column names). Therefore, you need to add the following configuration to the TiDB Lightning configuration file:
+
+```
+[mydumper.csv]
+header = false
+```
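As a related sketch (again, not part of the patch above), `header` is often set together with the other `[mydumper.csv]` options that describe the file format. The values below are illustrative for a comma-separated, double-quoted file without a header row; verify each option against the CSV format of your data and the documentation of the TiDB Lightning version you use.

```toml
[mydumper.csv]
# Field separator and quoting character used in the CSV files.
separator = ','
delimiter = '"'
# The first row contains data, not column names.
header = false
# How NULL values are represented in the files.
not-null = false
null = '\N'
# Whether backslash escape sequences inside fields are interpreted.
backslash-escape = true
```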