update sync_diff_inspector: sync_diff_inspector v2.0 (pingcap#6774)
Liuxiaozhen12 authored Nov 16, 2021
1 parent c9f4746 commit fb5cc29
Showing 7 changed files with 363 additions and 397 deletions.
3 changes: 2 additions & 1 deletion TOC.md
@@ -223,8 +223,9 @@
+ sync-diff-inspector
+ [Overview](/sync-diff-inspector/sync-diff-inspector-overview.md)
+ [Data Check for Tables with Different Schema/Table Names](/sync-diff-inspector/route-diff.md)
+ [Data Check in Sharding Scenarios](/sync-diff-inspector/shard-diff.md)
+ [Data Check in the Sharding Scenario](/sync-diff-inspector/shard-diff.md)
+ [Data Check for TiDB Upstream/Downstream Clusters](/sync-diff-inspector/upstream-downstream-diff.md)
+ [Data Check in the DM Replication Scenario](/sync-diff-inspector/dm-diff.md)
+ TiSpark
+ [Quick Start](/get-started-with-tispark.md)
+ [User Guide](/tispark-overview.md)
Binary file modified media/shard-table-replica-1.png
40 changes: 40 additions & 0 deletions sync-diff-inspector/dm-diff.md
@@ -0,0 +1,40 @@
---
title: Data Check in the DM Replication Scenario
summary: Learn about how to set a specific `task-name` configuration from `DM-master` to perform a data check.
---

# Data Check in the DM Replication Scenario

When using replication tools such as [TiDB Data Migration](https://docs.pingcap.com/tidb-data-migration/stable/overview), you need to check the data consistency before and after the replication process. You can set a specific `task-name` configuration from `DM-master` to perform a data check.

The following is a simple configuration example. To learn the complete configuration, refer to [Sync-diff-inspector User Guide](/sync-diff-inspector/sync-diff-inspector-overview.md).

```toml
# Diff Configuration.

######################### Global config #########################

# The number of goroutines created to check data. The number of connections to the upstream and downstream databases is slightly greater than this value.
check-thread-count = 4

# If enabled, SQL statements are exported to fix inconsistent tables.
export-fix-sql = true

# Only compares the table structure instead of the data.
check-struct-only = false

# The address of DM-master, in the format "http://127.0.0.1:8261".
dm-addr = "http://127.0.0.1:8261"

# Specifies the `task-name` of DM.
dm-task = "test"

######################### Task config #########################
[task]
output-dir = "./output"

# The tables of downstream databases to be compared. Each table needs to contain the schema name and the table name, separated by '.'
target-check-tables = ["hb_test.*"]
```

In this example, `dm-task = "test"` tells sync-diff-inspector to check all the tables in the `hb_test` schema under the DM task `test`. It automatically obtains the schema matching rules between the upstream and downstream databases, and uses them to verify data consistency after DM replication.
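
As a rough illustration of what a pattern such as `hb_test.*` in `target-check-tables` selects, the following Python sketch uses shell-style wildcard matching as a stand-in. It is not sync-diff-inspector's own matching logic, whose rules may differ, and the table names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_tables(patterns, tables):
    """Return the "schema.table" names matched by at least one pattern."""
    return [t for t in tables if any(fnmatchcase(t, p) for p in patterns)]


# Hypothetical downstream tables, for illustration only.
downstream = ["hb_test.orders", "hb_test.users", "other_db.orders"]

# Same pattern as `target-check-tables` in the example configuration.
selected = select_tables(["hb_test.*"], downstream)
print(selected)
```

Under this assumption, only the tables in the `hb_test` schema are selected for the check.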
92 changes: 46 additions & 46 deletions sync-diff-inspector/route-diff.md
@@ -6,58 +6,58 @@ aliases: ['/docs/dev/sync-diff-inspector/route-diff/','/docs/dev/reference/tools

# Data Check for Tables with Different Schema or Table Names

When using replication tools such as TiDB Data Migration, you can set `route-rules` to replicate data to a specified table in the downstream. sync-diff-inspector enables you to verify tables with different schema names or table names.
When using replication tools such as [TiDB Data Migration](https://docs.pingcap.com/tidb-data-migration/stable/overview), you can set `route-rules` to replicate data to a specified table in the downstream. sync-diff-inspector enables you to verify tables with different schema names or table names by setting `rules`.

Below is a simple example.
The following is a simple configuration example. To learn the complete configuration, refer to [Sync-diff-inspector User Guide](/sync-diff-inspector/sync-diff-inspector-overview.md).

```toml
######################### Tables config #########################

# Configure the tables of the target database that need to be checked
[[check-tables]]
# The name of the schema in the target database
schema = "test_2"

# The table that needs to be checked
tables = ["t_2"]

# Configuration example of comparing two tables with different schema names and table names
[[table-config]]
# The name of the schema in the target database
schema = "test_2"

# The name of the target table
table = "t_2"

# Configuration of the source data
[[table-config.source-tables]]
# The instance ID of the source schema
instance-id = "source-1"
# The name of the source schema
schema = "test_1"
# The name of the source table
table = "t_1"
######################### Datasource config #########################
[data-sources.mysql1]
host = "127.0.0.1"
port = 3306
user = "root"
password = ""
route-rules = ["rule1"]

[data-sources.tidb0]
host = "127.0.0.1"
port = 4000
user = "root"
password = ""
########################### Routes ###########################
[routes.rule1]
schema-pattern = "test_1" # Matches the schema name of the data source. Supports the wildcards "*" and "?"
table-pattern = "t_1" # Matches the table name of the data source. Supports the wildcards "*" and "?"
target-schema = "test_2" # The name of the schema in the target database
target-table = "t_2" # The name of the target table
```

This configuration can be used to check `test_2.t_2` in the downstream and `test_1.t_1` in the `source-1` instance.
This configuration can be used to check `test_2.t_2` in the downstream against `test_1.t_1` in the `mysql1` instance.

To check a large number of tables with different schema names or table names, you can simplify the configuration by setting the mapping relationship by using `table-rule`. You can configure the mapping relationship of either schema or table, or of both. For example, all the tables in the upstream `test_1` database are replicated to the downstream `test_2` database, which can be checked through the following configuration:
To check a large number of tables with different schema names or table names, you can simplify the configuration by using `rules` to set the mapping relationship. You can map the schema, the table, or both. For example, if all the tables in the upstream `test_1` database are replicated to the downstream `test_2` database, you can check them with the following configuration:

```toml
######################### Tables config #########################

# Configures the tables of the target database that need to be checked
[[check-tables]]
# The name of the schema in the target database
schema = "test_2"

# Check all the tables
tables = ["~^"]

[[table-rules]]
# schema-pattern and table-pattern support the wildcards "*" and "?"
schema-pattern = "test_1"
#table-pattern = ""
target-schema = "test_2"
#target-table = ""
######################### Datasource config #########################
[data-sources.mysql1]
host = "127.0.0.1"
port = 3306
user = "root"
password = ""
route-rules = ["rule1"]

[data-sources.tidb0]
host = "127.0.0.1"
port = 4000
user = "root"
password = ""
########################### Routes ###########################
[routes.rule1]
schema-pattern = "test_1" # Matches the schema name of the data source. Supports the wildcards "*" and "?"
target-schema = "test_2" # The name of the schema in the target database
```
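
The mapping that a route rule describes can be sketched in Python as follows. This is a minimal illustration, not sync-diff-inspector's implementation: the `route` helper is hypothetical, wildcard matching is approximated with `fnmatch`, and it assumes that a rule without `table-pattern` and `target-table` renames only the schema.

```python
from fnmatch import fnmatchcase


def route(rule, schema, table):
    """Map a source (schema, table) to its target name, or return it unchanged."""
    if not fnmatchcase(schema, rule["schema-pattern"]):
        return schema, table
    table_pattern = rule.get("table-pattern")
    if table_pattern is not None and not fnmatchcase(table, table_pattern):
        return schema, table
    # Keep the source table name when the rule sets no target table.
    return rule["target-schema"], rule.get("target-table", table)


# Table-level rule: the same values as the first example configuration.
table_rule = {"schema-pattern": "test_1", "table-pattern": "t_1",
              "target-schema": "test_2", "target-table": "t_2"}
# Assumption: a schema-level rule omits the table keys.
schema_rule = {"schema-pattern": "test_1", "target-schema": "test_2"}

print(route(table_rule, "test_1", "t_1"))      # ('test_2', 't_2')
print(route(schema_rule, "test_1", "orders"))  # ('test_2', 'orders')
print(route(table_rule, "other", "t_1"))       # ('other', 't_1')
```

In this sketch, a table that matches no rule keeps its own name, so it is compared with the target table of the same name.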

## Note

If `test_2`.`t_2` also exists in the upstream database, that table is compared with the downstream table as well.