ticdc: refine some descriptions (#11651)
shichun-0415 authored Jan 11, 2023
1 parent 914e2c5 commit f56542e
Showing 17 changed files with 74 additions and 63 deletions.
4 changes: 2 additions & 2 deletions migrate-from-tidb-to-mysql.md
Original file line number Diff line number Diff line change
@@ -158,12 +158,12 @@ After setting up the environment, you can use [Dumpling](/dumpling-overview.md)
In the upstream cluster, run the following command to create a changefeed from the upstream to the downstream clusters:

```shell
tiup ctl:<cluster-version> cdc changefeed create --pd=http://127.0.0.1:2379 --sink-uri="mysql://root:@127.0.0.1:3306" --changefeed-id="upstream-to-downstream" --start-ts="434217889191428107"
tiup ctl:<cluster-version> cdc changefeed create --server=http://127.0.0.1:8300 --sink-uri="mysql://root:@127.0.0.1:3306" --changefeed-id="upstream-to-downstream" --start-ts="434217889191428107"
```

In this command, the parameters are as follows:

- `--pd`: PD address of the upstream cluster
- `--server`: IP address of any node in the TiCDC cluster
- `--sink-uri`: URI of the downstream cluster
- `--changefeed-id`: changefeed ID, must be in the format of a regular expression, `^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$`
- `--start-ts`: start timestamp of the changefeed, must be the backup time (or BackupTS in the "Back up data" section in [Step 2. Migrate full data](#step-2-migrate-full-data))
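The `--changefeed-id` pattern can be checked locally before running `changefeed create`. The following is a minimal sketch, not part of TiCDC: the helper name `is_valid_changefeed_id` is hypothetical, and the pattern is written with an unescaped hyphen because `grep -E` treats `-` outside a bracket expression as a literal character.

```shell
# Hypothetical helper (not a TiCDC command): succeed only if the ID matches
# the documented pattern ^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$.
is_valid_changefeed_id() {
  printf '%s\n' "$1" | grep -Eq '^[a-zA-Z0-9]+(-[a-zA-Z0-9]+)*$'
}

# Try a few candidate IDs: one well-formed, one with a space, one with a
# leading hyphen.
for id in "upstream-to-downstream" "bad id" "-leading-hyphen"; do
  if is_valid_changefeed_id "$id"; then
    echo "valid:   $id"
  else
    echo "invalid: $id"
  fi
done
```

An ID that contains a space or starts or ends with a hyphen fails the check, because every hyphen must sit between two alphanumeric runs.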
8 changes: 4 additions & 4 deletions migrate-from-tidb-to-tidb.md
@@ -219,12 +219,12 @@ After setting up the environment, you can use the backup and restore functions o
{{< copyable "shell-regular" >}}

```shell
tiup cdc cli changefeed create --pd=http://172.16.6.122:2379 --sink-uri="mysql://root:@172.16.6.125:4000" --changefeed-id="upstream-to-downstream" --start-ts="431434047157698561"
tiup cdc cli changefeed create --server=http://172.16.6.122:8300 --sink-uri="mysql://root:@172.16.6.125:4000" --changefeed-id="upstream-to-downstream" --start-ts="431434047157698561"
```

In this command, the parameters are as follows:

- `--pd`: PD address of the upstream cluster
- `--server`: IP address of any node in the TiCDC cluster
- `--sink-uri`: URI of the downstream cluster
- `--changefeed-id`: changefeed ID, must be in the format of a regular expression, `^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$`
- `--start-ts`: start timestamp of the changefeed, must be the backup time (or BackupTS in the "Back up data" section in [Step 2. Migrate full data](#step-2-migrate-full-data))
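A `--start-ts` value is a TiDB timestamp (TSO), which packs a physical time in milliseconds into the high bits and an 18-bit logical counter into the low bits. As a sanity check that a BackupTS corresponds to the expected backup time, the value can be split with shell arithmetic. This is a sketch that assumes a shell with 64-bit integer arithmetic; the function names are hypothetical.

```shell
# Hypothetical helpers: split a TSO into its physical and logical parts.
# Assumes the shell evaluates $(( )) with 64-bit integers.
tso_physical_ms() { echo $(( $1 >> 18 )); }       # Unix time in milliseconds
tso_logical()     { echo $(( $1 & 262143 )); }    # 262143 = 2^18 - 1

start_ts=431434047157698561
ms=$(tso_physical_ms "$start_ts")
echo "physical: ${ms} ms since the Unix epoch ($(( ms / 1000 )) s)"
echo "logical:  $(tso_logical "$start_ts")"
```

The seconds value can then be passed to a date utility of your choice to get a human-readable time.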
@@ -268,7 +268,7 @@ After creating a changefeed, data written to the upstream cluster is replicated

```shell
# Stop the changefeed from the upstream cluster to the downstream cluster
tiup cdc cli changefeed pause -c "upstream-to-downstream" --pd=http://172.16.6.122:2379
tiup cdc cli changefeed pause -c "upstream-to-downstream" --server=http://172.16.6.122:8300
# View the changefeed status
tiup cdc cli changefeed list
@@ -291,7 +291,7 @@ After creating a changefeed, data written to the upstream cluster is replicated
2. Create a changefeed from downstream to upstream. You can leave `start-ts` unspecified so as to use the default setting, because the upstream and downstream data are consistent and there is no new data written to the cluster.

```shell
tiup cdc cli changefeed create --pd=http://172.16.6.125:2379 --sink-uri="mysql://root:@172.16.6.122:4000" --changefeed-id="downstream-to-upstream"
tiup cdc cli changefeed create --server=http://172.16.6.125:8300 --sink-uri="mysql://root:@172.16.6.122:4000" --changefeed-id="downstream-to-upstream"
```

3. After migrating writing services to the downstream cluster, observe for a period. If the downstream cluster is stable, you can discard the upstream cluster.
6 changes: 3 additions & 3 deletions replicate-between-primary-and-secondary-clusters.md
@@ -233,12 +233,12 @@ After setting up the environment, you can use the backup and restore functions o
In the upstream cluster, run the following command to create a changefeed from the upstream to the downstream clusters:

```shell
tiup cdc cli changefeed create --pd=http://172.16.6.122:2379 --sink-uri="mysql://root:@172.16.6.125:4000" --changefeed-id="primary-to-secondary" --start-ts="431434047157698561"
tiup cdc cli changefeed create --server=http://172.16.6.122:8300 --sink-uri="mysql://root:@172.16.6.125:4000" --changefeed-id="primary-to-secondary" --start-ts="431434047157698561"
```

In this command, the parameters are as follows:

- `--pd`: PD address of the upstream cluster
- `--server`: IP address of any node in the TiCDC cluster
- `--sink-uri`: URI of the downstream cluster
- `--start-ts`: start timestamp of the changefeed, must be the backup time (or BackupTS mentioned in [Step 2. Migrate full data](#step-2-migrate-full-data))

@@ -312,5 +312,5 @@ After the previous step, the downstream (secondary) cluster has data that is con

```shell
# Create a changefeed
tiup cdc cli changefeed create --pd=http://172.16.6.122:2379 --sink-uri="mysql://root:@172.16.6.125:4000" --changefeed-id="primary-to-secondary"
tiup cdc cli changefeed create --server=http://172.16.6.122:8300 --sink-uri="mysql://root:@172.16.6.125:4000" --changefeed-id="primary-to-secondary"
```
6 changes: 3 additions & 3 deletions replicate-data-to-kafka.md
@@ -57,7 +57,7 @@ The preceding steps are performed in a lab environment. You can also deploy a cl
2. Create a changefeed to replicate incremental data to Kafka:

```shell
tiup ctl:<cluster-version> cdc changefeed create --pd="http://127.0.0.1:2379" --sink-uri="kafka://127.0.0.1:9092/kafka-topic-name?protocol=canal-json" --changefeed-id="kafka-changefeed" --config="changefeed.conf"
tiup ctl:<cluster-version> cdc changefeed create --server="http://127.0.0.1:8300" --sink-uri="kafka://127.0.0.1:9092/kafka-topic-name?protocol=canal-json" --changefeed-id="kafka-changefeed" --config="changefeed.conf"
```

- If the changefeed is successfully created, changefeed information, such as changefeed ID, is displayed, as shown below:
@@ -73,13 +73,13 @@ The preceding steps are performed in a lab environment. You can also deploy a cl
In a production environment, a Kafka cluster has multiple broker nodes. Therefore, you can add the addresses of multiple brokers to the sink URI. This ensures stable access to the Kafka cluster: if a broker goes down, the changefeed can still work. Suppose that a Kafka cluster has three broker nodes, with addresses 127.0.0.1:9092, 127.0.0.2:9092, and 127.0.0.3:9092, respectively. You can create a changefeed with the following sink URI.

```shell
tiup ctl:<cluster-version> cdc changefeed create --pd="http://127.0.0.1:2379" --sink-uri="kafka://127.0.0.1:9092,127.0.0.2:9092,127.0.0.3:9092/kafka-topic-name?protocol=canal-json&partition-num=3&replication-factor=1&max-message-bytes=1048576" --config="changefeed.conf"
tiup ctl:<cluster-version> cdc changefeed create --server="http://127.0.0.1:8300" --sink-uri="kafka://127.0.0.1:9092,127.0.0.2:9092,127.0.0.3:9092/kafka-topic-name?protocol=canal-json&partition-num=3&replication-factor=1&max-message-bytes=1048576" --config="changefeed.conf"
```

3. After creating the changefeed, run the following command to check the changefeed status:

```shell
tiup ctl:<cluster-version> cdc changefeed list --pd="http://127.0.0.1:2379"
tiup ctl:<cluster-version> cdc changefeed list --server="http://127.0.0.1:8300"
```

You can refer to [Manage TiCDC Changefeeds](/ticdc/ticdc-manage-changefeed.md) to manage the changefeed.
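When the broker list is long or comes from another script, the multi-broker sink URI can be assembled rather than typed by hand. The following is a sketch only: `build_kafka_sink_uri` is a hypothetical helper, not a TiCDC or TiUP command, and the topic and query parameters are simply the ones used in the examples in this document.

```shell
# Hypothetical helper: join broker addresses with commas and build a Kafka
# sink URI of the form kafka://<brokers>/<topic>?<params>.
# Note: POSIX sh has no local variables, so these names are global.
build_kafka_sink_uri() {
  topic=$1
  params=$2
  shift 2
  brokers=""
  for b in "$@"; do
    # Prepend a comma only when the list is already non-empty.
    brokers="${brokers:+$brokers,}$b"
  done
  printf 'kafka://%s/%s?%s\n' "$brokers" "$topic" "$params"
}

build_kafka_sink_uri "kafka-topic-name" "protocol=canal-json" \
  127.0.0.1:9092 127.0.0.2:9092 127.0.0.3:9092
```

The printed string can be passed to `--sink-uri`; quote it on the command line, because `&` in the query string is otherwise interpreted by the shell.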
13 changes: 9 additions & 4 deletions ticdc/deploy-ticdc.md
@@ -111,7 +111,7 @@ When you upgrade a TiCDC cluster, you need to pay attention to the following:

## Modify TiCDC cluster configurations using TiUP

This section describes how to use the [`tiup cluster edit-config`](/tiup/tiup-component-cluster-edit-config.md) command to modify the configurations of TiCDC. In the following example, it is assumed that you need to change the default value of `gc-ttl` from `86400` to `3600` (1 hour).
This section describes how to use the [`tiup cluster edit-config`](/tiup/tiup-component-cluster-edit-config.md) command to modify the configurations of TiCDC. In the following example, it is assumed that you need to change the default value of `gc-ttl` from `86400` to `172800` (48 hours).

1. Run the `tiup cluster edit-config` command. Replace `<cluster-name>` with the actual cluster name:

@@ -131,9 +131,11 @@ This section describes how to use the [`tiup cluster edit-config`](/tiup/tiup-co
pump: {}
drainer: {}
cdc:
gc-ttl: 3600
gc-ttl: 172800
```

In the preceding configuration, `gc-ttl` is set to 48 hours.

3. Run the `tiup cluster reload -R cdc` command to reload the configuration.
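`gc-ttl` is expressed in seconds, so it helps to derive the value from hours instead of typing it directly. A minimal sketch (the function name is hypothetical):

```shell
# Hypothetical helper: convert a retention period in hours to the number of
# seconds expected by gc-ttl.
hours_to_seconds() { echo $(( $1 * 3600 )); }

hours_to_seconds 24   # 86400, the default gc-ttl
hours_to_seconds 48   # 172800, the value used in this example
```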

## Stop and start TiCDC using TiUP
@@ -161,16 +163,19 @@ tiup ctl:<version> cdc capture list --server=http://10.0.10.25:8300
{
"id": "806e3a1b-0e31-477f-9dd6-f3f2c570abdd",
"is-owner": true,
"address": "127.0.0.1:8300"
"address": "127.0.0.1:8300",
"cluster-id": "default"
},
{
"id": "ea2a4203-56fe-43a6-b442-7b295f458ebc",
"is-owner": false,
"address": "127.0.0.1:8301"
"address": "127.0.0.1:8301",
"cluster-id": "default"
}
]
```

- `id`: Indicates the ID of the service process.
- `is-owner`: Indicates whether the service process is the owner node.
- `address`: Indicates the address through which the service process provides its interface to external clients.
- `cluster-id`: Indicates the ID of the TiCDC cluster. The default value is `default`.
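To pick out the owner node from this output in a script, the fields can be scanned line by line. The sketch below is an illustration under stated assumptions: it expects pretty-printed JSON with one field per line and `is-owner` listed before `address` in each object, as in the sample output. The function name `find_owner_address` is hypothetical, and a real JSON parser is a more robust choice if one is available.

```shell
# Hypothetical helper: read capture-list JSON on stdin and print the address
# of the capture whose "is-owner" field is true.
# Assumes one field per line and "is-owner" before "address" in each object.
find_owner_address() {
  awk '
    /"is-owner": *true/  { owner = 1 }
    /"is-owner": *false/ { owner = 0 }
    /"address":/ && owner {
      # Strip everything up to the opening quote of the value, then the
      # closing quote and whatever follows it.
      gsub(/.*"address": *"|".*/, "")
      print
      owner = 0
    }
  '
}

# Sample input in the same shape as the output shown in this section.
find_owner_address <<'EOF'
[
  {
    "id": "806e3a1b-0e31-477f-9dd6-f3f2c570abdd",
    "is-owner": true,
    "address": "127.0.0.1:8300",
    "cluster-id": "default"
  },
  {
    "id": "ea2a4203-56fe-43a6-b442-7b295f458ebc",
    "is-owner": false,
    "address": "127.0.0.1:8301",
    "cluster-id": "default"
  }
]
EOF
```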
6 changes: 3 additions & 3 deletions ticdc/integrate-confluent-using-ticdc.md
@@ -99,7 +99,7 @@ The preceding steps are performed in a lab environment. You can also deploy a cl
2. Create a changefeed to replicate incremental data to Confluent Cloud:

```shell
tiup ctl:<cluster-version> cdc changefeed create --pd="http://127.0.0.1:2379" --sink-uri="kafka://<broker_endpoint>/ticdc-meta?protocol=avro&replication-factor=3&enable-tls=true&auto-create-topic=true&sasl-mechanism=plain&sasl-user=<broker_api_key>&sasl-password=<broker_api_secret>" --schema-registry="https://<schema_registry_api_key>:<schema_registry_api_secret>@<schema_registry_endpoint>" --changefeed-id="confluent-changefeed" --config changefeed.conf
tiup ctl:<cluster-version> cdc changefeed create --server="http://127.0.0.1:8300" --sink-uri="kafka://<broker_endpoint>/ticdc-meta?protocol=avro&replication-factor=3&enable-tls=true&auto-create-topic=true&sasl-mechanism=plain&sasl-user=<broker_api_key>&sasl-password=<broker_api_secret>" --schema-registry="https://<schema_registry_api_key>:<schema_registry_api_secret>@<schema_registry_endpoint>" --changefeed-id="confluent-changefeed" --config changefeed.conf
```

You need to replace the values of the following fields with those created or recorded in [Step 2. Create an access key pair](#step-2-create-an-access-key-pair):
@@ -114,7 +114,7 @@ The preceding steps are performed in a lab environment. You can also deploy a cl
Note that you should encode `<schema_registry_api_secret>` based on [HTML URL Encoding Reference](https://www.w3schools.com/tags/ref_urlencode.asp) before replacing its value. After you replace all the preceding fields, the command is as follows:

```shell
tiup ctl:<cluster-version> cdc changefeed create --pd="http://127.0.0.1:2379" --sink-uri="kafka://xxx-xxxxx.ap-east-1.aws.confluent.cloud:9092/ticdc-meta?protocol=avro&replication-factor=3&enable-tls=true&auto-create-topic=true&sasl-mechanism=plain&sasl-user=L5WWA4GK4NAT2EQV&sasl-password=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" --schema-registry="https://7NBH2CAFM2LMGTH7:xxxxxxxxxxxxxxxxxx@yyy-yyyyy.us-east-2.aws.confluent.cloud" --changefeed-id="confluent-changefeed" --config changefeed.conf
tiup ctl:<cluster-version> cdc changefeed create --server="http://127.0.0.1:8300" --sink-uri="kafka://xxx-xxxxx.ap-east-1.aws.confluent.cloud:9092/ticdc-meta?protocol=avro&replication-factor=3&enable-tls=true&auto-create-topic=true&sasl-mechanism=plain&sasl-user=L5WWA4GK4NAT2EQV&sasl-password=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" --schema-registry="https://7NBH2CAFM2LMGTH7:xxxxxxxxxxxxxxxxxx@yyy-yyyyy.us-east-2.aws.confluent.cloud" --changefeed-id="confluent-changefeed" --config changefeed.conf
```

- Run the command to create a changefeed.
@@ -132,7 +132,7 @@ The preceding steps are performed in a lab environment. You can also deploy a cl
3. After creating the changefeed, run the following command to check the changefeed status:

```shell
tiup ctl:<cluster-version> cdc changefeed list --pd="http://127.0.0.1:2379"
tiup ctl:<cluster-version> cdc changefeed list --server="http://127.0.0.1:8300"
```

You can refer to [Manage TiCDC Changefeeds](/ticdc/ticdc-manage-changefeed.md) to manage the changefeed.
2 changes: 1 addition & 1 deletion ticdc/monitor-ticdc.md
@@ -10,7 +10,7 @@ If you use TiUP to deploy the TiDB cluster, you can see a sub-dashboard for TiCD
The metric description in this document is based on the following replication task example, which replicates data to MySQL using the default configuration.

```shell
cdc cli changefeed create --pd=http://10.0.10.25:2379 --sink-uri="mysql://root:123456@127.0.0.1:3306/" --changefeed-id="simple-replication-task"
cdc cli changefeed create --server=http://10.0.10.25:8300 --sink-uri="mysql://root:123456@127.0.0.1:3306/" --changefeed-id="simple-replication-task"
```

The TiCDC dashboard contains four monitoring panels. See the following screenshot:
12 changes: 7 additions & 5 deletions ticdc/ticdc-architecture.md
@@ -42,7 +42,7 @@ Changefeed and Task in TiCDC are two logical concepts. The specific description
For example:

```
cdc cli changefeed create --pd=http://10.0.10.25:2379 --sink-uri="kafka://127.0.0.1:9092/cdc-test?kafka-version=2.4.0&partition-num=6&max-message-bytes=67108864&replication-factor=1"
cdc cli changefeed create --server="http://127.0.0.1:8300" --sink-uri="kafka://127.0.0.1:9092/cdc-test?kafka-version=2.4.0&partition-num=6&max-message-bytes=67108864&replication-factor=1"
cat changefeed.toml
......
[sink]
@@ -139,17 +139,19 @@ The preceding sections only cover data changes of DML statements and do not incl
#### Barrier TS
Barrier TS is generated when a DDL statement is executed or a Syncpoint is used.
Barrier TS is generated when there are DDL change events or a Syncpoint is used.
- This timestamp ensures that all changes before this DDL statement are replicated to the downstream. After this DDL statement is executed and replicated, TiCDC starts replicating other data changes. Because DDL statements are processed by the Capture Owner, the Barrier TS corresponding to a DDL statement is only generated by the Processor thread of the owner node.
- Syncpoint Barrier TS is also a timestamp. When you enable the Syncpoint feature of TiCDC, a Barrier TS is generated by TiCDC according to the `sync-point-interval` you specified. When all table changes before this Barrier TS are replicated, TiCDC records the global Checkpoint in downstream, from which data replication continues next time.
- DDL change events: Barrier TS ensures that all changes before the DDL statement are replicated to the downstream. After this DDL statement is executed and replicated, TiCDC starts replicating other data changes. Because DDL statements are processed by the Capture Owner, the Barrier TS corresponding to a DDL statement is only generated by the owner node.
- Syncpoint: When you enable the Syncpoint feature of TiCDC, a Barrier TS is generated by TiCDC according to the `sync-point-interval` you specified. When all table changes before this Barrier TS are replicated, TiCDC inserts the current global CheckpointTS as the primary TS to the table recording tsMap in downstream. Then TiCDC continues data replication.
After a Barrier TS is generated, TiCDC only replicates data changes that occur before this Barrier TS to downstream. Then TiCDC checks whether all target data has been replicated by comparing the global CheckpointTS and Barrier TS. If global CheckpointTS equals Barrier TS, TiCDC continues replication after performing a designated operation (such as executing a DDL statement or recording the global CheckpointTS downstream). Otherwise, TiCDC waits for all data changes that occur before Barrier TS to be replicated to the downstream.
After a Barrier TS is generated, TiCDC ensures that only data changes that occur before this Barrier TS are replicated to downstream. Before these data changes are replicated to downstream, the replication task does not proceed. The owner TiCDC checks whether all target data has been replicated by continuously comparing the global CheckpointTS and the Barrier TS. If the global CheckpointTS equals the Barrier TS, TiCDC continues replication after performing a designated operation (such as executing a DDL statement or recording the global CheckpointTS downstream). Otherwise, TiCDC waits for all data changes that occur before the Barrier TS to be replicated to the downstream.
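The comparison described above can be pictured with a small simulation. This is an illustration of the rule only, not TiCDC source code: the function name `barrier_reached` and the timestamp values are made up for the example.

```shell
# Illustrative only: the owner may pass a barrier once the global
# CheckpointTS has caught up with the Barrier TS.
barrier_reached() {
  checkpoint_ts=$1
  barrier_ts=$2
  [ "$checkpoint_ts" -eq "$barrier_ts" ]
}

# Made-up timestamps: the checkpoint advances until it reaches the barrier.
barrier_ts=105
for checkpoint_ts in 100 103 105; do
  if barrier_reached "$checkpoint_ts" "$barrier_ts"; then
    echo "checkpoint=$checkpoint_ts: barrier reached, perform the designated operation, then continue"
  else
    echo "checkpoint=$checkpoint_ts: waiting for changes before barrier=$barrier_ts"
  fi
done
```

In this sketch the first two iterations wait and the third passes the barrier, which mirrors the wait-then-continue behavior described in the paragraph.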
## Major processes
This section describes the major processes of TiCDC to help you better understand its working principles.
Note that the following processes occur only within TiCDC and are transparent to users. Therefore, you do not need to care about which TiCDC node you are starting.
### Start TiCDC
- For a TiCDC node that is not an owner, it works as follows:
6 changes: 3 additions & 3 deletions ticdc/ticdc-avro-protocol.md
@@ -16,7 +16,7 @@ The following is a configuration example using Avro:
{{< copyable "shell-regular" >}}

```shell
cdc cli changefeed create --pd=http://127.0.0.1:2379 --changefeed-id="kafka-avro" --sink-uri="kafka://127.0.0.1:9092/topic-name?protocol=avro" --schema-registry=http://127.0.0.1:8081 --config changefeed_config.toml
cdc cli changefeed create --server=http://127.0.0.1:8300 --changefeed-id="kafka-avro" --sink-uri="kafka://127.0.0.1:9092/topic-name?protocol=avro" --schema-registry=http://127.0.0.1:8081 --config changefeed_config.toml
```

```shell
@@ -41,7 +41,7 @@ The following is a configuration example:
{{< copyable "shell-regular" >}}

```shell
cdc cli changefeed create --pd=http://127.0.0.1:2379 --changefeed-id="kafka-avro-enable-extension" --sink-uri="kafka://127.0.0.1:9092/topic-name?protocol=avro&enable-tidb-extension=true" --schema-registry=http://127.0.0.1:8081 --config changefeed_config.toml
cdc cli changefeed create --server=http://127.0.0.1:8300 --changefeed-id="kafka-avro-enable-extension" --sink-uri="kafka://127.0.0.1:9092/topic-name?protocol=avro&enable-tidb-extension=true" --schema-registry=http://127.0.0.1:8081 --config changefeed_config.toml
```

```shell
@@ -207,7 +207,7 @@ The following is a configuration example:
{{< copyable "shell-regular" >}}

```shell
cdc cli changefeed create --pd=http://127.0.0.1:2379 --changefeed-id="kafka-avro-string-option" --sink-uri="kafka://127.0.0.1:9092/topic-name?protocol=avro&avro-decimal-handling-mode=string&avro-bigint-unsigned-handling-mode=string" --schema-registry=http://127.0.0.1:8081 --config changefeed_config.toml
cdc cli changefeed create --server=http://127.0.0.1:8300 --changefeed-id="kafka-avro-string-option" --sink-uri="kafka://127.0.0.1:9092/topic-name?protocol=avro&avro-decimal-handling-mode=string&avro-bigint-unsigned-handling-mode=string" --schema-registry=http://127.0.0.1:8081 --config changefeed_config.toml
```

```shell
4 changes: 2 additions & 2 deletions ticdc/ticdc-canal-json.md
@@ -22,7 +22,7 @@ The following is an example of using `Canal-JSON`:
{{< copyable "shell-regular" >}}

```shell
cdc cli changefeed create --pd=http://127.0.0.1:2379 --changefeed-id="kafka-canal-json" --sink-uri="kafka://127.0.0.1:9092/topic-name?kafka-version=2.4.0&protocol=canal-json"
cdc cli changefeed create --server=http://127.0.0.1:8300 --changefeed-id="kafka-canal-json" --sink-uri="kafka://127.0.0.1:9092/topic-name?kafka-version=2.4.0&protocol=canal-json"
```

## TiDB extension field
@@ -37,7 +37,7 @@ The following is an example:
{{< copyable "shell-regular" >}}

```shell
cdc cli changefeed create --pd=http://127.0.0.1:2379 --changefeed-id="kafka-canal-json-enable-tidb-extension" --sink-uri="kafka://127.0.0.1:9092/topic-name?kafka-version=2.4.0&protocol=canal-json&enable-tidb-extension=true"
cdc cli changefeed create --server=http://127.0.0.1:8300 --changefeed-id="kafka-canal-json-enable-tidb-extension" --sink-uri="kafka://127.0.0.1:9092/topic-name?kafka-version=2.4.0&protocol=canal-json&enable-tidb-extension=true"
```

## Definitions of message formats