ticdc: add scale out for kafka changefeed (pingcap#12693)
overvenus authored Feb 10, 2023
1 parent 7608db3 commit 4b3c34f
Showing 2 changed files with 26 additions and 1 deletion.
9 changes: 8 additions & 1 deletion ticdc/ticdc-changefeed-config.md
@@ -78,7 +78,7 @@ matcher = ["test.worker"] # matcher is an allowlist; it indicates that this filter rule
ignore-event = ["insert"] # Filter out insert events
ignore-sql = ["^drop", "add column"] # Filter out DDL statements that start with "drop" or contain "add column"
ignore-delete-value-expr = "name = 'john'" # Filter out delete DML statements that match the condition name = 'john'
ignore-insert-value-expr = "id >= 100" # Filter out insert DML statements that match the condition id >= 100
ignore-insert-value-expr = "id >= 100" # Filter out insert DML statements that match the condition id >= 100
ignore-update-old-value-expr = "age < 18" # Filter out update DML statements whose old value matches age < 18
ignore-update-new-value-expr = "gender = 'male'" # Filter out update DML statements whose new value matches gender = 'male'
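@@ -89,6 +89,13 @@ ignore-event = ["drop table"] # Ignore drop table events
ignore-sql = ["delete"] # Ignore delete DML statements
ignore-insert-value-expr = "price > 1000 and origin = 'no where'" # Ignore insert DML statements that match the conditions price > 1000 and origin = 'no where'
[scheduler]
# Split a table into multiple sync ranges by Region count. These ranges can be replicated by multiple TiCDC nodes.
# Note:
# 1. This parameter takes effect only on Kafka changefeeds. MySQL changefeeds are not yet supported.
# 2. TiCDC does not split a table whose Region count is less than this parameter value into multiple sync ranges.
# region-per-span = 50000
[sink]
# For MQ sinks, you can configure event dispatchers through dispatchers.
# Two kinds of event dispatchers are supported: partition and topic (supported since v6.1). See the next section for details on both.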
18 changes: 18 additions & 0 deletions ticdc/ticdc-sink-to-kafka.md
@@ -233,3 +233,21 @@ The partition dispatcher is specified with partition = "xxx" and supports default, ts, index
> ```
> {matcher = ['*.*'], dispatcher = "ts", partition = "table"},
> ```
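## Scale out the load of a single large table to multiple TiCDC nodes
This feature splits a single large table into multiple data ranges by Region count and distributes these ranges across multiple TiCDC nodes, so that multiple TiCDC nodes can replicate the table at the same time. It addresses the following two problems:
- A single TiCDC node cannot replicate a single large table in time.
- Resource consumption (CPU, memory, and so on) is uneven across TiCDC nodes.
> **Note:**
>
> In TiCDC v6.6.0, scaling out a single large table is supported only for Kafka replication tasks.
The following is a sample configuration: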
```toml
[scheduler]
region-per-span = 50000
```
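To get a rough feel for how `region-per-span` relates to the number of data ranges, the following Python sketch estimates the range count for a table. It is an illustration only: the helper name `estimate_span_count` is hypothetical, and the ceiling-division rule is an assumption, since the documentation above only states that tables with fewer Regions than the threshold are not split. TiCDC's actual splitting logic may differ.

```python
import math


def estimate_span_count(region_count: int, region_per_span: int) -> int:
    """Estimate how many data ranges (spans) a table might be split into.

    Hypothetical helper, not part of TiCDC. Assumptions:
    - A table with fewer Regions than region_per_span stays as one range
      (this part is stated in the documentation).
    - Larger tables are split into ceil(region_count / region_per_span)
      ranges (this part is an assumption for illustration).
    """
    if region_count < 0:
        raise ValueError("region_count must be non-negative")
    if region_per_span <= 0:
        raise ValueError("region_per_span must be positive")
    if region_count < region_per_span:
        # Below the threshold: the table is not split.
        return 1
    return math.ceil(region_count / region_per_span)


if __name__ == "__main__":
    # A table with 120000 Regions and region-per-span = 50000.
    print(estimate_span_count(120_000, 50_000))
    # A table with 30000 Regions is below the threshold and is not split.
    print(estimate_span_count(30_000, 50_000))
```

Under these assumptions, a lower `region-per-span` value yields more ranges per table, which gives the scheduler more units to spread across TiCDC nodes.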
