title | summary |
---|---|
Quick Start Guide on Integrating TiDB with Confluent Platform | Learn how to stream TiDB data to the Confluent Platform using TiCDC. |
This document describes how to integrate TiDB with Confluent Platform using TiCDC.
Warning:
This is still an experimental feature. Do NOT use it in a production environment.
Confluent Platform is a data streaming platform with Apache Kafka at its core. With many official and third-party sink connectors, Confluent Platform enables you to easily connect stream sources to relational or non-relational databases.
To integrate TiDB with Confluent Platform, you can use the TiCDC component with the Avro protocol. TiCDC can stream data changes to Kafka in a format that Confluent Platform recognizes. For the detailed integration guide, see the following sections.
Note:
In this tutorial, the JDBC sink connector is used to replicate TiDB data to a downstream relational database. To make it simple, SQLite is used here as an example.
- Make sure that ZooKeeper, Kafka, and Schema Registry are properly installed. It is recommended that you follow the Confluent Platform Quick Start Guide to deploy a local test environment.
- Make sure that the JDBC sink connector is installed by running the following command. The result should contain `jdbc-sink`.
{{< copyable "shell-regular" >}}
confluent local services connect connector list
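The check above can also be scripted. The following is a minimal sketch that only searches the command output for the `jdbc-sink` string mentioned above; the helper name `has_jdbc_sink` is hypothetical and not part of the Confluent CLI.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin mentions "jdbc-sink".
# It reads the whole input and only inspects text; it does not call the
# Confluent CLI itself.
has_jdbc_sink() {
    grep 'jdbc-sink' >/dev/null
}
```

For example, `confluent local services connect connector list | has_jdbc_sink && echo "JDBC sink connector found"` prints the message only when the connector appears in the list.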
- Save the following configuration into `jdbc-sink-connector.json`:
{{< copyable "" >}}
{ "name": "jdbc-sink-connector", "config": { "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector", "tasks.max": "1", "topics": "testdb_test", "connection.url": "jdbc:sqlite:test.db", "connection.ds.pool.size": 5, "table.name.format": "test", "auto.create": true, "auto.evolve": true } }
- Create an instance of the JDBC sink connector by running the following command (assuming Kafka Connect is listening on `127.0.0.1:8083`):
{{< copyable "shell-regular" >}}
curl -X POST -H "Content-Type: application/json" -d @jdbc-sink-connector.json http://127.0.0.1:8083/connectors
- Deploy TiCDC in one of the following ways. If TiCDC is already deployed, you can skip this step.
    - Deploy a new TiDB cluster that includes TiCDC using TiUP
    - Add TiCDC to an existing TiDB cluster using TiUP
    - Add TiCDC to an existing TiDB cluster using binary (not recommended)
Make sure that your TiDB and TiCDC clusters are healthy before proceeding.
- Create a `changefeed` by running the `cdc cli` command:
{{< copyable "shell-regular" >}}
./cdc cli changefeed create --pd="http://127.0.0.1:2379" --sink-uri="kafka://127.0.0.1:9092/testdb_test?protocol=avro" --opts "registry=http://127.0.0.1:8081"
Note:
Make sure that PD, Kafka, and Schema Registry are running on their respective default ports.
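Before testing replication, it can help to confirm that the connector created in the steps above is actually running. The following is a minimal sketch, assuming the Kafka Connect REST API is reachable at `127.0.0.1:8083` as in the `curl` step; `CONNECT_URL`, `connector_status_url`, and `show_connector_status` are illustrative names, not part of any tool.

```shell
#!/bin/sh
# Assumed default address of the Kafka Connect REST API; override as needed.
CONNECT_URL="${CONNECT_URL:-http://127.0.0.1:8083}"

# Build the REST URL that reports the status of the named connector.
connector_status_url() {
    printf '%s/connectors/%s/status\n' "$CONNECT_URL" "$1"
}

# Fetch the status document (requires a running Kafka Connect worker).
show_connector_status() {
    curl -s "$(connector_status_url "$1")"
}
```

Running `show_connector_status jdbc-sink-connector` should return a JSON document describing the connector and its tasks. On the TiCDC side, `./cdc cli changefeed list --pd="http://127.0.0.1:2379"` lists the changefeeds known to the cluster.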
After TiDB is integrated with Confluent Platform, you can follow the example procedures below to test the data replication.
- Create the `testdb` database in your TiDB cluster:
{{< copyable "sql" >}}
CREATE DATABASE IF NOT EXISTS testdb;
Create the `test` table in `testdb`:
{{< copyable "sql" >}}
USE testdb;
CREATE TABLE test ( id INT PRIMARY KEY, v TEXT );
Note:
If you need to change the database name or the table name, change `topics` in `jdbc-sink-connector.json` accordingly.
- Insert data into TiDB:
{{< copyable "sql" >}}
INSERT INTO test (id, v) VALUES (1, 'a');
INSERT INTO test (id, v) VALUES (2, 'b');
INSERT INTO test (id, v) VALUES (3, 'c');
INSERT INTO test (id, v) VALUES (4, 'd');
- Wait a moment for data to be replicated to the downstream. Then check the downstream for data:
{{< copyable "shell-regular" >}}
sqlite3 test.db
sqlite> SELECT * FROM test;
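Because replication is asynchronous, the wait in the last step can be replaced by a small polling loop. The following is a sketch under the assumptions of this tutorial (a `test.db` file in the current directory and four inserted rows); `retry` and `rows_replicated` are hypothetical helper names.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Note: plain sh has no local variables, so these names are global.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep between attempts, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Succeeds when the downstream `test` table holds exactly $1 rows.
# Errors (for example, the table not existing yet) count as "not yet".
rows_replicated() {
    [ "$(sqlite3 test.db 'SELECT COUNT(*) FROM test;' 2>/dev/null)" = "$1" ]
}
```

For example, `retry 10 1 rows_replicated 4 && sqlite3 test.db 'SELECT * FROM test;'` polls once per second for up to ten attempts and prints the rows once all four have arrived.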