Closed
Description
YugabyteDB CDC is built to support the following use cases and motivations. This ticket tracks the overall progress of the PostgreSQL (PG) compatibility work to support the publication and replication slot APIs.
Motivation:
PostgreSQL compatibility:
- PostgreSQL has a huge community that needs a PG-compatible API to set up and consume database changes.
- Offer a complete set of PG community connectors for building and managing secure, clean data pipelines, supporting real-time data integrations, and ETL migrations.
Use cases:
- Enable microservice-oriented architectures to subscribe to changes:
  - Microservices are likely to consume changes through message buses such as Kafka, Google PubSub, and AWS Kinesis.
  - A search system powered by a service such as Elasticsearch may be used in conjunction with the database that stores the transactions.
  - Websocket-based consumption through the HTTP endpoint.
- Downstream data warehousing:
  - Write to data warehouses such as Snowflake, RedShift, and Google BigQuery for downstream analytics.
  - Generically write to S3 as Parquet/JSON/CSV files.
Jira Link: DB-7623
Phase 1
Status | Feature | GitHub Issue | Comments |
---|---|---|---|
✅ | CREATE PUBLICATION adds a new publication to the database. | #18930 | A publication is essentially a group of tables whose data changes are intended to be replicated through CDC |
✅ | DROP PUBLICATION removes an existing publication from the database. | #18931 | A publication can only be dropped by its owner or a superuser. |
✅ | ALTER PUBLICATION can change the attributes of a publication. | #18933 | Allows changing the tables/schemas, the publication properties, and the owner and name of the publication. |
✅ | Log a notice for unsupported tables in CreatePublication FOR ALL TABLES | #19291 | |
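
The publication statements listed above can be sketched as plain SQL strings. This is a minimal illustration that assumes YugabyteDB accepts the same `CREATE`/`ALTER`/`DROP PUBLICATION` syntax as PostgreSQL; the helper names and the `orders_pub`, `orders`, and `customers` identifiers are hypothetical and do not come from this ticket.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier PostgreSQL-style, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_publication(name: str, tables=None) -> str:
    """Build a CREATE PUBLICATION statement.

    tables=None means FOR ALL TABLES; otherwise pass unqualified table names
    (schema-qualified names are not handled by this sketch).
    """
    if tables is None:
        return f"CREATE PUBLICATION {quote_ident(name)} FOR ALL TABLES;"
    if not tables:
        raise ValueError("pass at least one table, or None for ALL TABLES")
    table_list = ", ".join(quote_ident(t) for t in tables)
    return f"CREATE PUBLICATION {quote_ident(name)} FOR TABLE {table_list};"


def alter_publication_add_table(name: str, table: str) -> str:
    """Build an ALTER PUBLICATION ... ADD TABLE statement."""
    return f"ALTER PUBLICATION {quote_ident(name)} ADD TABLE {quote_ident(table)};"


def drop_publication(name: str) -> str:
    """Build a DROP PUBLICATION statement."""
    return f"DROP PUBLICATION {quote_ident(name)};"


if __name__ == "__main__":
    print(create_publication("orders_pub", ["orders", "customers"]))
    print(alter_publication_add_table("orders_pub", "payments"))
    print(drop_publication("orders_pub"))
```

The generated statements would be run through any YSQL client session; only the owner of a publication or a superuser would be able to drop it, as noted in the table.
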
Phase 2
Status | Feature | GitHub Issue | Comments |
---|---|---|---|
✅ | Support the CREATE_REPLICATION_SLOT command to create a logical replication slot. | #19211 | Replication slots provide an automated way to ensure that YugabyteDB does not remove WAL segments until they have been received by CDC subscribers |
✅ | Support the DROP_REPLICATION_SLOT command to drop a logical replication slot. | #19212 | |
✅ | Upgrade path for Publication/Replication Slot model | #19261 | Will allow migrating existing CDC streams to work with Publication/Replication Slot APIs |
✅ | Ensure Publication/Replication Slot APIs are atomic | #18934 | Creating a Publication/Replication Slot involves communication between the YSQL layer and YB-master. The CRUD operations should be atomic across both processes. |
✅ | Migrate all tests to use the API | #19599 | |
✅ | Update the YB CDC connector to support Publication/Replication Slot | #19811 | |
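
For reference, the two slot commands above follow the PostgreSQL replication-protocol grammar, which is normally issued over a connection opened in replication mode. The sketch below only builds the command text; it assumes YugabyteDB mirrors the PostgreSQL form, and the `pgoutput` default plugin and the `orders_slot` name are illustrative assumptions rather than details from this ticket.

```python
import re

# PostgreSQL restricts slot names to lowercase letters, digits, and underscores.
_NAME = re.compile(r"[a-z0-9_]+")


def _check_name(kind: str, value: str) -> str:
    """Reject names that are not plain lowercase identifiers."""
    if not _NAME.fullmatch(value):
        raise ValueError(f"invalid {kind}: {value!r}")
    return value


def create_replication_slot(slot: str, plugin: str = "pgoutput") -> str:
    """Build CREATE_REPLICATION_SLOT for a logical slot (plugin is an assumption)."""
    return (
        f"CREATE_REPLICATION_SLOT {_check_name('slot name', slot)} "
        f"LOGICAL {_check_name('output plugin', plugin)}"
    )


def drop_replication_slot(slot: str) -> str:
    """Build DROP_REPLICATION_SLOT for an existing slot."""
    return f"DROP_REPLICATION_SLOT {_check_name('slot name', slot)}"


if __name__ == "__main__":
    print(create_replication_slot("orders_slot"))
    print(drop_replication_slot("orders_slot"))
```
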
Phase 3
Status | Feature | GitHub Issue | Comments |
---|---|---|---|
⬜️ | Support streaming a subset of insert/update/delete/truncate operations | #19250 | Allows choosing which operations to stream via the Publication |
✅ | Support consuming changes from ReplicationSlot. | #19441 | Subscribers can initiate a replication connection to consume changes via a ReplicationSlot |
⬜️ | Observability features. | #18932 | Will provide visibility into the stream progress |
⬜️ | Support creating a temporary Replication Slot. | #19263 | A temporary replication slot is deleted automatically on error or at the end of the session. |
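
Consuming changes from a ReplicationSlot starts, in PostgreSQL, with a `START_REPLICATION` command sent on a replication connection. The following is a rough sketch of how a subscriber might compose that command, assuming the PostgreSQL grammar and the `pgoutput` options `proto_version` and `publication_names` apply unchanged; the function name and the sample slot and publication names are hypothetical.

```python
import re

_NAME = re.compile(r"[a-z0-9_]+")                       # plain lowercase identifier
_LSN = re.compile(r"[0-9A-Fa-f]{1,8}/[0-9A-Fa-f]{1,8}")  # e.g. 0/0 or 16/B374D848


def start_replication_command(slot, publications, lsn="0/0", proto_version=1):
    """Build START_REPLICATION for a logical slot.

    publications: names of the publications to stream (simple identifiers only
    in this sketch). lsn "0/0" leaves the start position to the server.
    """
    if not _NAME.fullmatch(slot):
        raise ValueError(f"invalid slot name: {slot!r}")
    if not _LSN.fullmatch(lsn):
        raise ValueError(f"invalid LSN: {lsn!r}")
    if not publications:
        raise ValueError("at least one publication is required")
    for pub in publications:
        if not _NAME.fullmatch(pub):
            raise ValueError(f"invalid publication name: {pub!r}")
    names = ",".join(publications)
    return (
        f"START_REPLICATION SLOT {slot} LOGICAL {lsn} "
        f"(proto_version '{int(proto_version)}', publication_names '{names}')"
    )


if __name__ == "__main__":
    print(start_replication_command("orders_slot", ["orders_pub"]))
```
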