Enhance mqSink to support Low-latency/DDL-consistent/Txn-consistent modes #795

Open
Tracked by #2971
liuzix opened this issue Jul 27, 2020 · 2 comments
Labels
component/sink Sink component. difficulty/hard Hard task.

Comments

@liuzix
Contributor

liuzix commented Jul 27, 2020

Feature Request

Is your feature request related to a problem? Please describe:

  • Canal support is not DDL-consistent, in the sense that a DDL might be reordered relative to DMLs.
  • To make Canal support DDL-consistent, we sacrifice latency, because we need to wait until FlushRowChangedEvents is issued.
  • If we write transaction reconstruction logic in mqSink, code complexity will increase. Moreover, because different encoders might want to handle transactions differently, letting mqSink construct the transactions hampers the decoupling of the code.

Describe the feature you'd like:

  • DDL-consistent mode should be achieved by preventing the mqProducer from flushing implicitly, i.e. writing to the MQ without an explicit Flush call. Hence I propose writing an mqProducer decorator that makes any producer hold back sending data to the network until Flush is called.
  • Txn-consistent mode should be supported directly by the encoders, which should reconstruct the transactions internally. To this end, we should provide a TxnGenerator that facilitates the reconstruction of transactions. TxnGenerator should support Append and Split methods. The changes in #770 ("Refactor: Let the encoder control sink's output related operation") should be merged so that the encoder can control when it wants the sink to write to the MQ.
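
The decorator idea in the first bullet could look roughly like the sketch below. It is only an illustration: the `Producer` interface, `HoldBackProducer`, and `logProducer` are hypothetical names, and the interface is a simplified stand-in rather than the actual mqProducer definition. The wrapper buffers every message and forwards nothing to the inner producer until `Flush` is called.

```go
package main

import (
	"fmt"
	"sync"
)

// Producer is a hypothetical, simplified stand-in for the mqProducer interface.
type Producer interface {
	SendMessage(key, value []byte, partition int32) error
	Flush() error
}

type message struct {
	key, value []byte
	partition  int32
}

// HoldBackProducer decorates any Producer so that no message reaches the
// inner producer (and hence the network) before Flush is called.
type HoldBackProducer struct {
	mu      sync.Mutex
	inner   Producer
	pending []message
}

func NewHoldBackProducer(inner Producer) *HoldBackProducer {
	return &HoldBackProducer{inner: inner}
}

// SendMessage only buffers a copy of the message; it never touches the inner producer.
func (p *HoldBackProducer) SendMessage(key, value []byte, partition int32) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.pending = append(p.pending, message{
		key:       append([]byte(nil), key...),
		value:     append([]byte(nil), value...),
		partition: partition,
	})
	return nil
}

// Flush forwards the buffered messages in arrival order, then flushes the inner producer.
func (p *HoldBackProducer) Flush() error {
	p.mu.Lock()
	defer p.mu.Unlock()
	for i, m := range p.pending {
		if err := p.inner.SendMessage(m.key, m.value, m.partition); err != nil {
			p.pending = p.pending[i:] // keep the unsent tail so a later Flush can retry
			return err
		}
	}
	p.pending = p.pending[:0]
	return p.inner.Flush()
}

// logProducer is a toy inner producer used only for this demo.
type logProducer struct {
	sent    []string
	flushes int
}

func (l *logProducer) SendMessage(key, value []byte, partition int32) error {
	l.sent = append(l.sent, string(value))
	return nil
}

func (l *logProducer) Flush() error { l.flushes++; return nil }

func main() {
	inner := &logProducer{}
	p := NewHoldBackProducer(inner)
	_ = p.SendMessage(nil, []byte("dml-1"), 0)
	_ = p.SendMessage(nil, []byte("dml-2"), 1)
	fmt.Println(len(inner.sent)) // 0: nothing is forwarded before Flush
	_ = p.Flush()
	fmt.Println(inner.sent) // [dml-1 dml-2]
}
```

One trade-off of this shape is that memory use grows with the number of messages between two Flush calls, which is the latency/consistency cost already described above.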

Current support/needs

| Mode | Open Protocol | Canal | Avro |
| --- | --- | --- | --- |
| Low-latency | Available | Available | Available |
| DDL-consistent | No plan | Needed to ensure correctness | No need because there's no DDL output |
| Transaction-consistent | Handled by consumer | Planned | No need because there's no semantics |
@liuzix liuzix added component/sink Sink component. enhancement status/need-discussion Issue that needs to be discussed to confirm priority, milestone, plan and task breakdown. labels Jul 27, 2020
@liuzix liuzix self-assigned this Jul 27, 2020
@amyangfei
Contributor

amyangfei commented Jul 27, 2020

I have one question about the DDL operation in Canal: if we dispatch DML data to multiple partitions and DDL to the first partition, how does the consumer know when to consume the DDL from the first partition? In other words, how does the Canal consumer know which DMLs come before a given DDL and which come after it?
By the way, I found a similar scenario about DML and DDL consumption with multiple MQ partitions: alibaba/canal#2842

@zier-one
Contributor

zier-one commented Jul 27, 2020

Also, see https://github.com/pingcap/ticdc/pull/708/files#diff-13ef8e2f0f85e13cd6f76c33fa8d982dR28

It contains a simple abstraction and implementation of a transaction-consistent controller.
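
For readers following along, here is one possible shape for the TxnGenerator proposed in the issue description. It is not taken from the linked PR, and the semantics are assumptions: `Append` groups row changes by StartTs, and `Split` takes a resolved timestamp and returns every buffered transaction whose CommitTs is at or below it, ordered by CommitTs. The `RowChange` and `Txn` types are hypothetical and minimal.

```go
package main

import (
	"fmt"
	"sort"
)

// RowChange is a hypothetical, minimal row event.
type RowChange struct {
	StartTs  uint64
	CommitTs uint64
	Table    string
}

// Txn groups the row changes that share one StartTs (assumed unique per transaction).
type Txn struct {
	StartTs  uint64
	CommitTs uint64
	Rows     []RowChange
}

// TxnGenerator buffers row changes and regroups them into transactions.
type TxnGenerator struct {
	pending map[uint64]*Txn // keyed by StartTs
}

func NewTxnGenerator() *TxnGenerator {
	return &TxnGenerator{pending: make(map[uint64]*Txn)}
}

// Append adds one row change to the transaction identified by its StartTs.
func (g *TxnGenerator) Append(row RowChange) {
	txn, ok := g.pending[row.StartTs]
	if !ok {
		txn = &Txn{StartTs: row.StartTs, CommitTs: row.CommitTs}
		g.pending[row.StartTs] = txn
	}
	txn.Rows = append(txn.Rows, row)
}

// Split removes and returns every buffered transaction with CommitTs <= resolvedTs,
// ordered by CommitTs (ties broken by StartTs). Later transactions stay buffered.
func (g *TxnGenerator) Split(resolvedTs uint64) []*Txn {
	var done []*Txn
	for startTs, txn := range g.pending {
		if txn.CommitTs <= resolvedTs {
			done = append(done, txn)
			delete(g.pending, startTs) // deleting during range is safe in Go
		}
	}
	sort.Slice(done, func(i, j int) bool {
		if done[i].CommitTs != done[j].CommitTs {
			return done[i].CommitTs < done[j].CommitTs
		}
		return done[i].StartTs < done[j].StartTs
	})
	return done
}

func main() {
	g := NewTxnGenerator()
	g.Append(RowChange{StartTs: 1, CommitTs: 5, Table: "t1"})
	g.Append(RowChange{StartTs: 2, CommitTs: 3, Table: "t2"})
	g.Append(RowChange{StartTs: 1, CommitTs: 5, Table: "t2"})
	g.Append(RowChange{StartTs: 4, CommitTs: 9, Table: "t1"})
	for _, txn := range g.Split(6) {
		fmt.Println(txn.CommitTs, len(txn.Rows)) // prints "3 1", then "5 2"
	}
}
```

An encoder could call `Append` for each incoming row and `Split` whenever it decides to emit output, which keeps the transaction handling out of mqSink itself.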

@amyangfei amyangfei added the difficulty/hard Hard task. label Jul 29, 2020
@amyangfei amyangfei removed the status/need-discussion Issue that needs to be discussed to confirm priority, milestone, plan and task breakdown. label Dec 1, 2020