
Data loss when upstream transaction conflicts during incremental scan #5468

Closed
overvenus opened this issue May 18, 2022 · 1 comment · Fixed by #5477

Comments

@overvenus
Member

What did you do?

For an UPDATE statement, the prewrite event carries both the value and the old value.
TiDB may prewrite the same row multiple times when the transaction conflicts
with other transactions. In that case, if the value is not "short",
only the first prewrite contains the value.

TiKV may output the events for the UPDATE statement in the following order:

 TiDB: [Prewrite1]   [Prewrite2]      [Commit]
       v             v                v                                   Time
 ---------------------------------------------------------------------------->
         ^            ^    ^           ^     ^       ^     ^          ^     ^
 TiKV:   [Scan Start] [Send Prewrite2] [Send Commit] [Send Prewrite1] [Send Init]
 TiCDC:                    [Recv Prewrite2]  [Recv Commit] [Recv Prewrite1] [Recv Init]

TiCDC receives Prewrite2 and the commit before Prewrite1, so it matches the commit against Prewrite2 and mistakenly outputs an event that contains the old value but not the value.

The sink translates this event into a DELETE, so the row is lost.
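To make the DELETE translation concrete, here is a minimal sketch of how an operation might be inferred from which values are present. `rowEvent` and `classify` are hypothetical names for illustration only, not TiCDC's actual types or sink logic.

```go
package main

import "fmt"

// rowEvent is a hypothetical, simplified stand-in for the row that is
// assembled from a matched prewrite/commit pair. Not the real TiCDC type.
type rowEvent struct {
	Value    []byte // new row value; empty if the matched prewrite carried none
	OldValue []byte // previous row value; empty for a fresh insert
}

// classify mimics how a sink could infer the operation purely from
// which of the two values is present (an assumption for this sketch).
func classify(e rowEvent) string {
	switch {
	case len(e.Value) > 0 && len(e.OldValue) > 0:
		return "UPDATE"
	case len(e.Value) > 0:
		return "INSERT"
	case len(e.OldValue) > 0:
		return "DELETE"
	default:
		return "UNKNOWN"
	}
}

func main() {
	// Event assembled from Prewrite2: old value present, value missing.
	broken := rowEvent{OldValue: []byte("old")}
	// Event that should have been assembled from Prewrite1.
	expected := rowEvent{Value: []byte("new"), OldValue: []byte("old")}

	fmt.Println(classify(broken))   // DELETE
	fmt.Println(classify(expected)) // UPDATE
}
```

Under this assumption, an UPDATE whose value went missing is indistinguishable from a real DELETE once it reaches the sink.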

See lines L718-L743:

case cdcpb.Event_COMMIT:
	w.metrics.metricPullEventCommitCounter.Inc()
	if entry.CommitTs <= state.lastResolvedTs {
		logPanic("The CommitTs must be greater than the resolvedTs",
			zap.String("EventType", "COMMIT"),
			zap.Uint64("CommitTs", entry.CommitTs),
			zap.Uint64("resolvedTs", state.lastResolvedTs),
			zap.Uint64("regionID", regionID))
		return errUnreachable
	}
	ok := state.matcher.matchRow(entry)
	if !ok {
		if !state.initialized {
			state.matcher.cacheCommitRow(entry)
			continue
		}
		return cerror.ErrPrewriteNotMatch.GenWithStackByArgs(
			hex.EncodeToString(entry.GetKey()),
			entry.GetStartTs(), entry.GetCommitTs(),
			entry.GetType(), entry.GetOpType())
	}
	revent, err := assembleRowEvent(regionID, entry)
	if err != nil {
		return errors.Trace(err)
	}
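One possible way to guard against this ordering, sketched as a self-contained simulation: before the region is initialized, treat a prewrite without a value as unmatched so the commit is cached, and never let a value-less prewrite overwrite one that has a value. All types and names here (`row`, `matcher`, `putPrewrite`, `matchCommit`, `replay`) are hypothetical simplifications, and this is not necessarily what #5477 does.

```go
package main

import "fmt"

// row is a hypothetical, simplified stand-in for a TiKV change event row.
type row struct {
	Key      string
	StartTs  uint64
	CommitTs uint64
	Value    string
	OldValue string
}

type matchKey struct {
	key     string
	startTs uint64
}

// matcher pairs commit rows with previously received prewrite rows.
type matcher struct {
	prewrites map[matchKey]row
	cached    []row // commits deferred until the region is initialized
}

func newMatcher() *matcher {
	return &matcher{prewrites: make(map[matchKey]row)}
}

// putPrewrite stores a prewrite, but never lets a value-less prewrite
// overwrite one that already carries a value.
func (m *matcher) putPrewrite(r row) {
	k := matchKey{r.Key, r.StartTs}
	if old, ok := m.prewrites[k]; ok && r.Value == "" && old.Value != "" {
		return
	}
	m.prewrites[k] = r
}

// matchCommit copies the values of the matching prewrite into the commit.
// Before initialization, a value-less prewrite counts as unmatched, because
// the incremental scan may still deliver the prewrite that has the value.
func (m *matcher) matchCommit(c *row, initialized bool) bool {
	k := matchKey{c.Key, c.StartTs}
	p, ok := m.prewrites[k]
	if !ok || (!initialized && p.Value == "") {
		return false
	}
	c.Value, c.OldValue = p.Value, p.OldValue
	delete(m.prewrites, k)
	return true
}

// replay feeds the event order from the issue and returns the emitted rows.
func replay() []row {
	m := newMatcher()
	var out []row
	initialized := false

	// Prewrite2 arrives first and carries only the old value.
	m.putPrewrite(row{Key: "k", StartTs: 1, OldValue: "old"})
	// The commit arrives before the scan delivers Prewrite1.
	commit := row{Key: "k", StartTs: 1, CommitTs: 2}
	if m.matchCommit(&commit, initialized) {
		out = append(out, commit)
	} else {
		m.cached = append(m.cached, commit)
	}
	// Prewrite1 arrives from the incremental scan with the full value.
	m.putPrewrite(row{Key: "k", StartTs: 1, Value: "new", OldValue: "old"})
	// Init: the scan is finished, so the cached commits are matched again.
	initialized = true
	for _, c := range m.cached {
		if m.matchCommit(&c, initialized) {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	for _, r := range replay() {
		fmt.Printf("key=%s value=%q old=%q\n", r.Key, r.Value, r.OldValue)
	}
}
```

With this guard the replay emits a single row that carries both the value and the old value. A genuine DELETE is still handled, since its value-less prewrite is matched once the region is initialized.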

What did you expect to see?

No data loss

What did you see instead?

Data is lost.

Versions of the cluster

TiCDC version (execute cdc version):

v5.0.1
@nongfushanquan
Contributor

/assign overvenus
