Use Leader Epoch rather than High Watermark for Truncation #673

@swuferhong

Description

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Use leader epoch rather than high watermark for log truncation to avoid inconsistent or lost data when the leader or follower goes offline abnormally.

Currently, Fluss synchronizes the highWatermark by updating the leader first and the follower in the next round. Log recovery also relies on the highWatermark in the leader's local checkpoint cache to decide whether to truncate data. This approach carries a significant risk of data loss. In issue #674, we will change highWatermark synchronization to update the follower first and the leader in the next round. That change fixes the data loss caused by abnormal server exits, but it may introduce a new problem: data duplication. Therefore, in this issue we aim to adopt the approach from Kafka KIP-101 and add a leader epoch cache, so that truncation no longer leads to inconsistent or lost data when the leader or follower goes offline abnormally.
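To make the proposal concrete, here is a minimal sketch of the KIP-101 idea, not Fluss's actual implementation. The class `LeaderEpochCache` and the methods `assign`, `endOffsetFor`, and `truncationOffset` are hypothetical names chosen for illustration. The cache maps each leader epoch to the first offset written in that epoch. A recovering follower asks the leader for the end offset of the follower's latest epoch and truncates to that offset instead of to its highWatermark.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a KIP-101 style leader epoch cache (illustrative only). */
final class LeaderEpochCache {
    // leader epoch -> first log offset written by the leader of that epoch
    private final TreeMap<Integer, Long> startOffsets = new TreeMap<>();

    /** Record the start offset of an epoch; later calls for the same epoch are ignored. */
    void assign(int epoch, long startOffset) {
        startOffsets.putIfAbsent(epoch, startOffset);
    }

    /**
     * End offset of the requested epoch as seen by the leader: the start offset
     * of the next higher epoch, or the leader's log end offset when no higher
     * epoch exists.
     */
    long endOffsetFor(int epoch, long logEndOffset) {
        Map.Entry<Integer, Long> next = startOffsets.higherEntry(epoch);
        return next == null ? logEndOffset : next.getValue();
    }

    /** A follower never truncates beyond its own log end offset. */
    static long truncationOffset(long leaderEpochEndOffset, long followerLogEndOffset) {
        return Math.min(leaderEpochEndOffset, followerLogEndOffset);
    }
}
```

For example, suppose the leader's cache holds epoch 0 starting at offset 0 and epoch 1 starting at offset 5, and a follower restarts with latest epoch 0 and log end offset 7. The leader answers 5 for epoch 0, so the follower truncates to offset 5 and refetches from there, discarding the two records that were never part of the new leader's log. The sketch assumes the requested epoch is known to the leader; a real implementation would also need to handle unknown epochs and persist the cache.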

Solution

No response

Anything else?

No response

Willingness to contribute

  • I'm willing to submit a PR!
