redo(ticdc): fix resolved moves too fast when part of tables are not maintained redo writer #5587

Merged on May 27, 2022 (7 commits)

amyangfei
Contributor

What problem does this PR solve?

Issue Number: close #5486

What is changed and how it works?

The redo manager queries the redo log writer to update each table's resolved ts. When some tables are not maintained in the redo log writer, the redo manager does not respect the resolved ts of those tables and can advance the resolved ts too fast, which leads to data loss.
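
To make the failure mode concrete, here is a minimal sketch, not the actual TiCDC code. The function names `minOverReported` and `minOverTracked` are hypothetical, and the sketch assumes that the minimum resolved ts was previously derived only from the tables the writer reported, and that an unreported table should keep its last known value.

```go
package main

import (
	"fmt"
	"math"
)

// minOverReported sketches the assumed buggy behavior: only tables that the
// writer reported are visited, so tables unknown to the writer are ignored
// when the minimum resolved ts is computed.
func minOverReported(tracked, reported map[int64]uint64) uint64 {
	minTs := uint64(math.MaxUint64)
	for tableID, rts := range reported {
		tracked[tableID] = rts
		if rts < minTs {
			minTs = rts
		}
	}
	return minTs
}

// minOverTracked sketches the assumed fix: every tracked table is visited.
// A table is updated only when the writer reported it; otherwise its last
// known resolved ts is kept and still bounds the minimum.
func minOverTracked(tracked, reported map[int64]uint64) uint64 {
	minTs := uint64(math.MaxUint64)
	for tableID := range tracked {
		if rts, ok := reported[tableID]; ok {
			tracked[tableID] = rts
		}
		if tracked[tableID] < minTs {
			minTs = tracked[tableID]
		}
	}
	return minTs
}

func main() {
	// Table 1 has been flushed up to ts 200. Table 2 has no data yet, so the
	// writer does not know it; the manager still tracks it at ts 100.
	reported := map[int64]uint64{1: 200}

	fmt.Println(minOverReported(map[int64]uint64{1: 100, 2: 100}, reported)) // 200
	fmt.Println(minOverTracked(map[int64]uint64{1: 100, 2: 100}, reported))  // 100
}
```

In the first call the result jumps to 200 even though table 2 is only resolved up to 100, which is the "moves too fast" symptom described above.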

Check List

Tests

  • Unit test
  • Integration test

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Fix a bug where the resolved ts moves too fast when some tables are not maintained in the redo log writer.

@amyangfei added the type/bugfix, component/redolog, and area/ticdc labels on May 25, 2022
@ti-chi-bot
Member

ti-chi-bot commented May 25, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • CharlesCheung96
  • hi-rustin

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot added the do-not-merge/needs-triage-completed, release-note, size/M, needs-cherry-pick-release-5.3, needs-cherry-pick-release-5.4, and needs-cherry-pick-release-6.1 labels and removed the do-not-merge/needs-triage-completed label on May 25, 2022
@amyangfei
Contributor Author

/run-check-issue-triage-complete

@amyangfei added the status/ptal label on May 25, 2022
@amyangfei force-pushed the fix-redo-log-advance-too-fast branch from 463fbf8 to a6ecc1f on May 25, 2022 08:11
@amyangfei
Contributor Author

/run-all-tests

@amyangfei
Contributor Author

/run-verify

@codecov-commenter

codecov-commenter commented May 25, 2022

Codecov Report

Merging #5587 (84e98a0) into master (fcea4d5) will increase coverage by 0.1928%.
The diff coverage is 59.9521%.

Flag Coverage Δ
cdc 61.6308% <62.2053%> (+0.4827%) ⬆️
dm 52.0619% <54.4329%> (+0.0192%) ⬆️

@@               Coverage Diff                @@
##             master      #5587        +/-   ##
================================================
+ Coverage   56.0768%   56.2697%   +0.1928%     
================================================
  Files           535        528         -7     
  Lines         70143      70027       -116     
================================================
+ Hits          39334      39404        +70     
+ Misses        27078      26898       -180     
+ Partials       3731       3725         -6     

-    for tableID, rts := range rtsMap {
-        m.rtsMap[tableID] = rts
+    for tableID := range m.rtsMap {
+        if rts, ok := rtsMap[tableID]; ok {
Contributor

@nongfushanquan commented May 26, 2022

Why are some tables not maintained in the redo log writer?

Contributor Author

If a table has no data and no resolved ts flush has been executed for it yet, the redo log writer does not know about that table.
The redo log writer only records information for tables it has already encountered.
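
As a rough illustration of that behavior, the sketch below uses a hypothetical `toyWriter` type (not the real redo log writer) whose query method returns entries only for tables it has seen through a flush. All names here are assumptions made for the example.

```go
package main

import "fmt"

// toyWriter is a hypothetical stand-in for a redo log writer. It learns
// about a table only when a resolved ts is flushed for that table.
type toyWriter struct {
	resolved map[int64]uint64
}

func newToyWriter() *toyWriter {
	return &toyWriter{resolved: make(map[int64]uint64)}
}

// flushResolvedTs records the resolved ts of a table, registering the table
// on first use.
func (w *toyWriter) flushResolvedTs(tableID int64, ts uint64) {
	w.resolved[tableID] = ts
}

// currentResolvedTs returns entries only for the requested tables that the
// writer has already met; unknown tables are simply absent from the result.
func (w *toyWriter) currentResolvedTs(tableIDs []int64) map[int64]uint64 {
	out := make(map[int64]uint64, len(tableIDs))
	for _, id := range tableIDs {
		if ts, ok := w.resolved[id]; ok {
			out[id] = ts
		}
	}
	return out
}

func main() {
	w := newToyWriter()
	w.flushResolvedTs(1, 200) // table 2 has no data and was never flushed

	got := w.currentResolvedTs([]int64{1, 2})
	_, known := got[2]
	fmt.Println(len(got), known) // 1 false
}
```

A caller that treats the returned map as the full set of tables would silently skip table 2, which is why the caller has to iterate over its own table set instead.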

@ti-chi-bot
Member

@nongfushanquan: Thanks for your review. The bot only counts approvals from reviewers and higher roles in list, but you're still welcome to leave your comments.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot added the status/LGT1 label on May 26, 2022
@ti-chi-bot added the status/LGT2 label and removed the status/LGT1 label on May 27, 2022
@Rustin170506
Member

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: a6ecc1f

@ti-chi-bot added the status/can-merge label on May 27, 2022
@Rustin170506 changed the title from "redo(cdc): fix resolved moves too fast when part of tables are not maintained redo writer" to "redo(ticdc): fix resolved moves too fast when part of tables are not maintained redo writer" on May 27, 2022
@amyangfei
Contributor Author

/merge

@amyangfei
Contributor Author

/run-all-tests

@amyangfei
Contributor Author

/run-integration-test

@amyangfei
Contributor Author

/run-verify

@amyangfei
Contributor Author

/run-verify

@CharlesCheung96
Contributor

/run-verify

@amyangfei
Contributor Author

/run-verify

@ti-chi-bot
Member

@amyangfei: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #5617.

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #5618.

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created: #5619.

ti-chi-bot added a commit that referenced this pull request May 27, 2022
ti-chi-bot added a commit that referenced this pull request Jun 15, 2022
ti-chi-bot added a commit that referenced this pull request Jun 21, 2022
Labels

  • area/ticdc: Issues or PRs related to TiCDC.
  • component/redolog
  • needs-cherry-pick-release-5.3: Should cherry pick this PR to release-5.3 branch.
  • needs-cherry-pick-release-5.4: Should cherry pick this PR to release-5.4 branch.
  • needs-cherry-pick-release-6.1: Should cherry pick this PR to release-6.1 branch.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
  • status/ptal: Could you please take a look?
  • type/bugfix: This PR fixes a bug.

Development

Successfully merging this pull request may close these issues.

Redo log: data was lost or damaged in some test cases, and sometimes changefeed failed: "redo log flush fail"
6 participants