sorter: Unified Sorter clean up stale temporary files #1663

Merged
merged 19 commits into pingcap:master on May 8, 2021

Conversation

@liuzix liuzix commented Apr 14, 2021

What problem does this PR solve?

  • Unified Sorter cannot clean up its temporary files if the process exits abnormally. If the process is OOM-killed frequently, the leftover files can exhaust disk space and then require manual clean-up.

What is changed and how it works?

  • Adds file-lock-based detection and clean-up of stale temporary files.
  • Prevents multiple cdc server instances from sharing the same sort-dir.

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • Has persistent data change

Side effects

  • Increased code complexity

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation
  • Need to modify TiDB operator to make TiCDC a stateful set.

Release note

  • Add clean-up of stale temporary files in Unified Sorter, and forbid sharing sort-dir.

@ti-chi-bot ti-chi-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 14, 2021
@ti-chi-bot ti-chi-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Apr 14, 2021

liuzix commented Apr 14, 2021

/run-all-tests

@ti-chi-bot ti-chi-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 17, 2021

liuzix commented Apr 19, 2021

/run-all-tests

@liuzix liuzix marked this pull request as ready for review April 19, 2021 07:59
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 19, 2021

liuzix commented Apr 19, 2021

/run-all-tests

@liuzix liuzix added the status/ptal Could you please take a look? label Apr 23, 2021
@liuzix liuzix modified the milestones: v5.0.2, v4.0.13 Apr 23, 2021
pkg/filelock/filelock.go (resolved)
pkg/filelock/simple_filelock.go (outdated, resolved)
pkg/filelock/simple_filelock.go (outdated, resolved)
pkg/filelock/simple_filelock.go (outdated, resolved)
pkg/filelock/simple_filelock.go (outdated, resolved)
cdc/puller/sorter/backend_pool.go (outdated, resolved)
@amyangfei amyangfei modified the milestones: v4.0.13, v5.0.2 Apr 26, 2021
cdc/puller/sorter/backend_pool.go (outdated, resolved)
@@ -165,13 +214,20 @@ func (r *fileBackEndReader) readNext() (*model.PolymorphicEvent, error) {
	if err != nil {
		if err == io.EOF {
			r.isEOF = true
			// verifies that the file has not been truncated unexpectedly.

Contributor:

Does the truncation here only happen by accident, and never by CDC itself?

Contributor Author:

CDC truncates a file once its content is no longer needed and the file is about to be recycled. Normally, a file that is in use is not truncated. This piece of code guards against accidental truncation, whether caused by a CDC bug or by user error.

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 28, 2021
…ix/ticdc into zixiong-unified-sort-clean-up-stale

liuzix commented May 7, 2021

/run-all-tests

@liuzix liuzix added needs-cherry-pick-release-5.0 Should cherry pick this PR to release-5.0 branch. needs-cherry-pick-release-4.0 Should cherry pick this PR to release-4.0 branch. component/puller Puller component. labels May 7, 2021

@amyangfei

/lgtm

@ti-chi-bot

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • amyangfei
  • overvenus

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewers can indicate their review by writing /lgtm in a comment.
Reviewers can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 7, 2021
@amyangfei

/merge

@ti-chi-bot

This pull request has been accepted and is ready to merge.

Commit hash: 932569f

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label May 7, 2021
@liuzix

liuzix commented May 7, 2021

/merge

@codecov-commenter

Codecov Report

Merging #1663 (ebb3776) into master (b190690) will decrease coverage by 0.0400%.
The diff coverage is 48.2014%.

@@               Coverage Diff                @@
##             master      #1663        +/-   ##
================================================
- Coverage   54.0050%   53.9649%   -0.0401%     
================================================
  Files           154        155         +1     
  Lines         16317      16444       +127     
================================================
+ Hits           8812       8874        +62     
- Misses         6591       6630        +39     
- Partials        914        940        +26     

@ti-chi-bot ti-chi-bot merged commit 5c0c1e2 into pingcap:master May 8, 2021
ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request May 8, 2021
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot

In response to a cherrypick label: new pull request created: #1741.

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request May 8, 2021
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot

In response to a cherrypick label: new pull request created: #1742.

Labels

  • component/puller: Puller component.
  • needs-cherry-pick-release-4.0: Should cherry pick this PR to release-4.0 branch.
  • needs-cherry-pick-release-5.0: Should cherry pick this PR to release-5.0 branch.
  • size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
  • status/ptal: Could you please take a look?

5 participants