Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gc_worker: fix serveral bugs of green gc #16413

Merged
merged 15 commits into from
Apr 30, 2020

Conversation

youjiali1995
Copy link
Contributor

Signed-off-by: youjiali1995 zlwgx1023@gmail.com

What problem does this PR solve?

Issue Number: close #16306

Problem Summary:

  1. GC worker doesn't re-register lock observers of dirty stores which causes these stores are always dirty.
  2. GC worker doesn't register lock observers of newly added stores which causes these stores are always dirty.
  3. Can't remove lock observers of cleaned stores.
  4. If there are dirty stores after 3 round physical resolve lock, GC worker won't fall back to legacy procedure which breaks the data safety.

What is changed and how it works?

What's Changed:
It's simple. Please refer to the codes.

Related changes

  • Need to cherry-pick to the release branch

Check List

Tests

  • Integration test

Side effects

Release note

Fix the issue that green GC may leave unresolved locks if there are dirty stores.

Signed-off-by: youjiali1995 <zlwgx1023@gmail.com>
Comment on lines +1082 to +1085
// The store is checked and has been resolved before.
if _, ok := dirtyStores[store]; !ok {
delete(stores, store)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved stores may become dirty again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It will be scanned and resolved again.

break
}
// Clear all collected locks of dirty stores to prevent it from being dirty again.
w.removeLockObservers(ctx, safePoint, dirtyStores)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found we can't do it. If the store recevies duplicated removeLockObserver and registerLockObserver between L1061 and L1071. The store is cleaned but the lock observer drops some locks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can't retry here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we shouldn't do "remove and re-register" on stores which fail to be checked. Instead, can we just register for those newly added stores?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. If dirty stores are newly added, we can retry. If a store is dirty due to too many collected locks, we don't retry and falls back to legacy mode.

@codecov
Copy link

codecov bot commented Apr 20, 2020

Codecov Report

Merging #16413 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #16413   +/-   ##
===========================================
  Coverage   80.4461%   80.4461%           
===========================================
  Files           507        507           
  Lines        137676     137676           
===========================================
  Hits         110755     110755           
  Misses        18241      18241           
  Partials       8680       8680           

Signed-off-by: youjiali1995 <zlwgx1023@gmail.com>
@youjiali1995 youjiali1995 force-pushed the fix-physical-resolve-lock branch 2 times, most recently from e4a7fc2 to 3d21cae Compare April 21, 2020 10:29
Signed-off-by: youjiali1995 <zlwgx1023@gmail.com>
@youjiali1995 youjiali1995 force-pushed the fix-physical-resolve-lock branch from 3d21cae to b6a606d Compare April 22, 2020 02:38
@youjiali1995
Copy link
Contributor Author

/run-all-tests

Copy link
Contributor

@MyonKeminta MyonKeminta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@youjiali1995 youjiali1995 modified the milestones: v4.0.0-rc.1, v4.0.0-ga Apr 30, 2020
MyonKeminta
MyonKeminta previously approved these changes Apr 30, 2020
@youjiali1995
Copy link
Contributor Author

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 30, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

@youjiali1995 merge failed.

@youjiali1995
Copy link
Contributor Author

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

Your auto merge job has been accepted, waiting for:

  • 16536

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

@youjiali1995 merge failed.

Signed-off-by: youjiali1995 <zlwgx1023@gmail.com>
@youjiali1995
Copy link
Contributor Author

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

@youjiali1995 merge failed.

@youjiali1995
Copy link
Contributor Author

/run-unit-test

@youjiali1995
Copy link
Contributor Author

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

/run-all-tests

@sre-bot sre-bot merged commit 139cb3e into pingcap:master Apr 30, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Apr 30, 2020

cherry pick to release-4.0 in PR #16949

@youjiali1995 youjiali1995 deleted the fix-physical-resolve-lock branch April 30, 2020 05:41
youjiali1995 added a commit to sre-bot/tidb that referenced this pull request Apr 30, 2020
Signed-off-by: sre-bot <sre-bot@pingcap.com>
Signed-off-by: youjiali1995 <zlwgx1023@gmail.com>
jackysp pushed a commit that referenced this pull request May 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/GC status/can-merge Indicates a PR has been approved by a committer. type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The retry of physical resolve lock is useless
4 participants