Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv/client: fix force reconnect in client v2 #1682

Merged
merged 11 commits into from
Apr 28, 2021

Conversation

amyangfei
Copy link
Contributor

@amyangfei amyangfei commented Apr 20, 2021

What problem does this PR solve?

  • Fix a potential deadlock when a reconnect is performed
  • Fix un initialized regions are not reconnected.

note release-5.0 should be picked manually.

What is changed and how it works?

  • Reuse the resolvedTsManager to manage the time of last event in each un-initialized region, rename resolvedTsManager to regionTsManager
  • Separate the reconnect check of un-initialized regions from the lock resolver routine to reduce lock contention.

Check List

Tests

  • Unit test
  • Integration test

Release note

  • No release note

@amyangfei amyangfei added component/kv-client TiKV kv log client component. needs-cherry-pick-release-4.0 Should cherry pick this PR to release-4.0 branch. labels Apr 20, 2021
@ti-chi-bot ti-chi-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Apr 20, 2021
@amyangfei amyangfei added this to the v5.0.2 milestone Apr 20, 2021
@amyangfei
Copy link
Contributor Author

/run-all-tests

@amyangfei
Copy link
Contributor Author

/run-all-tests

@amyangfei
Copy link
Contributor Author

/run-leak-tests

@amyangfei
Copy link
Contributor Author

/run-all-tests

@amyangfei
Copy link
Contributor Author

/run-leak-tests

default:
// rtsUpdateCh block often means too many regions are suffering
// lock resolve, the kv client status is not very healthy.
log.Warn("region is not upsert into rts manager", zap.Uint64("region-id", regionID))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is "upsert"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a special syntax in some RDBMS, which means insert or update, ref: https://en.wikipedia.org/wiki/Merge_(SQL)

@amyangfei
Copy link
Contributor Author

/run-all-tests

1 similar comment
@amyangfei
Copy link
Contributor Author

/run-all-tests

@amyangfei
Copy link
Contributor Author

/run-integration-tests

@amyangfei amyangfei added status/ptal Could you please take a look? and removed status/wip labels Apr 21, 2021
@liuzix
Copy link
Contributor

liuzix commented Apr 21, 2021

What will happen if TiKV pushes an old resolved ts (the resolved ts for the incremental scan) immediately after the Initialize event? This will be the case if you use TiKV master branch.

@amyangfei
Copy link
Contributor Author

amyangfei commented Apr 22, 2021

What will happen if TiKV pushes an old resolved ts (the resolved ts for the incremental scan) immediately after the Initialize event? This will be the case if you use TiKV master branch.

I'm afraid there do exist some corner cases

  1. TiKV keeps sending unchanged resolved ts, kv client v1 can't detect it now, since kv client v1 checks last receive event time only. But kv client v2 can detect it because kv client checks lastResovledTs of initialized region
  2. TiKV pushes an old resolved ts after the initialized event. kv client v1 has not false positive, yet kv client v2 could reconnect region

From 1 we know use the last receive event time is not enough, while use lastResolvedTs check mechanism could get false positive.

@amyangfei
Copy link
Contributor Author

What will happen if TiKV pushes an old resolved ts (the resolved ts for the incremental scan) immediately after the Initialize event? This will be the case if you use TiKV master branch.

Besides, Does the old resolved ts equal to the start ts of this region? @liuzix

@ti-chi-bot ti-chi-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 22, 2021
@amyangfei
Copy link
Contributor Author

What will happen if TiKV pushes an old resolved ts (the resolved ts for the incremental scan) immediately after the Initialize event? This will be the case if you use TiKV master branch.

I'm afraid there do exist some corner cases

  1. TiKV keeps sending unchanged resolved ts, kv client v1 can't detect it now, since kv client v1 checks last receive event time only. But kv client v2 can detect it because kv client checks lastResovledTs of initialized region
  2. TiKV pushes an old resolved ts after the initialized event. kv client v1 has not false positive, yet kv client v2 could reconnect region

From 1 we know use the last receive event time is not enough, while use lastResolvedTs check mechanism could get false positive.

Also add event time in resovled ts item and solve problem 2, the problem 1 exists now. This PR will not solve unchanged resolved ts problem

@amyangfei
Copy link
Contributor Author

/run-all-tests

@amyangfei
Copy link
Contributor Author

/run-all-tests

@amyangfei
Copy link
Contributor Author

/run-unit-tests

@liuzix
Copy link
Contributor

liuzix commented Apr 26, 2021

/lgtm

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Apr 26, 2021
@amyangfei
Copy link
Contributor Author

/run-all-tests

@codecov-commenter
Copy link

codecov-commenter commented Apr 27, 2021

Codecov Report

Merging #1682 (0f57019) into master (1092578) will increase coverage by 0.1465%.
The diff coverage is 88.1188%.

@@               Coverage Diff                @@
##             master      #1682        +/-   ##
================================================
+ Coverage   53.8313%   53.9778%   +0.1465%     
================================================
  Files           153        153                
  Lines         16117      16177        +60     
================================================
+ Hits           8676       8732        +56     
- Misses         6529       6531         +2     
- Partials        912        914         +2     

@zier-one
Copy link
Contributor

/lgtm

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • leoppro
  • liuzix

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Apr 28, 2021
@zier-one
Copy link
Contributor

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 2dfe202

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 28, 2021
@ti-chi-bot ti-chi-bot merged commit f305861 into pingcap:master Apr 28, 2021
ti-srebot pushed a commit to ti-srebot/ticdc that referenced this pull request Apr 28, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Copy link
Contributor

cherry pick to release-4.0 in PR #1721

amyangfei added a commit to ti-srebot/ticdc that referenced this pull request Apr 30, 2021
@Rustin170506
Copy link
Member

/cherry-pick release-5.0

ti-chi-bot pushed a commit to ti-chi-bot/tiflow that referenced this pull request May 6, 2021
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

@hi-rustin: new pull request created: #1734.

In response to this:

/cherry-pick release-5.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/kv-client TiKV kv log client component. needs-cherry-pick-release-4.0 Should cherry pick this PR to release-4.0 branch. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. status/ptal Could you please take a look?
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants