Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tikv: keep using origin goroutine when doing retry in commit phase #16849

Merged
merged 4 commits into from
Apr 28, 2020
Merged

tikv: keep using origin goroutine when doing retry in commit phase #16849

merged 4 commits into from
Apr 28, 2020

Conversation

lysu
Copy link
Contributor

@lysu lysu commented Apr 26, 2020

What problem does this PR solve?

Issue Number: close #15995

Problem Summary:

make goroutine count in retry commit-phase can be controlled by committer-concurrency

What is changed and how it works?

Proposal: xxx

What's Changed:

no longer fork new goroutine when retry

How it Works:

no longer fork new goroutine when retry

Related changes

  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test, see commments above

Side effects

  • n/a

Release note

make goroutine count in retry commit-phase can be controlled by committer-concurrency


This change is Reviewable

@lysu lysu added status/WIP sig/transaction SIG:Transaction labels Apr 26, 2020
@lysu
Copy link
Contributor Author

lysu commented Apr 26, 2020

/run-all-tests

@lysu
Copy link
Contributor Author

lysu commented Apr 26, 2020

/run-mybatis-test
/run-common-test

@codecov
Copy link

codecov bot commented Apr 26, 2020

Codecov Report

Merging #16849 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #16849   +/-   ##
===========================================
  Coverage   80.5357%   80.5357%           
===========================================
  Files           507        507           
  Lines        138793     138793           
===========================================
  Hits         111778     111778           
  Misses        18337      18337           
  Partials       8678       8678           

@lysu
Copy link
Contributor Author

lysu commented Apr 27, 2020

image

in large-txn test:
we see master version(12:40-14:00)'s goroutine count increase to 30k quickly and couldn't finish
in this-pr version(14:00-15:16)'s goroutine count keep low and can finish test

@lysu lysu requested a review from tiancaiamao April 27, 2020 07:25
@lysu lysu requested review from cfzjywxk, coocood and jackysp April 27, 2020 07:26
@lysu lysu added the require-LGT3 Indicates that the PR requires three LGTM. label Apr 27, 2020
@lysu lysu added this to the v4.0.0-ga milestone Apr 27, 2020
@lysu lysu added type/bugfix This PR fixes a bug. needs-cherry-pick-4.0 labels Apr 27, 2020
@@ -46,11 +46,12 @@ import (
type twoPhaseCommitAction interface {
handleSingleBatch(*twoPhaseCommitter, *Backoffer, batchMutations) error
tiKVTxnRegionsNumHistogram() prometheus.Observer
inRetry() bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inRetryCommit to be exact?
Maybe we don't need to add this method to the interface.

@@ -527,7 +548,7 @@ func (c *twoPhaseCommitter) doActionOnBatches(bo *Backoffer, action twoPhaseComm
return nil
}

if len(batches) == 1 {
if len(batches) == 1 || action.inRetry() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why only handle the first batch?
Other batches are left uncommitted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

orz, I write a bug

@lysu
Copy link
Contributor Author

lysu commented Apr 27, 2020

/run-all-tests

@coocood
Copy link
Member

coocood commented Apr 27, 2020

LGTM

1 similar comment
@tiancaiamao
Copy link
Contributor

LGTM

@tiancaiamao tiancaiamao added the status/LGT2 Indicates that a PR has LGTM 2. label Apr 27, 2020
Copy link
Contributor

@cfzjywxk cfzjywxk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lysu
Copy link
Contributor Author

lysu commented Apr 27, 2020

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 27, 2020
@lysu lysu removed the status/LGT2 Indicates that a PR has LGTM 2. label Apr 27, 2020
@lysu lysu added the status/LGT3 The PR has already had 3 LGTM. label Apr 27, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Apr 27, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Apr 27, 2020

@lysu merge failed.

@lysu
Copy link
Contributor Author

lysu commented Apr 27, 2020

/run-integration-copr-test

@lysu lysu merged commit 318a455 into pingcap:master Apr 28, 2020
@lysu lysu deleted the dev-refine-commint-secondary-retry branch April 28, 2020 02:33
@sre-bot
Copy link
Contributor

sre-bot commented Apr 28, 2020

cherry pick to release-4.0 in PR #16876

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
require-LGT3 Indicates that the PR requires three LGTM. sig/transaction SIG:Transaction status/can-merge Indicates a PR has been approved by a committer. status/LGT3 The PR has already had 3 LGTM. type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The goroutine count surge when region miss or TiKV server is busy
5 participants