-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
disttask: fix subtask finished immediately and mark success when encountering network partition #48660
disttask: fix subtask finished immediately and mark success when encountering network partition #48660
Conversation
Hi @ywqzzy. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
…task_finish_immediately
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #48660 +/- ##
================================================
+ Coverage 71.0824% 72.6570% +1.5745%
================================================
Files 1365 1389 +24
Lines 404163 410899 +6736
================================================
+ Hits 287289 298547 +11258
+ Misses 96925 93463 -3462
+ Partials 19949 18889 -1060
Flags with carried forward coverage won't be shown. Click here to find out more.
|
@@ -216,15 +218,19 @@ func (s *BaseScheduler) run(ctx context.Context, task *proto.Task) (resErr error | |||
if err := s.getError(); err != nil { | |||
break | |||
} | |||
if ctx.Err() != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
runCtx
is the cancelled ctx on startCancelCheck
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -387,7 +393,7 @@ func (s *BaseScheduler) Rollback(ctx context.Context, task *proto.Task) error { | |||
|
|||
// We should cancel all subtasks before rolling back | |||
for { | |||
subtask, err := s.taskTable.GetFirstSubtaskInStates(s.id, task.ID, task.Step, | |||
subtask, err := s.taskTable.GetFirstSubtaskInStates(ctx, s.id, task.ID, task.Step, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use rollbackCtx
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use
rollbackCtx
?
Not neccessary? Since the ctx is used right before rollbackctx?But ok to change it.
@@ -610,27 +615,29 @@ func (s *BaseScheduler) markSubTaskCanceledOrFailed(ctx context.Context, subtask | |||
err := errors.Cause(err) | |||
if ctx.Err() != nil && context.Cause(ctx) == ErrCancelSubtask { | |||
logutil.Logger(s.logCtx).Warn("subtask canceled", zap.Error(err)) | |||
s.updateSubtaskStateAndError(subtask, proto.TaskStateCanceled, nil) | |||
updateCtx := util.WithInternalSourceType(context.Background(), kv.InternalDistTask) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tidb might not be able to shutdown gracefully if we block here
deadline might not work in all case, maybe we can use parent ctx, ok for now,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest lgtm
plz fix comments before merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
ci failed
|
fixed |
Working on the ci fail |
/lgtm |
@ywqzzy: you cannot LGTM your own PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: D3Hunter, GMHDBJD, ywqzzy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
In response to a cherrypick label: new pull request created to branch |
…untering network partition (pingcap#48660) ref pingcap#46258, close pingcap/tidb#48649
What problem does this PR solve?
Issue Number: close #48649 ref #46258
Problem Summary:
see the comments in issue.
What is changed and how it works?
Most of the code are editing tests.
Main logic modifications only occur in task_table.go and scheduler.go
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.