Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Matching support for cancelling outstanding pollers #353

Merged
merged 4 commits into from
Sep 23, 2017

Conversation

samarabbas
Copy link
Contributor

RPC stack used by Cadence does not support cancel propagation through
context and this results in tasks being dispatched to zombie pollers
when client connection is closed.
Updated mathcing engine api for PollForDecisionTask and
PollForActivityTask to alsoo take in pollerID. Also added new API on
matching engine to CancelOutstandingPolls given pollerID.
Frontend handler for Poll API generates pollerID before calling the
corresponding API on matching engine, if the call fails due to context
cancellation it calls the CancelOutstandingPoll.

Fixes Issue #254

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.4%) to 66.434% when pulling bbf945c on samarabbas:zombie-poller into 3b1b45b on uber:master.

Copy link
Contributor

@venkat1109 venkat1109 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, just a couple of minor comments

PollRequest: pollRequest,
})
if err != nil {
// First check if this err is due to context cancellation. This means client connection to frontend is closed.
if ctx.Err() == context.Canceled {
// Our rpc stack does not propagates context cancellation to the other service. Lets make an explicit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this cancel logic into its own method ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


pollerID, ok := ctx.Value(pollerIDKey).(string)
childCtx, cancel := context.WithCancel(ctx)
if ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the else{} part is only for backward compatibility, this code can be refactored to remove the else entirely i.e. move the context.WithCancel(ctx) within the if ok branch and initialize childCtx = ctx outside the if

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if ctx.Err() == context.Canceled {
// Our rpc stack does not propagates context cancellation to the other service. Lets make an explicit
// call to matching to notify this poller is gone to prevent any tasks being dispatched to zombie pollers.
err = wh.matching.CancelOutstandingPoll(nil, &m.CancelOutstandingPollRequest{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it standard to use nil as context if there is no context? or is it better to use context.Background()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated it to context.Background.

c.outstandingPollsLock.Unlock()
cancel()
}()
} else {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we expect this to happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just for backwards compatibility.

tCtx, err := e.getTask(ctx, taskList)
// Add frontend generated pollerID to context so tasklistMgr can support cancellation of
// long-poll when frontend calls CancelOutstandingPoll API
pollerCtx := context.WithValue(ctx, pollerIDKey, pollerID)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it possible that pollerID is empty (not supplied by frontend from older version or whatever reason)? Do we need to skip adding the empty pollerID to ctx if it is the case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have added protection in tasklistmgr.gettask to protect against empty pollerID.

RPC stack used by Cadence does not support cancel propagation through
context and this results in tasks being dispatched to zombie pollers
when client connection is closed.
Updated mathcing engine api for PollForDecisionTask and
PollForActivityTask to alsoo take in pollerID.  Also added new API on
matching engine to CancelOutstandingPolls given pollerID.
Frontend handler for Poll API generates pollerID before calling the
corresponding API on matching engine, if the call fails due to context
cancellation it calls the CancelOutstandingPoll API on the matching
engine to unblock corresponding poll requests.
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.3%) to 66.477% when pulling 211a3a7 on samarabbas:zombie-poller into 3b1b45b on uber:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.4%) to 66.461% when pulling 211a3a7 on samarabbas:zombie-poller into 3b1b45b on uber:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.3%) to 66.517% when pulling 211a3a7 on samarabbas:zombie-poller into 3b1b45b on uber:master.

@samarabbas samarabbas merged commit 27d483c into uber:master Sep 23, 2017
@samarabbas samarabbas deleted the zombie-poller branch September 23, 2017 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants