Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

matching: support for task list partitions #2229

Merged
merged 9 commits into from
Jul 26, 2019
Merged

Conversation

venkat1109
Copy link
Contributor

Fixes #2098

@venkat1109 venkat1109 force-pushed the v_stl_6 branch 3 times, most recently from ae70cda to b1b4dcc Compare July 23, 2019 19:43
@venkat1109 venkat1109 self-assigned this Jul 23, 2019
@venkat1109 venkat1109 marked this pull request as ready for review July 24, 2019 17:06
client/matching/lb.go Outdated Show resolved Hide resolved
client/matching/lb.go Outdated Show resolved Hide resolved
client/matching/lb.go Outdated Show resolved Hide resolved
client/matching/lb.go Outdated Show resolved Hide resolved
nPartitions dynamicconfig.IntPropertyFnWithTaskListInfoFilters,
) string {

if forwardedFrom != "" || taskList.GetKind() == shared.TaskListKindSticky {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this assumes only one level of forwarding is possible currently, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, multiple levels of forwarding is ok. May be a comment might help here. When forwardedFrom is set, LB doesn't pick a partition because partition is already picked by the caller

service/matching/forwarder.go Outdated Show resolved Hide resolved
service/matching/matcher.go Show resolved Hide resolved
service/matching/matcher.go Show resolved Hide resolved
client/clientfactory.go Outdated Show resolved Hide resolved
service/matching/forwarder.go Outdated Show resolved Hide resolved
client/matching/lb.go Outdated Show resolved Hide resolved
service/matching/matcher.go Show resolved Hide resolved
service/matching/matcher.go Outdated Show resolved Hide resolved
service/matching/matcher.go Outdated Show resolved Hide resolved
service/matching/matcher.go Show resolved Hide resolved
client/matching/lb.go Outdated Show resolved Hide resolved
defaultLoadBalancer struct {
nReadPartitions dynamicconfig.IntPropertyFnWithTaskListInfoFilters
nWritePartitions dynamicconfig.IntPropertyFnWithTaskListInfoFilters
domainIDToName func(string) (string, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

domainIDToName DomainIDToNameFunc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would introduce a circular dependency. On the other hand, I can move the type definition of DomainIDToNameFunc() within this package. But that would be weird for the caller- who interacts only with client package

// NewLoadBalancer returns an instance of matching load balancer that
// can help distribute api calls across task list partitions
func NewLoadBalancer(
domainIDToName func(string) (string, error),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

domainIDToName DomainIDToNameFunc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above

case tm.queryTaskC <- task:
<-task.responseC
return nil, nil
case token := <-fwdrTokenC:
Copy link
Contributor

@wxing1292 wxing1292 Jul 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

within this case should do another attempt to send task to tm.queryTaskC before forwarding?

select {
case tm.taskC <- task:
return nil
case token := <-tm.fwdrAddReqTokenC():
Copy link
Contributor

@wxing1292 wxing1292 Jul 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

within this case should do another attempt to send task to tm.taskC before forwarding?

@@ -148,12 +246,27 @@ func (tm *TaskMatcher) Poll(ctx context.Context) (*internalTask, error) {
case <-ctx.Done():
tm.scope().IncCounter(metrics.PollTimeoutCounter)
return nil, ErrNoTasks
case token := <-tm.fwdrPollReqTokenC():
if task, err := tm.fwdr.ForwardPoll(ctx); err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before actually polling from root / remote, maybe another attempt of polling from tm.taskC and tm.queryTaskC?

Copy link
Contributor

@wxing1292 wxing1292 Jul 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically, i think polling from local should have higher priority, event if task manager has token to poll from remote, same applies to add tasks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did consider this and then decided to drop this idea for the first round and here is the reasoning:

Before entering the select{} block which has both local task channels and forwarderTokenC, we always first exclusively attempt to match only with local task channels i.e. we always prioritize local pollers in the first select{} block before entering the second. In a busy task list, the first select{} should almost always match against a local poller. On the other hand, when its not matched, the task list probably isn't all that busy and its ok to forward. Having said that, my plan is to observe the forwardedFrom / matching metrics from a real stress test and use that to decide if adding this additional nested select{} is worth it. Till then, I prefer to start simple.

@@ -162,19 +275,79 @@ func (tm *TaskMatcher) PollForQuery(ctx context.Context) (*internalTask, error)
case <-ctx.Done():
tm.scope().IncCounter(metrics.PollTimeoutCounter)
return nil, ErrNoTasks
case token := <-tm.fwdrPollReqTokenC():
if task, err := tm.fwdr.ForwardPoll(ctx); err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same above

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

matching: high throughput task list
3 participants