Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite/improve basic load test #4399

Merged
merged 21 commits into from
Aug 27, 2021
Merged

Rewrite/improve basic load test #4399

merged 21 commits into from
Aug 27, 2021

Conversation

longquanzheng
Copy link
Collaborator

@longquanzheng longquanzheng commented Aug 20, 2021

What changed?
Rewrite basic load test

  1. Add more documentation
  2. Use activity retry to launch
  3. Use deterministic workflowID for stressWorkflow
  4. Improve verification

Why?

How did you test it?
Tested locally

(qlong-bench-imp) $cadence --do cadence-bench  wf ob -w 45edc243-9eb5-4774-b42f-196849759644
Progress:
  1, 2021-08-20T15:14:15-07:00, WorkflowExecutionStarted
  2, 2021-08-20T15:14:15-07:00, DecisionTaskScheduled
  3, 2021-08-20T15:14:15-07:00, DecisionTaskStarted
  4, 2021-08-20T15:14:15-07:00, DecisionTaskCompleted
  5, 2021-08-20T15:14:15-07:00, ActivityTaskScheduled
  6, 2021-08-20T15:14:15-07:00, ActivityTaskStarted
  7, 2021-08-20T15:14:25-07:00, ActivityTaskCompleted
  8, 2021-08-20T15:14:25-07:00, DecisionTaskScheduled
  9, 2021-08-20T15:14:25-07:00, DecisionTaskStarted
  10, 2021-08-20T15:14:25-07:00, DecisionTaskCompleted
  11, 2021-08-20T15:14:25-07:00, TimerStarted
  12, 2021-08-20T15:14:45-07:00, TimerFired
  13, 2021-08-20T15:14:45-07:00, DecisionTaskScheduled
  14, 2021-08-20T15:14:45-07:00, DecisionTaskStarted
  15, 2021-08-20T15:14:45-07:00, DecisionTaskCompleted
  16, 2021-08-20T15:14:45-07:00, ActivityTaskScheduled
  17, 2021-08-20T15:14:45-07:00, ActivityTaskStarted
  18, 2021-08-20T15:14:45-07:00, ActivityTaskCompleted
  19, 2021-08-20T15:14:45-07:00, DecisionTaskScheduled
  20, 2021-08-20T15:14:45-07:00, DecisionTaskStarted
  21, 2021-08-20T15:14:45-07:00, DecisionTaskCompleted
  22, 2021-08-20T15:14:45-07:00, WorkflowExecutionCompleted

Result:
  Run Time: 1 seconds
  Status: COMPLETED
  Output: "TEST PASSED: true; Details report: timeoutCount: 0, failedCount: 0, openCount:0, launchCount: 100, maxThreshold:1"

Potential risks

Release notes

Documentation Changes

@longquanzheng longquanzheng changed the title Rewrite basic load test Rewrite/improve basic load test Aug 20, 2021
@coveralls
Copy link

coveralls commented Aug 20, 2021

Pull Request Test Coverage Report for Build 8941da23-91f4-4b6b-a9fd-6019e635d004

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 37 unchanged lines in 8 files lost coverage.
  • Overall coverage decreased (-0.02%) to 56.401%

Files with Coverage Reduction New Missed Lines %
common/task/weightedRoundRobinTaskScheduler.go 1 89.64%
common/types/mapper/thrift/shared.go 2 64.98%
service/history/execution/mutable_state_builder.go 2 69.86%
service/history/queue/timer_queue_processor.go 2 58.77%
service/history/queue/timer_gate.go 3 95.83%
service/history/queue/timer_queue_processor_base.go 5 78.56%
service/history/execution/context.go 6 68.54%
common/persistence/sql/sqlExecutionStore.go 16 60.0%
Totals Coverage Status
Change from base Build 78be6221-be34-442c-8a27-540400fd920d: -0.02%
Covered Lines: 79085
Relevant Lines: 140219

💛 - Coveralls

@meiliang86 meiliang86 requested a review from a team August 23, 2021 05:02
bench/load/basic/launchWorkflow.go Show resolved Hide resolved
bench/load/basic/launchWorkflow.go Outdated Show resolved Hide resolved
bench/load/basic/launchWorkflow.go Show resolved Hide resolved
Copy link
Contributor

@yycptt yycptt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me.

if passed {
return finalResult, nil
}
return "", fmt.Errorf(finalResult)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: may want to change the finalResult to start with TEST FAILED here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay sounds good.

return "", err
}
return result.String(), nil
passed := (result.TimeoutCount + result.OpenCount + result.FailedCount) <= int(maxTolerantFailure)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For open workflow, shall we fail the test as long as the open count is non-zero (same as the old behavior)? As it typically means workflow got stuck due to lost transfer/timer task and is a very important issue.
Another possibility is that the timer processing latency becomes so high and Cadence can't even timeout the workflows within the specified wait duration (that's why I used 5min as the wait time buffer instead of 10s), in this case failing the bench also sounds reasonable to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. That makes sense. So the failure threshold should only be applied to timeout+failed.


if len(openWfs.Executions) > 0 {
opens = len(openWfs.Executions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this means at least # of opens workflows are still open and there may be more than that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is due to the limitation of basic visibility. There is no count API so we have to use a pageSize as the limit. The page size here is the threshold+1 should it should be the same accurate as using count API with advanced visibility.

@longquanzheng longquanzheng merged commit 0b98055 into master Aug 27, 2021
@longquanzheng longquanzheng deleted the qlong-bench-imp branch August 27, 2021 23:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants