-
Notifications
You must be signed in to change notification settings - Fork 801
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite/improve basic load test #4399
Conversation
Pull Request Test Coverage Report for Build 8941da23-91f4-4b6b-a9fd-6019e635d004
💛 - Coveralls |
2aea8f0
to
54af362
Compare
d50a5bf
to
6eef524
Compare
6eef524
to
ec0cbde
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me.
bench/load/basic/launchWorkflow.go
Outdated
if passed { | ||
return finalResult, nil | ||
} | ||
return "", fmt.Errorf(finalResult) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: may want to change the finalResult to start with TEST FAILED
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay sounds good.
bench/load/basic/launchWorkflow.go
Outdated
return "", err | ||
} | ||
return result.String(), nil | ||
passed := (result.TimeoutCount + result.OpenCount + result.FailedCount) <= int(maxTolerantFailure) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For open workflow, shall we fail the test as long as the open count is non-zero (same as the old behavior)? As it typically means workflow got stuck due to lost transfer/timer task and is a very important issue.
Another possibility is that the timer processing latency becomes so high and Cadence can't even timeout the workflows within the specified wait duration (that's why I used 5min as the wait time buffer instead of 10s), in this case failing the bench also sounds reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. That makes sense. So the failure threshold should only be applied to timeout+failed.
|
||
if len(openWfs.Executions) > 0 { | ||
opens = len(openWfs.Executions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this means at least # of opens
workflows are still open and there may be more than that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This is due to the limitation of basic visibility. There is no count API so we have to use a pageSize as the limit. The page size here is the threshold+1 should it should be the same accurate as using count API with advanced visibility.
ec0cbde
to
0cdfbb7
Compare
0cdfbb7
to
0c23974
Compare
b6fa7b3
to
036106c
Compare
036106c
to
fbdc047
Compare
What changed?
Rewrite basic load test
Why?
How did you test it?
Tested locally
Potential risks
Release notes
Documentation Changes