lifo filters, concept #1030
Conversation
Great that you had the chance to work on this. I'm very excited to see it working. Looks great; I just left some questions for my own understanding.
@lmineiro I pushed the author to get this PR, or a branch on origin, ready so that we can work on it next week. The state is an early draft. Thanks for all the points; I agree with the validation and the default suggestion.
Force-pushed from 3b00ea3 to c696331.
Force-pushed from 19a8b69 to 46133cc.
TODO:

preparation

TL;DR: The base load test was not able to increase memory significantly and did not interfere with the vegeta load test. The vegeta load test measured a backend which had no increased latency.

Data:
- skipper-ingress cluster metrics show the base 500 req/s and the added traffic from vegeta: [metrics screenshot]
- the application targeted by vegeta, without the 25s latency: [metrics screenshot]
- metrics measured by skipper-ingress show vegeta: [metrics screenshot]
Force-pushed from 32ad2d6 to 1f10da5.
Commits (signed off by Arpad Ryszka <arpad.ryszka@gmail.com> and Sandor Szücs <sandor.szuecs@zalando.de>), with messages truncated in the timeline:
…ame, which we need for having a filter that can be used as default per route
…from -default-filters-prepend and -default-filters-append
…nfig
…rors
I rebased on master.
@arjunrn I addressed all comments from your review.
scheduler/scheduler.go (outdated):

	s = newStack(c)
	r.setStack(key, s)
} else if c != s.config { // UpdateDoc
	s.close()
This bit doesn't make sense to me. How are the currently pending requests in the old stack handled? Calling close() may quit the loop while there are requests still running.
This is what I'm referring to: https://github.com/aryszka/jobqueue/blob/master/jobstack.go#L117
While this is true, it only hits on a filter config change, which we do not intend to do often.
Maybe it's a documentation issue?
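For context, here is a minimal, self-contained sketch of the swap-on-config-change pattern the snippet above shows. The types (config, stack, registry) are hypothetical stand-ins, not the PR's or jobqueue's actual API; the point is the close() call on the old stack that the comments above question:

```go
package main

import (
	"sync"
	"time"
)

// Hypothetical stand-ins for the types in the snippet above; not the
// actual PR or jobqueue API.
type config struct {
	maxConcurrency int
	maxStackSize   int
	timeout        time.Duration
}

type stack struct {
	config config
	quit   chan struct{}
}

func newStack(c config) *stack { return &stack{config: c, quit: make(chan struct{})} }

// close stops the stack's dispatch loop. If the loop quits while requests
// are still queued or in flight, those requests are left behind; this is
// the concern raised in the review comments above.
func (s *stack) close() { close(s.quit) }

type registry struct {
	mu     sync.Mutex
	stacks map[string]*stack
}

// get returns the stack for key, creating it on first use and replacing
// it when the configuration changed (e.g. on a routing document update).
func (r *registry) get(key string, c config) *stack {
	r.mu.Lock()
	defer r.mu.Unlock()

	s, ok := r.stacks[key]
	if !ok {
		s = newStack(c)
		r.stacks[key] = s
	} else if c != s.config { // UpdateDoc: config changed, swap the stack
		s.close() // what happens to requests pending in the old stack?
		s = newStack(c)
		r.stacks[key] = s
	}

	return s
}
```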
A test with some logs that are not part of the PR (logs in UTC):
[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 101 101 8000000000}}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 101 101 8000000000}}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="diff taken, inserts/updates: 2, deletes: 0"
[APP]time="2019-05-03T11:48:47Z" level=info msg="route settings, update, route: kube_default__sszuecs_demo_kiac_5__c5_k
iac_: Host(/^c5[.]kiac-test[.]...$/) -> lifo(100, 100, \"8s\") -> <roundRobin, \"http://10.2.10.26:9090\", \"http://10.2.16.82:9090\", \"http://10.2.17.92:9090\", \"http://10.2.20.43:9090\", \"http://10.2.4.252:9090\">"
[APP]time="2019-05-03T11:48:47Z" level=info msg="route settings, update, route: kubeew_default__sszuecs_demo_kiac_5__c5_kiac_test: Host(/^sszuecs-demo-kiac-5[.]default[.]ingress[.]cluster[.]local$/) -> lifo(100, 100, \"8s\") -> <roundRobin, \"http://10.2.10.26:9090\", \"http://10.2.16.82:9090\", \"http://10.2.17.92:9090\", \"http://10.2.20.43:9090\", \"http://10.2.4.252:9090\">"
[APP]time="2019-05-03T11:48:47Z" level=info msg="route settings received"
[APP]time="2019-05-03T11:48:47Z" level=info msg="filterRoutes incoming=63 outgoing=63"
[APP]time="2019-05-03T11:48:47Z" level=info msg="set config to: { 100 100 8s}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="set config to: { 100 100 8s}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="route settings applied"
[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 100 100 8000000000}}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 100 100 8000000000}}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 100 100 8000000000}}"
and the load test, which does not show any errors from the client:
# test 250 req/s Fri May 3 13:45:56 CEST 2019
Latencies [mean, 50, 95, 99, max] 19.840703ms, 17.373569ms, 19.444199ms, 133.260754ms, 228.299662ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 17.769876ms, 17.132646ms, 18.611872ms, 29.94731ms, 108.44734ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 17.977478ms, 17.324209ms, 18.770338ms, 27.872741ms, 240.116582ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 17.828503ms, 17.265959ms, 18.746125ms, 28.884823ms, 104.983894ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 17.803654ms, 17.218953ms, 18.360783ms, 32.098065ms, 106.948886ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 17.717137ms, 17.186123ms, 18.371447ms, 23.453019ms, 102.757632ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 18.45423ms, 17.838747ms, 19.548307ms, 35.793498ms, 102.979901ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 19.991598ms, 18.063765ms, 19.874679ms, 113.721727ms, 199.066109ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 18.695232ms, 18.02269ms, 19.146417ms, 39.549053ms, 109.470087ms
Status Codes [code:count] 200:2500
# end test 250 req/s Fri May 3 13:47:26 CEST 2019
# test 250 req/s Fri May 3 13:47:26 CEST 2019
Latencies [mean, 50, 95, 99, max] 18.6704ms, 18.099258ms, 19.659676ms, 30.7674ms, 103.966163ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 19.030097ms, 18.089099ms, 20.179344ms, 52.046475ms, 230.69458ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 18.789092ms, 18.029326ms, 19.522121ms, 48.415269ms, 102.060311ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 18.600886ms, 18.050402ms, 19.039842ms, 26.858217ms, 105.012759ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 18.369626ms, 17.674826ms, 19.609541ms, 39.967571ms, 105.498709ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 19.221207ms, 17.664274ms, 27.470963ms, 48.486916ms, 233.945102ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 20.205234ms, 17.774965ms, 27.417457ms, 94.604069ms, 187.215304ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 19.629634ms, 17.786002ms, 30.27717ms, 38.633115ms, 102.99997ms
Status Codes [code:count] 200:2500
Latencies [mean, 50, 95, 99, max] 20.363497ms, 17.770527ms, 28.553207ms, 87.269158ms, 241.359971ms
Status Codes [code:count] 200:2500
# end test 250 req/s Fri May 3 13:48:56 CEST 2019
Commit: delegate MoveTo to stack from jobqueue (Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>)
👍
[WIP]
This PR integrates the jobstack solution from https://github.com/aryszka/jobqueue to provide control over the maximum concurrency that skipper is allowed to handle, primarily to protect against chatty clients and against slow backends acting as noisy neighbors. It's a form of load shedding.
This PR is currently only a concept.
The plan is to support the following independent alternatives: applying the lifo filter per route, and applying it as a default filter for all routes via -default-filters-prepend and -default-filters-append.
In a second phase, shared counters across the nodes in a Skipper cluster will also be considered.
Signed-off-by: Arpad Ryszka <arpad.ryszka@gmail.com>
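To illustrate the load-shedding concept described above, here is a minimal sketch of a bounded-concurrency LIFO stack. It is an illustration under assumptions, not the jobqueue library's actual implementation or API; the parameters mirror the lifo(maxConcurrency, maxStackSize, timeout) arguments visible in the test logs above:

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// ErrShed is returned when a request is rejected by the stack.
var ErrShed = errors.New("request shed")

// lifoStack is an illustrative bounded-concurrency LIFO queue. At most
// maxConcurrency requests run at once; waiting requests are served
// newest-first, and a waiter is shed when the stack is full or its
// timeout expires.
type lifoStack struct {
	mu             sync.Mutex
	active         int
	maxConcurrency int
	maxStackSize   int
	timeout        time.Duration
	waiters        []chan bool // LIFO: the newest waiter is at the end
}

// wait blocks until the caller may proceed, or returns ErrShed.
func (s *lifoStack) wait() error {
	s.mu.Lock()
	if s.active < s.maxConcurrency {
		s.active++
		s.mu.Unlock()
		return nil
	}

	if len(s.waiters) >= s.maxStackSize {
		// The stack is full: shed the oldest waiter to make room,
		// favoring fresh requests over ones that have waited long.
		oldest := s.waiters[0]
		s.waiters = s.waiters[1:]
		oldest <- false
	}

	ready := make(chan bool, 1)
	s.waiters = append(s.waiters, ready)
	s.mu.Unlock()

	select {
	case ok := <-ready:
		if !ok {
			return ErrShed
		}
		return nil
	case <-time.After(s.timeout):
		s.mu.Lock()
		for i, w := range s.waiters {
			if w == ready {
				// Still waiting: remove ourselves and give up.
				s.waiters = append(s.waiters[:i], s.waiters[i+1:]...)
				s.mu.Unlock()
				return ErrShed
			}
		}
		s.mu.Unlock()
		// A slot was granted concurrently with the timeout: return it.
		if <-ready {
			s.done()
		}
		return ErrShed
	}
}

// done releases a slot, handing it to the most recent waiter if any.
func (s *lifoStack) done() {
	s.mu.Lock()
	defer s.mu.Unlock()

	if n := len(s.waiters); n > 0 {
		newest := s.waiters[n-1]
		s.waiters = s.waiters[:n-1]
		newest <- true // the slot transfers; active stays unchanged
		return
	}
	s.active--
}
```

A caller would wrap request handling in wait()/done(), serving a 503 when wait() returns ErrShed; the LIFO order means that under overload the freshest requests, whose clients are most likely still waiting, get served first.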