
lifo filters, concept #1030

Merged: 33 commits into master from feature/lifoproxy, May 3, 2019
Conversation

@aryszka (Contributor) commented Apr 11, 2019

[WIP]

This PR integrates the jobstack solution to provide control over the maximum concurrency that Skipper is allowed to handle, primarily to protect against chatty clients and against slow backends acting as noisy neighbors. It's a form of load shedding.

This PR is currently only a concept.

The plan is to support the following independent alternatives:

  • filters that define route-specific LIFO stacks with custom configuration
  • filters that assign a route to a LIFO group, sharing a stack with globally defined configuration
  • a global, more permissive LIFO stack that protects the entire process, as a last resort

In a second phase, shared counters across the nodes of a Skipper cluster will also be considered.
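
To illustrate the first two alternatives, a hypothetical eskip sketch; the lifo arguments follow the load test below (max concurrency, max stack size, timeout), while the group filter's name and signature are assumptions:

// route-specific LIFO stack: at most 100 concurrent requests, 100 queued, 10s timeout
one: Path("/one") -> lifo(100, 100, "10s") -> "http://backend-1.example.org";

// two routes assigned to a shared LIFO group (filter name assumed for illustration)
two: Path("/two") -> lifoGroup("shared-group") -> "http://backend-2.example.org";
three: Path("/three") -> lifoGroup("shared-group") -> "http://backend-3.example.org";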

Signed-off-by: Arpad Ryszka arpad.ryszka@gmail.com

@aryszka aryszka added the wip work in progress label Apr 11, 2019
@lmineiro (Contributor):

Great that you had the chance to work on this. I'm very excited to see it working. Looks great; I just left some questions for my own understanding.

@szuecs (Member) commented Apr 12, 2019

@lmineiro I pushed the author to get this PR onto an origin branch so we can work on it next week. The state is an early draft. Thanks for all the points; I agree with the validation and default suggestions.
I want to start with load tests to find out how many requests, and how much backend latency, it takes to consume too much memory (let's measure how much memory that actually is, and how much you would need to survive right now).
After this PR, all of the identified weaknesses in memory usage should be reduced, measured before vs. after.

@szuecs (Member) commented Apr 18, 2019

TODO:

  • add tests
    - [ ] agree on defaults: no need, because we do not change them
  • check TODOs in code
  • add user docs
  • add godoc

@szuecs (Member) commented Apr 23, 2019

preparation

  • base load test with 500 req/s from pipeline to backends with 25s latency,
  • skipper-ingress with 500m/500Mi resources and default filter lifo(100,100,"10s")
  • vegeta load test from local laptop via wifi (50, 150, ..., 850, 900, 950, 1000 req/s)
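
The default filter was presumably set with the flag named in the commit messages further down; the exact invocation here is an assumption:

% skipper -default-filters-prepend='lifo(100, 100, "10s")'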

TLDR

The base load test did not increase memory significantly and did not interfere with the vegeta load test. The backend measured by the vegeta load test showed no increased latency.

data

skipper-ingress cluster metrics show the base 500 req/s and the added traffic from vegeta:
[image]

skipper-ingress cluster metrics show memory usage less than 100Mi:
[image]

application targeted by vegeta (without the 25s latency); metrics measured by skipper-ingress show <= 10ms p99:
[image]

vegeta:

[sszuecs@sandor-lab:~]% for j in 50 150 250
do
 echo "# test $j req/s $(date)"
 for i in {1..9}
 do
  echo "GET https://c5.kiac-test.teapot.zalan.do/" | vegeta attack -rate=$j -duration=10s -http2=false -timeout=0 | tee results.bin | vegeta report | grep -E 'Status|Latencies'
 done
 echo "# end test $j req/s $(date)"
done
# test 50 req/s Tue Apr 23 10:17:08 CEST 2019
Latencies     [mean, 50, 95, 99, max]  27.509091ms, 26.272664ms, 31.441964ms, 91.452012ms, 148.279446ms
Status Codes  [code:count]             200:500
Latencies     [mean, 50, 95, 99, max]  27.759488ms, 27.026486ms, 32.239353ms, 93.862021ms, 257.910469ms
Status Codes  [code:count]             200:500
Latencies     [mean, 50, 95, 99, max]  28.652469ms, 27.744911ms, 31.611815ms, 56.046175ms, 115.686755ms
Status Codes  [code:count]             200:500
Latencies     [mean, 50, 95, 99, max]  29.349225ms, 28.270068ms, 33.25205ms, 52.283292ms, 114.729115ms
Status Codes  [code:count]             200:500
Latencies     [mean, 50, 95, 99, max]  28.088164ms, 26.476439ms, 34.403384ms, 103.085709ms, 143.97103ms
Status Codes  [code:count]             200:500
Latencies     [mean, 50, 95, 99, max]  25.351571ms, 23.710749ms, 30.397731ms, 55.274229ms, 128.807129ms
Status Codes  [code:count]             200:500
Latencies     [mean, 50, 95, 99, max]  30.377361ms, 28.419999ms, 34.312319ms, 101.832357ms, 255.045514ms
Status Codes  [code:count]             200:500
Latencies     [mean, 50, 95, 99, max]  23.873398ms, 22.703581ms, 27.294905ms, 65.168173ms, 136.649426ms
Status Codes  [code:count]             200:500
Latencies     [mean, 50, 95, 99, max]  27.573549ms, 26.80266ms, 33.718836ms, 85.400304ms, 140.457505ms
Status Codes  [code:count]             200:500
# end test 50 req/s Tue Apr 23 10:18:38 CEST 2019
# test 150 req/s Tue Apr 23 10:18:38 CEST 2019
Latencies     [mean, 50, 95, 99, max]  25.864185ms, 24.251495ms, 30.810975ms, 65.568723ms, 120.88863ms
Status Codes  [code:count]             200:1500
Latencies     [mean, 50, 95, 99, max]  27.24251ms, 27.229943ms, 31.933878ms, 56.992336ms, 121.409478ms
Status Codes  [code:count]             200:1500
Latencies     [mean, 50, 95, 99, max]  30.57175ms, 28.343617ms, 37.09448ms, 101.064227ms, 125.769233ms
Status Codes  [code:count]             200:1500
Latencies     [mean, 50, 95, 99, max]  25.62186ms, 24.170859ms, 31.071277ms, 58.043497ms, 112.632533ms
Status Codes  [code:count]             200:1500
Latencies     [mean, 50, 95, 99, max]  27.958389ms, 27.036935ms, 33.543214ms, 92.391822ms, 260.166823ms
Status Codes  [code:count]             200:1500
Latencies     [mean, 50, 95, 99, max]  28.391695ms, 27.618897ms, 34.195364ms, 94.346584ms, 243.827596ms
Status Codes  [code:count]             200:1500
Latencies     [mean, 50, 95, 99, max]  24.928912ms, 23.407896ms, 29.714224ms, 63.557802ms, 130.105278ms
Status Codes  [code:count]             200:1500
Latencies     [mean, 50, 95, 99, max]  26.667832ms, 24.377749ms, 31.863654ms, 93.093963ms, 149.56386ms
Status Codes  [code:count]             200:1500
Latencies     [mean, 50, 95, 99, max]  27.171723ms, 27.077032ms, 32.496683ms, 57.032251ms, 128.309968ms
Status Codes  [code:count]             200:1500
# end test 150 req/s Tue Apr 23 10:20:08 CEST 2019
# test 250 req/s Tue Apr 23 10:20:08 CEST 2019
Latencies     [mean, 50, 95, 99, max]  27.21514ms, 25.579346ms, 32.461085ms, 100.12004ms, 136.427973ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  27.734187ms, 26.586376ms, 32.558974ms, 97.565095ms, 147.979382ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  27.229302ms, 26.211337ms, 32.095122ms, 98.896886ms, 262.390619ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  28.907639ms, 26.017907ms, 34.210968ms, 145.348842ms, 247.342614ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  26.116918ms, 26.027568ms, 30.637602ms, 54.298686ms, 117.477075ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  26.869161ms, 25.782458ms, 32.268238ms, 89.517039ms, 237.009233ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  38.176562ms, 26.382908ms, 121.795271ms, 216.967606ms, 261.552279ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  25.90491ms, 24.765441ms, 30.100033ms, 84.121546ms, 256.309818ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  27.721285ms, 26.760609ms, 32.505367ms, 95.326504ms, 135.102848ms
Status Codes  [code:count]             200:2500
# end test 250 req/s Tue Apr 23 10:21:39 CEST 2019
[sszuecs@sandor-lab:~]% for j in 350 450 550
do
 echo "# test $j req/s $(date)"
 for i in {1..9}
 do
  echo "GET https://c5.kiac-test.teapot.zalan.do/" | vegeta attack -rate=$j -duration=10s -http2=false -timeout=0 | tee results.bin | vegeta report | grep -E 'Status|Latencies'
 done
 echo "# end test $j req/s $(date)"
done
# test 350 req/s Tue Apr 23 10:26:27 CEST 2019
Latencies     [mean, 50, 95, 99, max]  26.852834ms, 24.907003ms, 31.399264ms, 110.4782ms, 206.443678ms
Status Codes  [code:count]             200:3500
Latencies     [mean, 50, 95, 99, max]  27.807203ms, 26.574109ms, 32.085095ms, 102.611549ms, 253.734828ms
Status Codes  [code:count]             200:3500
Latencies     [mean, 50, 95, 99, max]  27.306385ms, 26.286823ms, 31.991289ms, 93.163674ms, 255.950963ms
Status Codes  [code:count]             200:3500
Latencies     [mean, 50, 95, 99, max]  27.604912ms, 26.777363ms, 32.509305ms, 89.562445ms, 257.760867ms
Status Codes  [code:count]             200:3500
Latencies     [mean, 50, 95, 99, max]  26.796891ms, 26.195373ms, 31.398931ms, 81.554621ms, 154.188811ms
Status Codes  [code:count]             200:3500
Latencies     [mean, 50, 95, 99, max]  28.571694ms, 27.297174ms, 32.758353ms, 93.819631ms, 258.457029ms
Status Codes  [code:count]             200:3500
Latencies     [mean, 50, 95, 99, max]  24.603387ms, 23.064209ms, 29.848087ms, 63.030855ms, 128.461302ms
Status Codes  [code:count]             200:3500
Latencies     [mean, 50, 95, 99, max]  26.381174ms, 25.416535ms, 31.168549ms, 92.299417ms, 152.524308ms
Status Codes  [code:count]             200:3500
Latencies     [mean, 50, 95, 99, max]  24.804601ms, 24.077631ms, 29.57012ms, 57.910512ms, 118.303047ms
Status Codes  [code:count]             200:3500
# end test 350 req/s Tue Apr 23 10:27:57 CEST 2019
# test 450 req/s Tue Apr 23 10:27:57 CEST 2019
Latencies     [mean, 50, 95, 99, max]  24.680733ms, 23.436636ms, 29.680196ms, 60.229729ms, 127.374931ms
Status Codes  [code:count]             200:4500
Latencies     [mean, 50, 95, 99, max]  25.094368ms, 24.288992ms, 30.124453ms, 58.652895ms, 130.431566ms
Status Codes  [code:count]             200:4500
Latencies     [mean, 50, 95, 99, max]  24.828115ms, 23.369978ms, 30.37594ms, 67.077732ms, 137.944325ms
Status Codes  [code:count]             200:4500
Latencies     [mean, 50, 95, 99, max]  26.382262ms, 25.578487ms, 31.098096ms, 89.223173ms, 253.253455ms
Status Codes  [code:count]             200:4500
Latencies     [mean, 50, 95, 99, max]  26.164908ms, 25.474195ms, 31.619071ms, 74.305409ms, 238.097052ms
Status Codes  [code:count]             200:4500
Latencies     [mean, 50, 95, 99, max]  26.787759ms, 26.110796ms, 31.646366ms, 75.385009ms, 158.794251ms
Status Codes  [code:count]             200:4500
Latencies     [mean, 50, 95, 99, max]  26.58777ms, 25.653686ms, 31.43891ms, 85.006991ms, 152.688812ms
Status Codes  [code:count]             200:4500
Latencies     [mean, 50, 95, 99, max]  27.838947ms, 26.979718ms, 32.227845ms, 89.290933ms, 263.38933ms
Status Codes  [code:count]             200:4500
Latencies     [mean, 50, 95, 99, max]  27.524476ms, 26.953544ms, 32.451621ms, 75.293556ms, 236.434414ms
Status Codes  [code:count]             200:4500
# end test 450 req/s Tue Apr 23 10:29:28 CEST 2019
# test 550 req/s Tue Apr 23 10:29:28 CEST 2019
Latencies     [mean, 50, 95, 99, max]  27.51054ms, 26.407513ms, 32.188228ms, 101.602357ms, 172.392063ms
Status Codes  [code:count]             200:5500
Latencies     [mean, 50, 95, 99, max]  28.451865ms, 27.48191ms, 33.147651ms, 83.121951ms, 150.910952ms
Status Codes  [code:count]             200:5500
Latencies     [mean, 50, 95, 99, max]  28.17025ms, 26.823463ms, 32.760974ms, 103.779093ms, 250.883012ms
Status Codes  [code:count]             200:5500
Latencies     [mean, 50, 95, 99, max]  27.108292ms, 25.597932ms, 33.315119ms, 100.190616ms, 178.507611ms
Status Codes  [code:count]             200:5500
Latencies     [mean, 50, 95, 99, max]  27.13066ms, 25.68871ms, 34.469148ms, 99.157085ms, 160.574169ms
Status Codes  [code:count]             200:5500
Latencies     [mean, 50, 95, 99, max]  27.241566ms, 26.160575ms, 31.905952ms, 93.961ms, 256.260043ms
Status Codes  [code:count]             200:5500
Latencies     [mean, 50, 95, 99, max]  27.34202ms, 26.172733ms, 32.375855ms, 98.463065ms, 170.519149ms
Status Codes  [code:count]             200:5500
Latencies     [mean, 50, 95, 99, max]  26.957042ms, 26.236911ms, 32.211521ms, 69.851356ms, 145.126076ms
Status Codes  [code:count]             200:5500
Latencies     [mean, 50, 95, 99, max]  26.333854ms, 25.829944ms, 30.892166ms, 65.230291ms, 144.426863ms
Status Codes  [code:count]             200:5500
# end test 550 req/s Tue Apr 23 10:30:58 CEST 2019
[sszuecs@sandor-lab:~]% for j in 650 750 850
do
 echo "# test $j req/s $(date)"
 for i in {1..9}
 do
  echo "GET https://c5.kiac-test.teapot.zalan.do/" | vegeta attack -rate=$j -duration=10s -http2=false -timeout=0 | tee results.bin | vegeta report | grep -E 'Status|Latencies'
 done
 echo "# end test $j req/s $(date)"
done
# test 650 req/s Tue Apr 23 10:34:09 CEST 2019
Latencies     [mean, 50, 95, 99, max]  27.086589ms, 25.980388ms, 32.293826ms, 95.640732ms, 242.925309ms
Status Codes  [code:count]             200:6500
Latencies     [mean, 50, 95, 99, max]  28.82223ms, 26.225478ms, 33.781763ms, 165.415021ms, 265.293064ms
Status Codes  [code:count]             200:6500
Latencies     [mean, 50, 95, 99, max]  26.454516ms, 25.082523ms, 31.029502ms, 103.360059ms, 176.367071ms
Status Codes  [code:count]             200:6500
Latencies     [mean, 50, 95, 99, max]  27.238592ms, 26.329075ms, 31.392592ms, 96.484244ms, 242.718299ms
Status Codes  [code:count]             200:6500
Latencies     [mean, 50, 95, 99, max]  27.141026ms, 25.789433ms, 31.024515ms, 118.921627ms, 182.227019ms
Status Codes  [code:count]             200:6500
Latencies     [mean, 50, 95, 99, max]  28.042881ms, 26.152982ms, 35.744408ms, 107.56245ms, 256.858059ms
Status Codes  [code:count]             200:6500
Latencies     [mean, 50, 95, 99, max]  27.540809ms, 26.173909ms, 32.148865ms, 99.852474ms, 151.356092ms
Status Codes  [code:count]             200:6500
Latencies     [mean, 50, 95, 99, max]  27.201816ms, 25.990763ms, 31.334924ms, 114.234163ms, 179.401514ms
Status Codes  [code:count]             200:6500
Latencies     [mean, 50, 95, 99, max]  26.530544ms, 25.794299ms, 31.159333ms, 83.575521ms, 251.995342ms
Status Codes  [code:count]             200:6500
# end test 650 req/s Tue Apr 23 10:35:39 CEST 2019
# test 750 req/s Tue Apr 23 10:35:39 CEST 2019
Latencies     [mean, 50, 95, 99, max]  30.146361ms, 25.644698ms, 35.082501ms, 203.307196ms, 308.760753ms
Status Codes  [code:count]             200:7500
Latencies     [mean, 50, 95, 99, max]  28.845606ms, 26.570668ms, 34.998648ms, 145.439798ms, 206.689154ms
Status Codes  [code:count]             200:7500
Latencies     [mean, 50, 95, 99, max]  139.767804ms, 27.759386ms, 663.485101ms, 1.13827566s, 1.702939297s
Status Codes  [code:count]             200:7500
Latencies     [mean, 50, 95, 99, max]  31.23651ms, 26.360344ms, 41.577832ms, 204.73974ms, 281.173844ms
Status Codes  [code:count]             200:7500
Latencies     [mean, 50, 95, 99, max]  41.269255ms, 27.221529ms, 131.349805ms, 205.36654ms, 257.84477ms
Status Codes  [code:count]             200:7500
Latencies     [mean, 50, 95, 99, max]  26.618064ms, 24.981537ms, 31.810609ms, 118.745143ms, 194.832985ms
Status Codes  [code:count]             200:7500
Latencies     [mean, 50, 95, 99, max]  26.078685ms, 25.09607ms, 30.562253ms, 95.36462ms, 189.136986ms
Status Codes  [code:count]             200:7500
Latencies     [mean, 50, 95, 99, max]  26.165586ms, 25.43503ms, 31.201036ms, 85.454197ms, 143.440566ms
Status Codes  [code:count]             200:7500
Latencies     [mean, 50, 95, 99, max]  115.246956ms, 26.578968ms, 678.258875ms, 887.506199ms, 1.294788963s
Status Codes  [code:count]             200:7500
# end test 750 req/s Tue Apr 23 10:37:10 CEST 2019
# test 850 req/s Tue Apr 23 10:37:10 CEST 2019
Latencies     [mean, 50, 95, 99, max]  28.095462ms, 25.327102ms, 35.376016ms, 151.45994ms, 217.184514ms
Status Codes  [code:count]             200:8500
Latencies     [mean, 50, 95, 99, max]  28.399938ms, 24.338464ms, 32.945124ms, 184.820463ms, 275.565829ms
Status Codes  [code:count]             200:8500
Latencies     [mean, 50, 95, 99, max]  27.50693ms, 25.352411ms, 31.909018ms, 143.174872ms, 221.579238ms
Status Codes  [code:count]             200:8500
Latencies     [mean, 50, 95, 99, max]  29.026824ms, 25.523441ms, 34.208551ms, 185.16517ms, 292.28411ms
Status Codes  [code:count]             200:8500
Latencies     [mean, 50, 95, 99, max]  27.141196ms, 25.356627ms, 32.977195ms, 120.13355ms, 188.298489ms
Status Codes  [code:count]             200:8500
Latencies     [mean, 50, 95, 99, max]  145.351582ms, 26.948468ms, 1.038023413s, 1.270265625s, 1.595414656s
Status Codes  [code:count]             200:7983  0:513  502:4
Latencies     [mean, 50, 95, 99, max]  32.042352ms, 25.68103ms, 36.618391ms, 265.579619ms, 394.374198ms
Status Codes  [code:count]             200:8500
Latencies     [mean, 50, 95, 99, max]  27.437354ms, 25.524273ms, 32.477799ms, 128.516344ms, 202.031204ms
Status Codes  [code:count]             200:8500
Latencies     [mean, 50, 95, 99, max]  26.312567ms, 25.637989ms, 30.933869ms, 81.745789ms, 233.703611ms
Status Codes  [code:count]             200:8500
# end test 850 req/s Tue Apr 23 10:38:40 CEST 2019
[sszuecs@sandor-lab:~]% for j in 900 950 1000
do
 echo "# test $j req/s $(date)"
 for i in {1..9}
 do
  echo "GET https://c5.kiac-test.teapot.zalan.do/" | vegeta attack -rate=$j -duration=10s -http2=false -timeout=0 | tee results.bin | vegeta report | grep -E 'Status|Latencies'
 done
 echo "# end test $j req/s $(date)"
done
# test 900 req/s Tue Apr 23 10:43:11 CEST 2019
Latencies     [mean, 50, 95, 99, max]  28.182448ms, 25.501285ms, 35.828855ms, 150.396536ms, 236.802957ms
Status Codes  [code:count]             200:9000
Latencies     [mean, 50, 95, 99, max]  29.633283ms, 26.152959ms, 36.548131ms, 166.958578ms, 261.126324ms
Status Codes  [code:count]             200:9000
Latencies     [mean, 50, 95, 99, max]  26.644797ms, 24.146797ms, 32.853901ms, 128.752304ms, 198.507241ms
Status Codes  [code:count]             200:9000
Latencies     [mean, 50, 95, 99, max]  27.684091ms, 25.462631ms, 34.350871ms, 135.926743ms, 221.628575ms
Status Codes  [code:count]             200:9000
Latencies     [mean, 50, 95, 99, max]  30.619917ms, 25.96195ms, 38.125829ms, 212.608901ms, 309.509564ms
Status Codes  [code:count]             200:9000
Latencies     [mean, 50, 95, 99, max]  90.986372ms, 26.956556ms, 644.795129ms, 901.28893ms, 1.106722877s
Status Codes  [code:count]             200:8916  0:84
Latencies     [mean, 50, 95, 99, max]  29.902162ms, 26.826961ms, 34.563665ms, 159.453195ms, 264.734036ms
Status Codes  [code:count]             200:9000
Latencies     [mean, 50, 95, 99, max]  30.446174ms, 26.62884ms, 35.845388ms, 181.80153ms, 269.155365ms
Status Codes  [code:count]             200:9000
Latencies     [mean, 50, 95, 99, max]  31.818586ms, 26.464946ms, 42.203873ms, 227.908965ms, 326.206807ms
Status Codes  [code:count]             200:9000
# end test 900 req/s Tue Apr 23 10:44:41 CEST 2019
# test 950 req/s Tue Apr 23 10:44:41 CEST 2019
Latencies     [mean, 50, 95, 99, max]  31.896032ms, 26.553353ms, 43.750285ms, 204.089814ms, 312.587368ms
Status Codes  [code:count]             200:9500
Latencies     [mean, 50, 95, 99, max]  32.033819ms, 26.201264ms, 41.067517ms, 234.09885ms, 318.376389ms
Status Codes  [code:count]             200:9500
Latencies     [mean, 50, 95, 99, max]  71.045982ms, 26.537006ms, 461.735312ms, 671.412983ms, 879.738777ms
Status Codes  [code:count]             200:9500
Latencies     [mean, 50, 95, 99, max]  30.040151ms, 25.949099ms, 38.76952ms, 186.662858ms, 281.434643ms
Status Codes  [code:count]             200:9500
Latencies     [mean, 50, 95, 99, max]  30.330516ms, 26.191942ms, 36.842751ms, 196.695829ms, 305.570798ms
Status Codes  [code:count]             200:9500
Latencies     [mean, 50, 95, 99, max]  33.026528ms, 26.401044ms, 46.108379ms, 271.775564ms, 347.816895ms
Status Codes  [code:count]             200:9500
Latencies     [mean, 50, 95, 99, max]  29.807571ms, 24.96662ms, 36.307883ms, 216.853692ms, 298.556127ms
Status Codes  [code:count]             200:9500
Latencies     [mean, 50, 95, 99, max]  29.745279ms, 26.490637ms, 36.882833ms, 168.951207ms, 261.977956ms
Status Codes  [code:count]             200:9500
Latencies     [mean, 50, 95, 99, max]  30.554505ms, 25.885121ms, 40.672364ms, 206.081846ms, 291.273995ms
Status Codes  [code:count]             200:9500
# end test 950 req/s Tue Apr 23 10:46:12 CEST 2019
# test 1000 req/s Tue Apr 23 10:46:12 CEST 2019
Latencies     [mean, 50, 95, 99, max]  28.303201ms, 25.50111ms, 34.877417ms, 156.54403ms, 248.87644ms
Status Codes  [code:count]             200:10000
Latencies     [mean, 50, 95, 99, max]  41.25789ms, 27.115222ms, 136.448558ms, 202.093569ms, 259.237536ms
Status Codes  [code:count]             200:10000
Latencies     [mean, 50, 95, 99, max]  69.806356ms, 27.190468ms, 438.259011ms, 573.966793ms, 728.954981ms
Status Codes  [code:count]             200:10000
Latencies     [mean, 50, 95, 99, max]  30.729778ms, 26.472996ms, 47.175715ms, 181.039484ms, 266.517286ms
Status Codes  [code:count]             200:10000
Latencies     [mean, 50, 95, 99, max]  29.430659ms, 26.150193ms, 36.03364ms, 163.766877ms, 252.495039ms
Status Codes  [code:count]             200:10000
Latencies     [mean, 50, 95, 99, max]  31.438168ms, 25.686724ms, 41.223582ms, 223.215994ms, 315.236256ms
Status Codes  [code:count]             200:10000
Latencies     [mean, 50, 95, 99, max]  30.370666ms, 26.308648ms, 36.446953ms, 197.169314ms, 301.86511ms
Status Codes  [code:count]             200:10000
Latencies     [mean, 50, 95, 99, max]  33.907112ms, 26.780551ms, 58.768333ms, 255.972373ms, 316.570642ms
Status Codes  [code:count]             200:10000
Latencies     [mean, 50, 95, 99, max]  28.213983ms, 24.734725ms, 33.72431ms, 160.360832ms, 242.09359ms
Status Codes  [code:count]             200:10000
# end test 1000 req/s Tue Apr 23 10:47:42 CEST 2019

@szuecs szuecs added enhancement ready-for-review and removed wip work in progress labels Apr 26, 2019
aryszka and others added 8 commits May 2, 2019 11:10
Signed-off-by: Arpad Ryszka <arpad.ryszka@gmail.com>
Signed-off-by: Arpad Ryszka <arpad.ryszka@gmail.com>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
-
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
szuecs added 16 commits May 2, 2019 11:10
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
…ame, which we need for having a filter that can be used as default per route

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
…from -default-filters-prepend and -default-filters-append

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
…nfig

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
…rors

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
@szuecs (Member) commented May 2, 2019

I rebased on master

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
@szuecs (Member) commented May 2, 2019

@arjunrn I fixed all comments from your review.

	s = newStack(c)
	r.setStack(key, s)
} else if c != s.config { // UpdateDoc
	s.close()
Contributor:
This bit doesn't make sense to me. How are the currently pending requests in the old stack handled? Calling close() may quit the loop while there are still requests running.

Member:

While this is true, it only hits on a filter config change, which we do not want to do.
Maybe a documentation issue?
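
For reference, a minimal Go sketch of the get-or-replace pattern under discussion; all type and method names here are simplified assumptions for illustration, not the exact PR code:

package scheduler

import "sync"

// config is a simplified stand-in for the LIFO stack settings.
type config struct {
	maxConcurrency int
	maxStackSize   int
}

type stack struct {
	config config
	quit   chan struct{}
}

func newStack(c config) *stack {
	return &stack{config: c, quit: make(chan struct{})}
}

// close signals the stack's worker loop to stop; requests still queued
// in the old stack at this point are the concern raised above.
func (s *stack) close() {
	close(s.quit)
}

type registry struct {
	mu     sync.Mutex
	stacks map[string]*stack
}

func newRegistry() *registry {
	return &registry{stacks: map[string]*stack{}}
}

// getStack returns the stack for key, creating it on first use and
// replacing it (closing the old one) only when the config changed.
func (r *registry) getStack(key string, c config) *stack {
	r.mu.Lock()
	defer r.mu.Unlock()

	s, ok := r.stacks[key]
	if ok && c == s.config {
		return s // unchanged config: keep the running stack
	}
	if ok {
		s.close() // may abort requests pending in the old stack
	}
	s = newStack(c)
	r.stacks[key] = s
	return s
}

As noted, close() is only reached on a config change, which is expected to be rare.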

Member:

Test with some logs that are not part of the PR (logs in UTC):

[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 101 101 8000000000}}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 101 101 8000000000}}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="diff taken, inserts/updates: 2, deletes: 0"
[APP]time="2019-05-03T11:48:47Z" level=info msg="route settings, update, route: kube_default__sszuecs_demo_kiac_5__c5_kiac_: Host(/^c5[.]kiac-test[.]...$/) -> lifo(100, 100, \"8s\") -> <roundRobin, \"http://10.2.10.26:9090\", \"http://10.2.16.82:9090\", \"http://10.2.17.92:9090\", \"http://10.2.20.43:9090\", \"http://10.2.4.252:9090\">"
[APP]time="2019-05-03T11:48:47Z" level=info msg="route settings, update, route: kubeew_default__sszuecs_demo_kiac_5__c5_kiac_test: Host(/^sszuecs-demo-kiac-5[.]default[.]ingress[.]cluster[.]local$/) -> lifo(100, 100, \"8s\") -> <roundRobin, \"http://10.2.10.26:9090\", \"http://10.2.16.82:9090\", \"http://10.2.17.92:9090\", \"http://10.2.20.43:9090\", \"http://10.2.4.252:9090\">"
[APP]time="2019-05-03T11:48:47Z" level=info msg="route settings received"
[APP]time="2019-05-03T11:48:47Z" level=info msg="filterRoutes incoming=63 outgoing=63"
[APP]time="2019-05-03T11:48:47Z" level=info msg="set config to: { 100 100 8s}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="set config to: { 100 100 8s}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="route settings applied"
[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 100 100 8000000000}}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 100 100 8000000000}}"
[APP]time="2019-05-03T11:48:47Z" level=info msg="lifo filter config: &{0xc000258840 { 100 100 8000000000}}"

and the load test, which does not show any errors from the client:

# test 250 req/s Fri May  3 13:45:56 CEST 2019
Latencies     [mean, 50, 95, 99, max]  19.840703ms, 17.373569ms, 19.444199ms, 133.260754ms, 228.299662ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  17.769876ms, 17.132646ms, 18.611872ms, 29.94731ms, 108.44734ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  17.977478ms, 17.324209ms, 18.770338ms, 27.872741ms, 240.116582ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  17.828503ms, 17.265959ms, 18.746125ms, 28.884823ms, 104.983894ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  17.803654ms, 17.218953ms, 18.360783ms, 32.098065ms, 106.948886ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  17.717137ms, 17.186123ms, 18.371447ms, 23.453019ms, 102.757632ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  18.45423ms, 17.838747ms, 19.548307ms, 35.793498ms, 102.979901ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  19.991598ms, 18.063765ms, 19.874679ms, 113.721727ms, 199.066109ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  18.695232ms, 18.02269ms, 19.146417ms, 39.549053ms, 109.470087ms
Status Codes  [code:count]             200:2500
# end test 250 req/s Fri May  3 13:47:26 CEST 2019
# test 250 req/s Fri May  3 13:47:26 CEST 2019
Latencies     [mean, 50, 95, 99, max]  18.6704ms, 18.099258ms, 19.659676ms, 30.7674ms, 103.966163ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  19.030097ms, 18.089099ms, 20.179344ms, 52.046475ms, 230.69458ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  18.789092ms, 18.029326ms, 19.522121ms, 48.415269ms, 102.060311ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  18.600886ms, 18.050402ms, 19.039842ms, 26.858217ms, 105.012759ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  18.369626ms, 17.674826ms, 19.609541ms, 39.967571ms, 105.498709ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  19.221207ms, 17.664274ms, 27.470963ms, 48.486916ms, 233.945102ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  20.205234ms, 17.774965ms, 27.417457ms, 94.604069ms, 187.215304ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  19.629634ms, 17.786002ms, 30.27717ms, 38.633115ms, 102.99997ms
Status Codes  [code:count]             200:2500
Latencies     [mean, 50, 95, 99, max]  20.363497ms, 17.770527ms, 28.553207ms, 87.269158ms, 241.359971ms
Status Codes  [code:count]             200:2500
# end test 250 req/s Fri May  3 13:48:56 CEST 2019

szuecs added 2 commits May 2, 2019 16:13
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>

delegate MoveTo to stack from jobqueue

Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
Signed-off-by: Sandor Szücs <sandor.szuecs@zalando.de>
@szuecs (Member) commented May 3, 2019

👍

@arjunrn (Contributor) commented May 3, 2019

👍

@szuecs szuecs merged commit c50a653 into master May 3, 2019
@szuecs szuecs deleted the feature/lifoproxy branch May 3, 2019 14:27