Gateload parity / investigate longer PushToSubscriberInteractive round trips #56
With these sgload params:
I'm seeing long gateload roundtrip times:
Couchbase bucket stats (post-run): |
TODO:
|
Gateload behavior for 5k / 5k ( 2 accel / 2 sg ) ignoring rampup:
TODO:
|
I'm running into a blocker if I try to do a 5k/5k test:
it's giving me this error:
and I can see in the scoop logs that user creation is taking upwards of 2 mins: |
Seeing view warnings in sync gateway logs:
|
sgload is basically trying to create 10K users in parallel, whereas gateload created them in serial up-front (if I'm remembering correctly) |
You're right - gateload staggers the user creation - and I'm assuming this is the reason why. I guess you can stagger user creation in sgload for now, but we should think about escalating couchbase/sync_gateway#799 post-1.4, even if only to help our own testing. |
Possible cause: continuous changes feeds vs longpoll changes feeds. Easiest way to vet this would be to run gateload in longpoll mode and see if there are any differences in round trip times. |
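For reference, the two feed modes differ only in the `feed` query parameter on the `_changes` request. A small sketch of building both URLs; the host, port, and database name (`localhost:4984`, `db`) and the `changesURL` helper are assumptions for illustration:

```go
package main

import (
	"fmt"
	"net/url"
)

// changesURL builds a _changes request URL for the given feed mode
// ("longpoll" or "continuous"). Hypothetical helper, not from sgload.
func changesURL(base, db, feed, since string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/" + db + "/_changes"
	q := url.Values{}
	q.Set("feed", feed)
	if since != "" {
		q.Set("since", since)
	}
	u.RawQuery = q.Encode() // keys are emitted in sorted order
	return u.String(), nil
}

func main() {
	for _, feed := range []string{"longpoll", "continuous"} {
		s, err := changesURL("http://localhost:4984", "db", feed, "42")
		if err != nil {
			panic(err)
		}
		fmt.Println(s)
	}
}
```

With longpoll the client issues a new request after every response, while with continuous the connection stays open, so the two modes put different load on the changes machinery.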
In perf run 308, I'm noticing higher memory usage on Sync Gateway than on what is supposed to be an equivalent gateload run: |
I'm running against the perf cluster created in perf run 311, which is using SG commit couchbase/sync_gateway@adedf1a, and seeing the following errors in the Sync Gateway logs:
and the corresponding retry warnings in the sgload logs:
With the current retry settings: https://github.com/hashicorp/go-retryablehttp/blob/master/client.go#L33-L35
and backoff logic: https://github.com/hashicorp/go-retryablehttp/blob/master/client.go#L295-L302
since it's on its 4th retry ( 1s --> 15 seconds total ), which would lead to long gateload roundtrip times.
Here is a snippet from the sgload logs, which shows some gateload round trip times apparently taking up to 6 minutes 50 seconds: https://gist.github.com/tleyden/e5e6daec49939f8b1cd2e2769d89c156#file-gistfile1-txt-L763-L790
I'm running sgload on the command line with the following settings:
|
I haven't been able to repro the issues in #56 (comment) unfortunately.
I'm seeing another issue though that I want to document while it's happening.
Steps to repro:
The sgload writers all finished, however the readers are basically "stuck" and not getting new docs. The expvars have been stuck at:
for several hours. When I run an ngrep, I'm not seeing anything except: https://gist.github.com/tleyden/3067b330409c1f91392148d2e6658f54 (and I'm not sure what to make of it) |
Possible problems:
Diagnoses:
|
TODO: Add logkey to Sync Gateway called NOTE: log * doesn't enable it |
I added a 5 minute timeout for all requests in sgload, specifically targeting In perf run 317, I'm seeing lots of sgload errors like:
In the Sync Gateway logs I picked out a random
Based on that since value, I used Postman (a GUI curl-like client) to send this request:
with credentials The postman UI hung for about 1-2 minutes before I finally cancelled it. Then I changed the request parameters from https://gist.github.com/tleyden/e3267144b619932a8fc8101ba2b6953c I was able to repeat the same thing several times, and it would block indefinitely with the |
I'm also seeing the same error as before in the Sync Gateway logs:
|
Focusing in on one of the
and retried that via curl with the credentials for reader-user-1919-testrunid and the same _changes parameter, and got a response: https://gist.github.com/tleyden/bdb353e2760c2e4e426343bc193e755f I looked for previous _changes requests for that user and found:
I tried re-running the first one with https://gist.github.com/tleyden/00b3779669b3a572af2281f170133985 and was surprised not to see
I also tried the changes request with no https://gist.github.com/tleyden/cdb5bcb90c5a1a1bb60dd0377aa1ddd3 and was surprised not to see either |
For the stuck changes request mentioned earlier:
with credentials username: I enabled
|
With
|
Note to self: gateload test that shows high pushtosubscriber times http://uberjenkins.sc.couchbase.com/view/Performance/job/sync-gateway-perf-test/328/parameters/ |
It's worth looking into why https://github.com/couchbaselabs/sync-gateway-accel/issues/52 is easily reproducible by gateload but not by sgload |
couchbaselabs/mobile-testkit#205
3sg 2 accel
5k / 5k