Etcd watch stream starvation under high read response load when sharing same connection and TLS is enabled #15402
Comments
Problem definition
There are two issues we want to address related to etcd endpoint setup:

Watch starvation
Caused by running the grpc server under the http server, which is affected by golang/go#58804. For the watch starvation we can:

Running grpc under http server
Serving a grpc server under an http server is an experimental API with different performance and features. To avoid running grpc under the http server we can:
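For illustration only, one way to avoid the experimental grpc-under-http path is to give gRPC its own listener, so gRPC and HTTP traffic no longer share a single HTTP/2 connection and write scheduler. The sketch below uses made-up ports, omits TLS and service registration, and is not etcd's actual proposal or setup:

```go
// Illustrative only: made-up ports, no TLS and no real service registration.
package main

import (
	"log"
	"net"
	"net/http"

	"google.golang.org/grpc"
)

func main() {
	grpcServer := grpc.NewServer() // register gRPC services here
	httpMux := http.NewServeMux()  // register /metrics, /health, etc. here

	// Instead of routing gRPC through the HTTP server (grpcServer.ServeHTTP),
	// give each protocol its own listener so they no longer share one
	// HTTP/2 connection and write scheduler.
	grpcLis, err := net.Listen("tcp", ":2379")
	if err != nil {
		log.Fatal(err)
	}
	httpLis, err := net.Listen("tcp", ":2378")
	if err != nil {
		log.Fatal(err)
	}

	go func() { log.Fatal(http.Serve(httpLis, httpMux)) }()
	log.Fatal(grpcServer.Serve(grpcLis))
}
```

The obvious trade-off is that clients and tooling then need to reach the non-gRPC endpoints on a separate port.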
Requirements
Proposal
Implement changes and execute backports in the order below:
Motivation
Background
List of etcd http paths:
EDIT 1: based on @ptabor's feedback, I removed
Hi @serathius, is it possible to maintain a round-robin write scheduler? I think it would be good for those metrics, since latency becomes more predictable given the max concurrent streams limit. The write scheduler runs without a lock, so maybe it could be easier.
Prefer to avoid backporting untested code, especially if it would be written by me :P. I would wait for grpc experts to implement it and test it.
LGTM. I would also consider:
Discussed @fuweid's idea with @mborsz. Implementing it based on the Random scheduler should be much simpler than fixing golang/go#58804, which would require modifying the priority scheduler. I will try to implement round-robin scheduling. Thanks for the suggestion @fuweid
@serathius thanks!!!
A quick look into the implementation of the random write scheduler shows that it isn't so random :P. It prioritizes writing control frames, which is also done by the priority scheduler. Unfortunately the method to check whether a frame is a control frame is not public https://go.googlesource.com/net/+/master/http2/writesched_random.go#48 (thanks Go), meaning that a custom implementation of the scheduler will already be subpar. I can implement a scheduler that treats all frames the same, however I'm not an expert in http2, so it's hard to guess what it could break. EDIT: Or we can guess what it breaks. https://go.googlesource.com/net/+/04296fa82e83b85317bd93ad50dd00460d6d7940%5E%21/http2/writesched_random.go. Nice to see familiar people. @aojea would love to get your thoughts on this issue.
IIUIC the problem here is with DATA frames; those are the ones that can cause starvation. Control frames do not carry much data, they are used for signaling, and per the RFC they MUST NOT be considered for flow control https://www.rfc-editor.org/rfc/rfc7540
One interesting thing is that if you were using the http2 implementation from golang.org/x/net instead of the one in the standard library, you probably were using the Random scheduler until last year https://go-review.googlesource.com/c/net/+/382015. Based on this comment from Brad Fitzpatrick https://go-review.googlesource.com/c/net/+/382015/comments/16b761d4_4c27bf6f and the fact that the problem here is DATA frames, I'm +1 on the Random scheduler if it demonstrably solves the problem.
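For context on what switching to the Random scheduler means mechanically, here is a minimal sketch, assuming a plain net/http server configured through golang.org/x/net/http2; the address, handler, and certificate paths are illustrative, and this is not etcd's actual wiring:

```go
// Illustrative only: address, handler, and certificate paths are made up.
package main

import (
	"log"
	"net/http"

	"golang.org/x/net/http2"
)

func main() {
	srv := &http.Server{Addr: ":8443", Handler: http.DefaultServeMux}

	// Plug the random write scheduler into the x/net HTTP/2 server instead of
	// the default one. NewRandomWriteScheduler already matches the expected
	// func() http2.WriteScheduler signature.
	if err := http2.ConfigureServer(srv, &http2.Server{
		NewWriteScheduler: http2.NewRandomWriteScheduler,
	}); err != nil {
		log.Fatal(err)
	}

	log.Fatal(srv.ListenAndServeTLS("server.crt", "server.key"))
}
```

When NewWriteScheduler is left nil the package falls back to its default scheduler, which at the time of this discussion was the priority scheduler; the upstream fix for golang/go#58804 subsequently changed that default.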
Returning to the issue as the fix in golang was made available.
I think the last question is: should/can we backport the golang.org/x/net bump and change the scheduler on old releases?
Fixes performance issue under load, ref: etcd-io/etcd#15402 and kubernetes/kubernetes#118460 Signed-off-by: Brad Davidson <brad.davidson@rancher.com> (cherry picked from commit 8c73fd6) Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Closing as the fix is now available by default on all supported minor versions.
Upstream etcd has seen issues with grpc muxed with http and recommends against it. Ref: etcd-io/etcd#15402 Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
What happened?
When an etcd client generates high read response load, the watch response stream on the same connection can be starved. For example, a client with an open watch that runs 10 concurrent Range requests, each returning around 10 MB, might see the watch starved and not receive any response for tens of seconds.
The problem does not occur when TLS is not enabled, nor when the watch is created on a separate client/connection.
This also affects K8s (any version), as Kubernetes has one client per resource. The problem triggers when a single K8s resource has a lot of data and 10+ concurrent LIST requests for the same resource are sent to the apiserver, for example 10k pods. This can cause serious correctness issues in K8s, like controllers not doing any work because they depend on watch to get updates, for example the scheduler not scheduling any pods.
We tested and confirmed that all v3.4+ versions are affected.
The issue affects any watch response type:
What did you expect to happen?
The watch stream should never get blocked, no matter the load.
How can we reproduce it (as minimally and precisely as possible)?
Repro for progress notify:
Start etcd with TLS and progress notify, for example:
Insert a lot of data into etcd, for example by running the command below a couple of times.
Run a watch client program (a minimal sketch follows these steps).
Instead of getting logs like:
We get logs like:
And next notify response never comes.
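For reference, a minimal client reproducing the pattern described above might look like the sketch below. It assumes a local TLS-enabled etcd on https://127.0.0.1:2379 started with progress notify enabled (e.g. --experimental-watch-progress-notify-interval=5s); the endpoint, key prefix, and request count are illustrative, and this is not the exact program used in the original report:

```go
// Hypothetical minimal reproducer: one client, one watch, plus heavy
// concurrent Range load on the same TLS connection.
package main

import (
	"context"
	"crypto/tls"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://127.0.0.1:2379"},
		DialTimeout: 5 * time.Second,
		// InsecureSkipVerify only to keep the sketch short; use real certs in practice.
		TLS: &tls.Config{InsecureSkipVerify: true},
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()

	// Open a watch on the same client (and therefore the same connection).
	wch := cli.Watch(ctx, "/key", clientv3.WithPrefix(), clientv3.WithProgressNotify())
	go func() {
		for resp := range wch {
			if resp.IsProgressNotify() {
				log.Printf("progress notify at revision %d", resp.Header.Revision)
			}
		}
	}()

	// Generate heavy read response load: concurrent full-prefix Range requests.
	for i := 0; i < 10; i++ {
		go func() {
			for {
				if _, err := cli.Get(ctx, "/key", clientv3.WithPrefix()); err != nil {
					log.Printf("range error: %v", err)
				}
			}
		}()
	}

	// With TLS enabled and everything sharing one connection, the progress
	// notifications stop arriving while the Range load is running.
	select {}
}
```

With the watch created on a second clientv3.Client (its own connection), or with TLS disabled, the progress notifications keep arriving, matching the observation above.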
Alternatively, run the prepared e2e test:
5c26ded
Anything else we need to know?
With the help of @mborsz I managed to find the root cause:
I have a fix in progress in release-3.4...serathius:etcd:fix-watch-starving
Will send a PR when ready.
Etcd version (please run commands below)
All etcd versions
Etcd configuration (command line flags or environment variables)
Any config with TLS enabled
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
N/A
Relevant log output
No response