Missing connection reset on reconfiguration with wildcard domain #6008
Note: The issue seems to appear with |
Contour should be detecting mismatched SNI and Host headers (and sending 421 responses). We use some custom Lua to configure Envoy to catch this, see: contour/internal/envoy/v3/listener.go Lines 728 to 777 in 5bb85eb
Mentioning this since it is the main focus of the linked Envoy issue.
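For readers unfamiliar with the 421 check mentioned above, here is a minimal Python sketch of the general idea: compare the SNI negotiated during the TLS handshake with the Host header of each request and answer 421 (Misdirected Request) on a mismatch. This is not Contour's Lua code; the function names `normalize_host` and `status_for_request` are hypothetical, and the real filter may treat ports, wildcards, and edge cases differently.

```python
def normalize_host(host: str) -> str:
    """Lowercase a Host header value and drop a trailing :port and root dot."""
    host = host.strip().lower()
    # Assumption: bracketed IPv6 literals are out of scope for this sketch.
    if ":" in host:
        host = host.rsplit(":", 1)[0]
    return host.rstrip(".")


def status_for_request(sni: str, host_header: str) -> int:
    """Return 421 when the Host header does not match the TLS SNI, else 200."""
    if normalize_host(host_header) != sni.strip().lower().rstrip("."):
        return 421  # Misdirected Request
    return 200


# In the scenario of this issue, SNI and Host are identical, so a check like
# this one would not fire:
same_name = status_for_request("specific.example.com", "specific.example.com:443")
# A coalesced request for a different name would be rejected:
other_name = status_for_request("a.example.com", "b.example.com")
```

Under these assumptions, the reported case passes the check, because the request names the same host on both layers and only the backend behind that host has changed.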
@sunjayBhatia Can I support you in debugging this issue? I definitely get responses from the wrong backend after an initial connection. I even keep receiving responses from the wrong backend while an incognito window gets responses from the correct backend. I am not sure what you mean by mismatched SNI and Host headers. I guess there will never be a mismatch, as the wildcard domain is also valid for the specific domain, right? The issue is that the target backend changes when a new specific ingress is configured. So as soon as the specific ingress is configured, the old wildcard backend becomes invalid and the new specific backend becomes the only valid backend.
(I only mentioned the mismatched SNI and Host feature to make sure that was covered for posterity, since the linked Envoy issue talks about that case heavily)
The issue here generally is due to connection reuse/coalescing from the browser. I can reproduce this in Chrome on Mac with HTTP/2 and HTTP/1.1 (HTTP/2 disabled in Contour). If a browser reuses an existing connection for additional requests, Envoy does not "re-triage" further requests on the connection, since by design this connection is already tied to a Listener filter chain. In this case that means that subsequent requests on that same connection will be processed by the wildcard
By default we have an idle timeout on downstream connections set to 60s, so you can observe the requests being routed to the correct backends if you let the connection time out.
As a workaround, you could also use the max-requests-per-connection Listener setting https://projectcontour.io/docs/1.27/configuration/#listener-configuration (though this is a global setting, so maybe not ideal)
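The behaviour described above can be illustrated with a small toy model in Python. It is only a sketch under simplifying assumptions: `Proxy` and `Connection` are hypothetical classes, a "filter chain" is reduced to a mapping from server name to backend, and nothing here reflects Envoy's actual data structures. The point it shows is that the chain is chosen once per connection, so a reused connection keeps hitting the wildcard backend after reconfiguration.

```python
class Proxy:
    """Toy stand-in for a listener: server name (exact or '*.suffix') -> backend."""

    def __init__(self):
        self.filter_chains = {}

    def select_chain(self, sni):
        """Pick a chain for a NEW connection: exact name first, then wildcard."""
        if sni in self.filter_chains:
            return sni
        wildcard = ("*." + sni.split(".", 1)[1]) if "." in sni else None
        if wildcard in self.filter_chains:
            return wildcard
        raise LookupError("no filter chain for " + sni)


class Connection:
    """Toy downstream connection that is pinned to one filter chain."""

    def __init__(self, proxy, sni):
        self.proxy = proxy
        # The chain is selected once, when the connection is established.
        self.chain = proxy.select_chain(sni)

    def request(self):
        # Later requests are not re-triaged; they reuse the pinned chain.
        return self.proxy.filter_chains[self.chain]


proxy = Proxy()
proxy.filter_chains["*.example.com"] = "default-app"

old_conn = Connection(proxy, "specific.example.com")
first = old_conn.request()                      # served by the wildcard chain

# A specific Ingress is added while the browser keeps old_conn open.
proxy.filter_chains["specific.example.com"] = "my-app"

reused = old_conn.request()                     # still pinned to the wildcard chain
fresh = Connection(proxy, "specific.example.com").request()  # new connection
```

In this model, `reused` still comes from `default-app` while `fresh` comes from `my-app`, which mirrors the difference between the existing tab and the incognito window.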
The structure of our Envoy configuration in this situation is set up as the following hierarchy:
One thing we could do to help with this case: for each specific FQDN, add a domain match and route (that would match the specific host header) to any wildcard routeconfigs that exist (and match the specific route), as a catch-all.
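A rough Python sketch of that suggestion, purely for discussion: route configurations are modelled as plain dictionaries (routeconfig name -> {domain -> backend}), and the helpers `wildcard_matches`, `add_catch_all_routes`, and `route` are hypothetical names, not Contour or Envoy APIs. The idea is that a request arriving on a connection pinned to the wildcard routeconfig would still find the specific host's route there.

```python
def wildcard_matches(wildcard, fqdn):
    """True if e.g. '*.example.com' covers 'specific.example.com' (one label)."""
    if not wildcard.startswith("*.") or fqdn.startswith("*") or "." not in fqdn:
        return False
    return fqdn.split(".", 1)[1] == wildcard[2:]


def add_catch_all_routes(route_configs):
    """Copy each specific domain's route into every wildcard config covering it."""
    result = {name: dict(vhosts) for name, vhosts in route_configs.items()}
    for vhosts in route_configs.values():
        for domain, backend in vhosts.items():
            for name in result:
                if wildcard_matches(name, domain):
                    result[name][domain] = backend
    return result


def route(vhosts, host):
    """Resolve a Host header inside one routeconfig: exact match, then wildcard."""
    if host in vhosts:
        return vhosts[host]
    for domain, backend in vhosts.items():
        if wildcard_matches(domain, host):
            return backend
    raise LookupError("no route for " + host)


route_configs = {
    "*.example.com": {"*.example.com": "default-app"},
    "specific.example.com": {"specific.example.com": "my-app"},
}
patched = add_catch_all_routes(route_configs)

before = route(route_configs["*.example.com"], "specific.example.com")
after = route(patched["*.example.com"], "specific.example.com")
unrelated = route(patched["*.example.com"], "other.example.com")
```

With the duplicated entry, a reused connection on the wildcard routeconfig resolves `specific.example.com` to `my-app`, while other hosts under the wildcard still fall through to `default-app`. Whether this holds up with real route ordering, TLS settings, and per-host policies would need testing.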
cc @projectcontour/maintainers @projectcontour/contour-reviewers for any thoughts here |
e.g. here is an example wildcard routeconfig
in this case, we could duplicate each route on |
This works for us as a workaround. Thank you! Edit: We could further improve performance by disabling |
@sunjayBhatia as expected, the workaround adds a lot of overhead, especially with many small requests, and especially as Chrome allows only 6 concurrent connections per origin. So it would still be important for us to get a fix without the need to set |
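To make the overhead concern concrete, here is a back-of-the-envelope Python sketch that counts how many connections (and therefore TCP/TLS handshakes) a single client needs for a series of sequential requests under a per-connection request limit. The function `connections_needed` is a hypothetical helper, and the model ignores idle timeouts, parallel connections, and HTTP/2 multiplexing.

```python
import math


def connections_needed(num_requests, max_requests_per_connection=None):
    """Connections one client opens for sequential requests under a limit."""
    if num_requests == 0:
        return 0
    if max_requests_per_connection is None:
        # No limit: every request can reuse the first connection.
        return 1
    return math.ceil(num_requests / max_requests_per_connection)


unlimited = connections_needed(100)        # one connection, one handshake
one_each = connections_needed(100, 1)      # a new handshake for every request
batched = connections_needed(100, 25)      # a handshake every 25 requests
```

A limit of 1 turns every request into a fresh handshake, which is why it works as a workaround for stale routing but scales poorly for pages that issue many small requests.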
Yep, only allowing one request per connection is definitely not a full solution. We'll have to do some work on an appropriate solution to this. My suggestion above on adding routes to the wildcard routeconfig might work, but needs more thinking/testing. Contributions of course welcome!
What steps did you take and what happened:
*.example.com Ingress which routes to default-app
specific.example.com in chrome 119.0.6045.199 on linux
specific.example.com Ingress which routes to my-app
specific.example.com)
What did you expect to happen:
After reload I expected the iframe to show the content served from the my-app service.
What did actually happen:
The original default-app was still shown, even after several iframe reloads. Opening the page in an incognito tab results in loading the actual my-app page.
Anything else you would like to add:
It seems like the browser is keeping the connection (probably HTTP 2) open. I would expect envoy to reset the connection if its routing is affected by a new configuration.
The following image shows the DAG when both ingresses are configured (I edited the image, but it should still show the relevant information)
The following screenshots show the request timings of the two requests (first load and then reload) in chrome.
You see that the first request included additional overhead for Initial connection and SSL. I would expect the same on the second request, as the second request should create a new connection for the new underlying service. The previous connection should have been reset.
Environment:
Kubernetes version (kubectl version): v1.26.3
Probably related: envoyproxy/envoy#6767