-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mde: set additional remote port #2917
Conversation
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@PiotrSikora
Unless we can generate a lookup table(I would do that later), For the cluster, it is already generated by pilot anyway, now I only need to add the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this doable in wasm ABI? we're going to adapt it but was not prioritized yet.
@kyessenov I am not sure. I don't have the very clear view of wasm ABI. It seems to me this PR doesn't use corner hook. |
I don't think it's right to rely on metadata exchange for networking purposes. It introduces an unnecessary coupling between telemetry and basic routing. Telemetry continues to function without metadata exchange in a degraded state but networking will fail hard. Sounds like a layering violation. |
Thank you for the comment @kyessenov ! I think about it and let me explain. The idea is that networking need to deliver the information between envoy cluster in egress sidecar and envoy listener in ingress sidecar. It's pilot's responsibility to set the MDE filter with ALPN config as istio. |
@kyessenov When I add a MDE filter in certain cluster and listener while telemetry add the filter through EnvoyFilter api again, would it cause issue?
|
I imagine there would be an issue with the two instances of this filter as-is. It writes into filter state for a well known key, so they would be overriding each other. |
Double write on the same value is not optimal, is it a hard failure? The background is that the double MDE should be rare, or ideally none if we carefully provide the envoy filter. Meanwhile I am look into the alternative method to resolve the networking route. If this update to MDE is not hard failure we can land it first |
Sorry, I need more context to approve. I'm confused by this effort since "BTS" was supposed to remove TCP tunneling of metadata, yet this looks like it'll depend on TCP tunneling of metadata. Why are we doing BTS then if we still use TCP level data instead of pure HTTP? |
I have two milestones for BTS:
In all TCP metadata exchange cannot be deprecated until next release cycle. @kyessenov Do I answered your questions? |
I'm not sure I follow. Regarding point 1, again, why does HTTP tunneling depend on telemetry TCP protocol? TCP telemetry should not be mandatory for networking as it has its drawbacks. It's entirely possible we don't have any telemetry extensions enabled whatsoever if they cause perf issues. |
No, http tunneling doesn't depend on telemetry TCP protocol. The tunneling only depends on the original endpoint port. The endpoint port is lost when BTS redirect traffic to dedicated port 15008. It should have very little perf issue because the connection in BTS is long live connection. The connection is reused in BTS http tunnel and future tcp tunnel |
@lambdai Has this been discussed with P&T group? I think this still means that TCP metadata exchange becomes mandatory for BTS, since HTTP authz policies would simply not work without it. |
No... BTS http need the metadata anyway, achieve could be an another filter (proxy protocol filter), or a filter build from scratch, or duplicates half of the code of MDE(since I need 1 way information deliver). I am also experimenting deliver the port through http headers. The fallback is that dest port L4 authz policy will be ignored in experimental BTS phase. To me MDE is a no brain winner before your concern on perf |
@kyessenov Again, why do you think there is performance issue when MDE is enforced on the connection on http traffic? One of the goal of BTS is to maintain long live connection between sidecars so that either handshake and L4 MDE cost is |
@lambdai there's a cost to telemetry in terms of CPU and memory consumption. We're recommending users to turn off metadata exchange if the cost is not worth the value that telemetry brings (there're always basic envoy stats available). I think we need to discuss whether telemetry metadata exchange should become mandatory even without the telemetry enabled. |
Chatted with @PiotrSikora, we can also introduce proxy protocol transport socket to achieve the goal. |
Does PROXY require some ALPN indicator as well? |
No. Why are you asking? Let me guess, are you worry if the peer doesn't recognize proxy protocol? |
@lambdai No just want to avoid another explosion of ALPN protocols. |
Good point! |
Close for now. We can add the original port magic header before "host is selected", at the cost of "assuming all the endpoint in the service has the same target port". This is true in k8s service but not in other service registries. Will update BTS design to reflect it. |
Signed-off-by: Yuchen Dai silentdai@gmail.com
Add a functionality to propagate downstream socket port to upstream.
This will be used by BTS for http traffic. BTS ingress listener is always listening on port 15008.
The egress listener need the original workload port(e.g 80) to select the route.
This is mainly to address the use case that
Client send request to svc:80 with :authority svc:80 but VS rewrite :authority to svc:8080 but not L4 address is remained at svc:80.
Before BTS, the ingress listener could recover the port 80. Now this information but be encoded through egress sidecar and ingresssidecar.
This PR add the prototype to deliver the dest port through metadata_exchange plugin, however, it is not done yet
since it's 15008 port is delivered instead of the 80
The follow up PR will be focusing on egress sidecar to figure out the port 80 despite the endpoint port is 15008 (though endpoint metadata)
part of istio/istio#24540