RFC: Full mTLS for Diego Container-to-Container Traffic #1437
Add draft RFC proposing implementation of full mutual TLS (mTLS) for container-to-container traffic in Diego. The proposal introduces a new HTTP-based listener on port 62443 alongside the existing TCP-based port (61443), providing a dual opt-in model for operators and app authors.

Key features:
- Phase 1: Server-side HTTP-based C2C mTLS with XFCC header forwarding
- Phase 2: Client-side egress proxy for automatic cert injection
- Phase 3: Full zero-trust app-to-app communication integration

Maintains backwards compatibility with existing deployments.
Use HTML <br/> tags instead of \n for line breaks in mermaid diagram labels to ensure proper rendering on GitHub and in markdown viewers.
**BOSH Properties**:

```yaml
containers.proxy.enable_egress_proxy:
```
Hey @rkoster, thanks a lot for raising this RFC! I was wondering whether the egress part could also be enabled independently of the C2C parts in this RFC. If an operator enables only `enable_egress_proxy` and an app developer sets the environment variable `CF_INSTANCE_MTLS_PROXY=http://127.0.0.1:61445`, without the properties from steps 1 and 3, would the egress proxy add the instance cert to the mTLS connection towards gorouter (or a TLS-terminating component in front of gorouter)?
In our setup, we do not enable c2c as network.write is not exposed to the app developers. Yet, there is demand for platform features to support authentication from AppA to AppB.
We might consider implementing a per-route feature as per rfc-0027 in gorouter to even parse the client-cert data on platform-side, without additional logic within each app.
- The receiving app gets an `X-Forwarded-Client-Cert` header with caller identity:

```
X-Forwarded-Client-Cert: Hash=abc123;Subject="CN=instance-guid,OU=app:client-app-guid,OU=space:space-guid"
```
The haproxy-boshrelease and gorouter (not sure where exactly the logic for that is) already include code to handle XFCC headers, so we should consider how these should interact. Having all of them use the same header names but different formats seems like a bad idea (though that may already be the case for HAProxy and gorouter). My first thought was to add another header, set by Envoy, to indicate that this is app2app traffic, like `cf-app-to-app: 1`. This would then be used to select the right context in which to evaluate the provided metadata. For the values themselves, I feel like using the same format would be preferable. There's also quite some history to the format we currently use in HAProxy, as we had to go through incompatible changes when we learned that certificates allow all sorts of special characters which HTTP headers don't.
This is not a generic mTLS feature. The scope is limited to CF-generated identity certs, so special characters should not be an issue. When traffic comes in via the static mTLS Envoy port, it must be from another app using an app instance identity cert.
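To make the scoping argument concrete, here is a minimal sketch of how a receiving app might read caller identity from the header shown in the RFC excerpt. It assumes the exact `Hash=...;Subject="..."` layout of that example and that CF-generated subjects contain no commas or quotes inside attribute values; the function names `parse_xfcc` and `caller_identity` are hypothetical and not part of any CF component.

```python
import re


def parse_xfcc(value: str) -> dict:
    """Split one XFCC element into its Key=Value fields.

    Assumption: values are either double-quoted or run up to the next ';'.
    """
    fields = {}
    for match in re.finditer(r'(\w+)=("([^"]*)"|[^;]*)', value):
        key = match.group(1)
        quoted = match.group(3)
        # Prefer the unquoted content when the value was wrapped in quotes.
        fields[key] = quoted if quoted is not None else match.group(2)
    return fields


def caller_identity(xfcc: str) -> dict:
    """Extract instance, app, and space identifiers from the Subject field.

    Assumption: the subject looks like the RFC example, i.e.
    CN=<instance>,OU=app:<guid>,OU=space:<guid>, with no escaped commas.
    """
    subject = parse_xfcc(xfcc).get("Subject", "")
    identity = {}
    for rdn in subject.split(","):
        attr, _, val = rdn.strip().partition("=")
        if attr == "CN":
            identity["instance"] = val
        elif attr == "OU" and ":" in val:
            kind, _, guid = val.partition(":")
            identity[kind] = guid
    return identity


header = 'Hash=abc123;Subject="CN=instance-guid,OU=app:client-app-guid,OU=space:space-guid"'
print(caller_identity(header))
```

A parser this simple only holds up because of the restriction to platform-issued certs; arbitrary client certificates would need proper escaping rules, which is the concern raised about HAProxy above.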
Hi @rkoster, I really like this idea. I only wonder, if we implement this, how we could do another iteration to support bring-your-own certs for C2C networking, where the developers specify the CA, cert and key for both communicating parties. Have you given that any thought?
> When traffic comes in via the static mtls envoy port it must be from an other app using an app instance identity cert.
Right, but an app might receive outside mTLS traffic as well as app2app mTLS traffic. Depending on where the traffic comes from, different rules apply on what constitutes an authorized request, don't they?
Yes. When traffic comes from outside CF, it follows the existing route integrity path and gets the header set by gorouter. So different rules apply based on different ports for internal vs. external mTLS traffic.
@chombium why would app developers want to bring their own certificates instead of relying on the certs already provided by the platform? I don't really understand the use case for this.
Also please take a look at: #1438, which was created based on the feedback on this PR.
Based on the feedback on this PR, I have created another RFC focused on app-to-app mTLS using the gorouter: #1438
- **SNI Handling**: Envoy needs to extract the target hostname for proper TLS handshake
- **NO_PROXY**: Applications should configure `NO_PROXY` for traffic that should not go through the proxy
- **Non-HTTP Traffic**: The HTTP CONNECT-based egress proxy only supports HTTP/HTTPS traffic. Support for TCP-based protocols could be addressed in a follow-up RFC.
I had a hard time wrapping my head around all the different permutations and did some research. If I got something wrong, please correct me. The following variations exist:
- `HTTP_PROXY=http://localhost:61445`
- `HTTP_PROXY=https://localhost:61445`
- `HTTPS_PROXY=http://localhost:61445`
- `HTTPS_PROXY=https://localhost:61445`
The HTTP / HTTPS in the `*_PROXY` variable name tells the client when to use which: for `http://` requests it uses `HTTP_PROXY`, and for `https://` requests it uses `HTTPS_PROXY`. The second variation is the protocol used to talk to the proxy, which is set in the value of the env var. This only controls whether the client speaks plain TCP or puts TLS on top for the connection to the proxy.
Now, while these two env vars look the same, they trigger completely different behavior. When the client wants to talk `http://`, it sends the request to the proxy with just one minor adjustment: the target URI is sent in absolute-form, meaning it includes scheme and host instead of just the path. For `https://` this is not the case: the client issues a `CONNECT` request to the proxy to establish a TCP tunnel to the target and then performs the TLS handshake on top of that. This requires the client to handle all the (m)TLS, which is not what we want.
So to recap:
- We should only set `HTTP_PROXY`; https traffic will always require the client to take part in the TLS handshake.
- A client needs to deliberately speak http for this scenario to select the proxy and have it upgrade the connection to mTLS.
- The setup we want is not a `HTTP CONNECT-based egress proxy`, it's a ... HTTP-based egress proxy? The wording is all over the place.
By not setting HTTPS_PROXY the risk of mangling some internet traffic because the user forgot to set NO_PROXY is also reduced as https traffic will never pass through the proxy.
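The difference between the two variables can be sketched as follows. This is a simplified model of what a typical HTTP client does, not the behavior of any specific library: it ignores `NO_PROXY`, lowercase variable names, and the proxy URL's own scheme. The function name `first_proxy_request_line` and the hostname `backend.apps.internal` are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit


def first_proxy_request_line(url: str, env: dict) -> Optional[str]:
    """Return the first request line a client would send to the proxy.

    Returns None when no proxy variable matches the URL scheme, i.e. the
    client connects to the target directly.
    """
    parts = urlsplit(url)
    proxy = env.get(f"{parts.scheme.upper()}_PROXY")
    if proxy is None:
        return None
    if parts.scheme == "http":
        # Plain HTTP: the request goes to the proxy with the target in
        # absolute-form, so the proxy sees (and can re-originate) the request.
        target = f"{parts.scheme}://{parts.netloc}{parts.path or '/'}"
        if parts.query:
            target += f"?{parts.query}"
        return f"GET {target} HTTP/1.1"
    # HTTPS: the client asks for an opaque TCP tunnel and performs the TLS
    # handshake itself, so the proxy cannot inject a client certificate.
    return f"CONNECT {parts.hostname}:{parts.port or 443} HTTP/1.1"


env = {"HTTP_PROXY": "http://localhost:61445"}
print(first_proxy_request_line("http://backend.apps.internal:8080/orders", env))
print(first_proxy_request_line("https://backend.apps.internal/orders", env))
```

With only `HTTP_PROXY` set, the `https://` call returns `None`: that traffic bypasses the proxy entirely, which is the reduced-risk behavior described above.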
Summary
This RFC proposes implementing full mutual TLS (mTLS) for container-to-container (C2C) traffic in Diego, enabling applications to both authenticate themselves and verify the identity of connecting applications.
View the full RFC
The approach introduces:
Key Points
- `X-Forwarded-Client-Cert` header

Implementation Phases
cc @cloudfoundry/toc @cloudfoundry/wg-app-runtime-platform