-
Notifications
You must be signed in to change notification settings - Fork 473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds API to GEP-91: Client Certificate Validation #2273
Conversation
highlighting the key decisions that need to be made here
|
IMHO, the CA cert only contains the public key part so a ConfigMap is safe enough to store it. Sometimes, people store the CA cert in the Secret because they also store the private key in a different entry of the same object. |
I think putting the CA cert in a separate object would be a better practice (so only the CA cert will be referenced). If I have to choose one from Secret and ConfigMap, I will vote for ConfigMap. |
It's conceivable that we may want to add the ability to use a CA cert from an issues TLS Secret (using the not-quite-standard Additionally, this change should move this GEP to |
geps/gep-91.md
Outdated
caCertificateRefs: | ||
- kind: ConfigMap | ||
group: "" | ||
name: foo-example-com-ca-cert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Istio users would like to be able to point this to a Secret (instead of ConfigMap), specifically the same secret that certificateRefs
point to, only with an additional ca.crt
key containing the client's CA cert.
@howardjohn WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, secret reference is common practice in CNCF landscape, pioneered by cert-manager. Not in the core tls
SEcret type, but its de-facto standard.
I don't mind config map as well of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue is still live?
In istio I can do that with:
tls:
mode: Terminate
certificateRefs:
- name: httpbin-credential
options:
gateway.istio.io/tls-terminate-mode: MUTUAL
I try to setup gateway api with cilium (1.15.5) without istio and other sidecars, but MTLS not working
tls:
certificateRefs:
- kind: Secret
name: demo-cert
frontendValidation:
caCertificateRefs:
- kind: Secret
group: ""
name: demo-cert
I think this also |
@robscott @youngnick @shaneutt any edits I can make to this PR to help it move forward ? |
geps/gep-91.md
Outdated
// ClientValidationContext holds configuration that can be used to validate the client intiating the TLS connection | ||
// to the Gateway. | ||
// By default, no client specific configuration is validated. | ||
type ClientValidationContext struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here the ClientValidationContext
should claim some unmentioned behaviors:
- What if the
caCertificateRefs
field is empty? It's implementation-specific, or all the clients will be rejected, or passed? - What's the verify depth of the client certificate? Only the server certificate and the intermediate CA, or the whole chain (Maybe we can add another field like
VerifyDepth
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great questions - my thoughts:
- We could copy the pattern we used for BackendTLSPolicy here, and have something that suggests whether or not to use the system CAs.
- I think that unless we have a very good reason, we should start with mandating the behavior that the full chain is checked, and then we can add configurability later if we need it. We should make the default case as secure as possible, and not have easily-accessible knobs that make things less secure unless we're sure we need them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youngnick Fair enough, but we should clarify them in the GEP to reduce the ambiguity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've add a doc string, mentioning that this is implementation specific today, because im seeing different default verifyDepth settings in different controller
I think the question for both this and #2113 is do we have a preferred default kind for certs, and if so, should it be ConfigMap or Secret? If Secret, maybe both GEPs should be using SecretObjectReference? |
For reference, we temporarily chose ConfigMapObjectReference because the usage didn't require tls.key and tls.crt. #2113 (comment) mentioned the use of |
geps/gep-91.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This GEP proposes a way to validate the TLS certificate presented by the downstream client to the server
(Gateway Listener in this case) during a [TLS Handshake Protocol][], also commonly referred to as mutual TLS (mTLS).
Could you elaborate? I'm not sure everyone would conclude that this is commonly referred to as mutual TLS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you could also elaborate on what you mean by client and server in this proposal, that would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@candita can you clarify why you think the client and server here needs elaboration. The server is a Gateway listener (it says that) and the client is then whoever is making requests to the Gateway. Is there something more subtle that you're looking for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the issue here is that, while this is definitely a mutual handshake, calling something "mutual TLS" can be taken to mean a very specific thing (dynamically-issued, frequently-rotated certificates mutually authenticated between services as a way to handle service-to-service authentication). This is definitely not that thing, but one thing that @candita and I have learned through much longer than we would have liked experience is that you really shouldn't use "mutual TLS" without some explanation of what sense you mean it in, or people will believe it's the wrong one.
Yes, I know that "mutual TLS" in this case is technically correct, but it also can be misleading if, for example, a person is used to reading that phrase in a mesh context, not in a reverse-proxy-or-Ingress context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed the controversial mTLS term from the description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also remove the ambiguous phrase "downstream client", and replace it with something that clearly indicates what resource/s are connecting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think the term "downstream client" is ambiguous, can you share any other alternatives ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The term 'mTLS' is far older than meshes - and I don't think any recent implementation should 'take over' a well established term. Agree with 'upstream/downstream' - client and server are usually sufficient.
My suggestion is to start with the survey of existing configs - Apache httpd, Nginx, servlets, vendor specific - and use the terms and keywords that are most common among them. There is no reason for Gateway API to invent new words or terms, and easier for users to have terms they are already using.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you costin, if you peek into the prior art section, most implementations call this mTLS, but since this is controversial, I've tried to stick with RFC terms as much as possible
geps/gep-91.md
Outdated
// References to a resource in different namespace are invalid UNLESS there | ||
// is a ReferenceGrant in the target namespace that allows the certificate | ||
// to be attached. If a ReferenceGrant does not allow this reference, the | ||
// "ResolvedRefs" condition MUST be set to False for this listener with the | ||
// "RefNotPermitted" reason. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add docs on the behavior when the CACertificateRefs is empty but needed? Echoing https://github.com/kubernetes-sigs/gateway-api/pull/2273/files#r1334951299 to some extent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see an update for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im not sure I understand this, caCaertificateRefs
is an optional field within clientValidation
, if its empty it is assumed that is the intent and it is valid, because in the future clientValidation
may have more fields within it such as san
geps/gep-91.md
Outdated
to the Gateway | ||
* Introduce a `caCerficateRefs` field within `clientValidation` that can be used to specify a list of CA Certificates that | ||
can be used as a trusted anchor to validate the certificates presented by the client | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GatewayTLSConfig
already has a CertificateRefs
field. Why not just add the CACertificateRefs
field there?
gateway-api/apis/v1beta1/gateway_types.go
Line 374 in 800024e
CertificateRefs []SecretObjectReference `json:"certificateRefs,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because there are multiple attributes that can be used to validate the client, and this GEP starts off with CACertRef
here is an exhaustive list supported in Envoy https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/transport_sockets/tls/v3/common.proto#envoy-v3-api-msg-extensions-transport-sockets-tls-v3-certificatevalidationcontext
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, I don't quite understand how this GEP ends up so brief compared to the GEP for BackendTLSPolicy. There seems to be so much missing in terms of what you'd you need for working client certificate validation.
It would probably make sense to add at least:
- why we need this
- a description of use cases for mesh and non-mesh
- a section on how the request flow changes, using the GAMMA request flow in https://gateway-api.sigs.k8s.io/concepts/gamma/?h=request+flow#request-flow and/or the original request flow in https://gateway-api.sigs.k8s.io/concepts/api-overview/?h=request+flow#request-flow
- something that describes how the client certificate validation interacts with edge termination and backend TLS termination
- a section on prior art, e.g. how this is implemented already in a few implementations.
@youngnick do you agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@candita yes, I think this GEP would be well served by adding some things:
- why we need this - "It's table stakes and widely supported", worded better than that is probably okay, but it should be recorded in the GEP. And the "widely supported" part would at least need a table listing some proxy implementations and their support.
- I think this also needs a "Deferred" section that covers, at a minimum, "How do I use the system certs?". I think that the custom CA use case will be far more common, but I could be wrong.
- I agree that this GEP should specify what happens when you don't have other TLS details (it doesn't work or gets rejected), and what happens when BackendTLSPolicy is enabled (the client certificate will be used for the Listener connection only, and not reused for the backend connection). This last item - reusing the Listener client cert on a backend connection - feels like something folks may ask for, but we should definitely have out of scope could be another one for the "Deferred" section.
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
sorry for the force push 🙈 |
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: arkodg, frankbu, robscott The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
I don't think release notes are needed for GEP changes, can save them until the change that actually adds the API spec. /release-note-none |
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @arkodg!
/lgtm
@@ -1,22 +1,164 @@ | |||
# GEP-91: Client Certificate Validation for TLS terminating at the Gateway Listener | |||
|
|||
* Issue: [#91](https://github.com/kubernetes-sigs/gateway-api/issues/91) | |||
* Status: Provisional | |||
* Status: Implementable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this formally moves to experimental, I think we need a corresponding metadata file for this GEP. Not going to block on this specific PR since it can easily be handled in a following.
/hold cancel |
What type of PR is this?
/kind gep
What this PR does / why we need it:
What
Defines an API in GEP-91 to support Client Certificate Validation for TLS Terminating at the Gateway
using a CA Certificate as a trusted anchor.
Why
Allows the platform admin configuring the Gateway resource to only select trusted clients to connect to the Gateway
Relates to #91 & #2110