Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds API to GEP-91: Client Certificate Validation #2273

Merged
merged 21 commits into from
Apr 8, 2024

Conversation

arkodg
Copy link
Contributor

@arkodg arkodg commented Aug 7, 2023

What type of PR is this?

/kind gep

What this PR does / why we need it:
What
Defines an API in GEP-91 to support Client Certificate Validation for TLS Terminating at the Gateway
using a CA Certificate as a trusted anchor.

Why
Allows the platform admin configuring the Gateway resource to only select trusted clients to connect to the Gateway

Relates to #91 & #2110

@k8s-ci-robot k8s-ci-robot added kind/gep PRs related to Gateway Enhancement Proposal(GEP) do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 7, 2023
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 7, 2023
geps/gep-91.md Outdated Show resolved Hide resolved
@arkodg
Copy link
Contributor Author

arkodg commented Aug 9, 2023

highlighting the key decisions that need to be made here

  • naming -
    • clientValidation to hold client specific information to validate
    • caCertificateRefs - to hold the CA certs
  • Secret vs ConfigMap to hold the CA Cert
  • key name for the CA Cert, ca.crt is what this GEP proposes
  • allowing these resources to exist in a separate namespace or not - GEP proposal allows this using ReferenceGrant similar to what is done for the certificateRefs field

@spacewander
Copy link
Contributor

Secret vs ConfigMap to hold the CA Cert

IMHO, the CA cert only contains the public key part so a ConfigMap is safe enough to store it. Sometimes, people store the CA cert in the Secret because they also store the private key in a different entry of the same object.

@spacewander
Copy link
Contributor

Secret vs ConfigMap to hold the CA Cert

IMHO, the CA cert only contains the public key part so a ConfigMap is safe enough to store it. Sometimes, people store the CA cert in the Secret because they also store the private key in a different entry of the same object.

I think putting the CA cert in a separate object would be a better practice (so only the CA cert will be referenced). If I have to choose one from Secret and ConfigMap, I will vote for ConfigMap.

@youngnick youngnick added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Aug 17, 2023
@k8s-ci-robot k8s-ci-robot removed the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Aug 17, 2023
@youngnick
Copy link
Contributor

It's conceivable that we may want to add the ability to use a CA cert from an issues TLS Secret (using the not-quite-standard ca.crt key in a kubernetes.io/tls type Secret), but that's not a blocker, since we can always add that later.

Additionally, this change should move this GEP to Implementable, I think, and then once the types are present in the Experimental channel, this GEP will move to Experimental.

geps/gep-91.md Outdated
caCertificateRefs:
- kind: ConfigMap
group: ""
name: foo-example-com-ca-cert
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Istio users would like to be able to point this to a Secret (instead of ConfigMap), specifically the same secret that certificateRefs point to, only with an additional ca.crt key containing the client's CA cert.

@howardjohn WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, secret reference is common practice in CNCF landscape, pioneered by cert-manager. Not in the core tls SEcret type, but its de-facto standard.

I don't mind config map as well of course.

Copy link

@gawsoftpl gawsoftpl May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue is still live?

In istio I can do that with:

    tls:
      mode: Terminate
      certificateRefs:
      - name: httpbin-credential
      options:
        gateway.istio.io/tls-terminate-mode: MUTUAL

I try to setup gateway api with cilium (1.15.5) without istio and other sidecars, but MTLS not working

    tls:
      certificateRefs:
      - kind: Secret
        name: demo-cert
      frontendValidation:
        caCertificateRefs:
        - kind: Secret
          group: ""
          name: demo-cert

@frankbu
Copy link
Contributor

frankbu commented Aug 22, 2023

I think this also
Fixes #2110

@arkodg
Copy link
Contributor Author

arkodg commented Sep 18, 2023

@robscott @youngnick @shaneutt any edits I can make to this PR to help it move forward ?
I did notice that the Backend TLS PR has merged which defined the cert refs as a ConfigMapObjectReference . This PR took the approach of using ObjectReference to give implementations the flexiblity to define the ca.cert in a ConfigMap or as a Secret (within the same TLS Cert used to define server certs). I can go ahead and make the change to this PR to keep the approach identical across the apis.

geps/gep-91.md Outdated
// ClientValidationContext holds configuration that can be used to validate the client intiating the TLS connection
// to the Gateway.
// By default, no client specific configuration is validated.
type ClientValidationContext struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here the ClientValidationContext should claim some unmentioned behaviors:

  1. What if the caCertificateRefs field is empty? It's implementation-specific, or all the clients will be rejected, or passed?
  2. What's the verify depth of the client certificate? Only the server certificate and the intermediate CA, or the whole chain (Maybe we can add another field like VerifyDepth)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great questions - my thoughts:

  1. We could copy the pattern we used for BackendTLSPolicy here, and have something that suggests whether or not to use the system CAs.
  2. I think that unless we have a very good reason, we should start with mandating the behavior that the full chain is checked, and then we can add configurability later if we need it. We should make the default case as secure as possible, and not have easily-accessible knobs that make things less secure unless we're sure we need them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youngnick Fair enough, but we should clarify them in the GEP to reduce the ambiguity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've add a doc string, mentioning that this is implementation specific today, because im seeing different default verifyDepth settings in different controller

@frankbu
Copy link
Contributor

frankbu commented Sep 25, 2023

I did notice that the #2113 has merged which defined the cert refs as a ConfigMapObjectReference

I think the question for both this and #2113 is do we have a preferred default kind for certs, and if so, should it be ConfigMap or Secret? If Secret, maybe both GEPs should be using SecretObjectReference?

@candita
Copy link
Contributor

candita commented Sep 26, 2023

I did notice that the #2113 has merged which defined the cert refs as a ConfigMapObjectReference

I think the question for both this and #2113 is do we have a preferred default kind for certs, and if so, should it be ConfigMap or Secret? If Secret, maybe both GEPs should be using SecretObjectReference?

For reference, we temporarily chose ConfigMapObjectReference because the usage didn't require tls.key and tls.crt. #2113 (comment) mentioned the use of ClusterTrustBundle, which seemed popular.

geps/gep-91.md Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This GEP proposes a way to validate the TLS certificate presented by the downstream client to the server
(Gateway Listener in this case) during a [TLS Handshake Protocol][], also commonly referred to as mutual TLS (mTLS).

Could you elaborate? I'm not sure everyone would conclude that this is commonly referred to as mutual TLS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you could also elaborate on what you mean by client and server in this proposal, that would be helpful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@candita can you clarify why you think the client and server here needs elaboration. The server is a Gateway listener (it says that) and the client is then whoever is making requests to the Gateway. Is there something more subtle that you're looking for?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue here is that, while this is definitely a mutual handshake, calling something "mutual TLS" can be taken to mean a very specific thing (dynamically-issued, frequently-rotated certificates mutually authenticated between services as a way to handle service-to-service authentication). This is definitely not that thing, but one thing that @candita and I have learned through much longer than we would have liked experience is that you really shouldn't use "mutual TLS" without some explanation of what sense you mean it in, or people will believe it's the wrong one.

Yes, I know that "mutual TLS" in this case is technically correct, but it also can be misleading if, for example, a person is used to reading that phrase in a mesh context, not in a reverse-proxy-or-Ingress context.

Copy link
Contributor Author

@arkodg arkodg Dec 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the controversial mTLS term from the description

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also remove the ambiguous phrase "downstream client", and replace it with something that clearly indicates what resource/s are connecting.

Copy link
Contributor Author

@arkodg arkodg Jan 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think the term "downstream client" is ambiguous, can you share any other alternatives ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term 'mTLS' is far older than meshes - and I don't think any recent implementation should 'take over' a well established term. Agree with 'upstream/downstream' - client and server are usually sufficient.

My suggestion is to start with the survey of existing configs - Apache httpd, Nginx, servlets, vendor specific - and use the terms and keywords that are most common among them. There is no reason for Gateway API to invent new words or terms, and easier for users to have terms they are already using.

Copy link
Contributor Author

@arkodg arkodg Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you costin, if you peek into the prior art section, most implementations call this mTLS, but since this is controversial, I've tried to stick with RFC terms as much as possible

geps/gep-91.md Outdated Show resolved Hide resolved
geps/gep-91.md Outdated Show resolved Hide resolved
geps/gep-91.md Outdated Show resolved Hide resolved
geps/gep-91.md Outdated Show resolved Hide resolved
geps/gep-91.md Outdated Show resolved Hide resolved
geps/gep-91.md Outdated
Comment on lines 43 to 47
// References to a resource in different namespace are invalid UNLESS there
// is a ReferenceGrant in the target namespace that allows the certificate
// to be attached. If a ReferenceGrant does not allow this reference, the
// "ResolvedRefs" condition MUST be set to False for this listener with the
// "RefNotPermitted" reason.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add docs on the behavior when the CACertificateRefs is empty but needed? Echoing https://github.com/kubernetes-sigs/gateway-api/pull/2273/files#r1334951299 to some extent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see an update for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not sure I understand this, caCaertificateRefs is an optional field within clientValidation, if its empty it is assumed that is the intent and it is valid, because in the future clientValidation may have more fields within it such as san

geps/gep-91.md Outdated
to the Gateway
* Introduce a `caCerficateRefs` field within `clientValidation` that can be used to specify a list of CA Certificates that
can be used as a trusted anchor to validate the certificates presented by the client

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GatewayTLSConfig already has a CertificateRefs field. Why not just add the CACertificateRefs field there?

CertificateRefs []SecretObjectReference `json:"certificateRefs,omitempty"`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because there are multiple attributes that can be used to validate the client, and this GEP starts off with CACertRef
here is an exhaustive list supported in Envoy https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/transport_sockets/tls/v3/common.proto#envoy-v3-api-msg-extensions-transport-sockets-tls-v3-certificatevalidationcontext

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I don't quite understand how this GEP ends up so brief compared to the GEP for BackendTLSPolicy. There seems to be so much missing in terms of what you'd you need for working client certificate validation.

It would probably make sense to add at least:

@youngnick do you agree?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@candita yes, I think this GEP would be well served by adding some things:

  • why we need this - "It's table stakes and widely supported", worded better than that is probably okay, but it should be recorded in the GEP. And the "widely supported" part would at least need a table listing some proxy implementations and their support.
  • I think this also needs a "Deferred" section that covers, at a minimum, "How do I use the system certs?". I think that the custom CA use case will be far more common, but I could be wrong.
  • I agree that this GEP should specify what happens when you don't have other TLS details (it doesn't work or gets rejected), and what happens when BackendTLSPolicy is enabled (the client certificate will be used for the Listener connection only, and not reused for the backend connection). This last item - reusing the Listener client cert on a backend connection - feels like something folks may ask for, but we should definitely have out of scope could be another one for the "Deferred" section.

Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
@arkodg
Copy link
Contributor Author

arkodg commented Apr 5, 2024

sorry for the force push 🙈

Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @arkodg! One minor nit but otherwise LGTM as soon as #2910 merges.

/approve

geps/gep-91/index.md Outdated Show resolved Hide resolved
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arkodg, frankbu, robscott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 6, 2024
@robscott
Copy link
Member

robscott commented Apr 6, 2024

I don't think release notes are needed for GEP changes, can save them until the change that actually adds the API spec.

/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Apr 6, 2024
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Signed-off-by: Arko Dasgupta <arko@tetrate.io>
Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @arkodg!

/lgtm

@@ -1,22 +1,164 @@
# GEP-91: Client Certificate Validation for TLS terminating at the Gateway Listener

* Issue: [#91](https://github.com/kubernetes-sigs/gateway-api/issues/91)
* Status: Provisional
* Status: Implementable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this formally moves to experimental, I think we need a corresponding metadata file for this GEP. Not going to block on this specific PR since it can easily be handled in a following.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 8, 2024
@robscott
Copy link
Member

robscott commented Apr 8, 2024

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 8, 2024
@k8s-ci-robot k8s-ci-robot merged commit a247bf2 into kubernetes-sigs:main Apr 8, 2024
7 of 8 checks passed
@arkodg arkodg deleted the gep-91-api branch April 9, 2024 03:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/gep PRs related to Gateway Enhancement Proposal(GEP) lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.