Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-4317: Pod Certificates #4318

Merged
merged 1 commit into from
Oct 10, 2024
Merged

Conversation

ahmedtd
Copy link
Contributor

@ahmedtd ahmedtd commented Oct 30, 2023

This PR adds the initial content for KEP-4317, which outlines machinery for:

  • Representing pod identities in X.509 certificates and certificate requests,
  • Generically provisioning certificates to pods,
  • Specifically provisioning kube-apiserver client certificates to pods, and
  • Using pod identity certificates to authenticate to kube-apiserver,.

This KEP is still in the draft/discussion phase. Significant changes are anticipated based on review and discussion in SIG Auth.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 30, 2023
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/auth Categorizes an issue or PR as relevant to SIG Auth. labels Oct 30, 2023
@ahmedtd ahmedtd force-pushed the pod-certificates branch 2 times, most recently from 9c8542f to e1fc884 Compare October 30, 2023 04:04
@ahmedtd ahmedtd mentioned this pull request Nov 1, 2023
5 tasks
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
actually been scheduled onto that node.

<<[UNRESOLVED]>>
Is it necessary or desirable to forbid non-Kubelet identities from creating CertificateSigningRequest objects that embed a PodIdentity extension?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anything prevent non-Kubelets from adding a Pod binding in TokenRequest, today? noderestriction doesn't appear to limit it, at least.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is forbidden at some level. But I guess we should just try it out.

ServiceAccountName string `asn1:"utf8"`
PodName string `asn1:"utf8"`
PodUID string `asn1:"utf8"`
NodeName string `asn1:"utf8"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these fields required? If they are, I think we might one day regret not making the node name / uid optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the extension data itself, all fields are optional, in that you can always leave them as the empty string. (I don't think the ASN.1 parser in Go tries to handle field presence or absence).

If the extension appears in a CertificateSigningRequest from a kubelet, the node restriction plugin will enforce that all the fields are populated, and consistent with reality.

If the extension appears in an issued certificate, I think it's up to the component that's validating the certificate to decide which fields it cares about. For the specific example of certificates issued by the kubernetes.io/kube-apiserver-client-pod signer/approver, we probably want kube-apiserver to check all of the fields before letting the request past the authentication stage.

Copy link

@bleggett bleggett Feb 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think clearly documenting a minimum, non-negotiable subset of required fields that are provably sufficient to guarantee unambiguous pod<->cert binding in Kubernetes is required, or including this ASN.1 OID field as an overall KEP requirement isn't much use.

Otherwise they can be left all empty, enforcement is implementation-dependent, and I can be fully compliant with this KEP and use a single cert for every pod in the cluster depending on whatever internal attestation logic the CSR admission controller I'm using happens to employ.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can add a detailed proposal for exactly which fields need to be set in order for kube-apiserver to allow authentication via a certificate containing a PodIdentity extension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on discussion in API review, we are going to go with a dedicated PodCertificateRequest type, rather than encoding these details in the existing CertificateSigningRequest spec.request field. This drastically lowers the importance of the PodIdentity extension, since it will only be used by the kube-apiserver-client-pod signer, not every third-party signer that wants to integrate with pod certificates.

following DER-encoded structure:

```go
type podIdentityASN1 struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat off-topic, but... What would it take to network bind these certificates? E.g. assuming we can trust the underlying network to prevent IP spoofing, could we include a node or pod source IP here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's basically it --- we need to write the IP here, have kube-apiserver cross-check it during admission, and then ensure it ends up in the PodIdentity extension for the final cert. Then kube-apiserver can check it during TLS termination.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't even have to be kube-apiserver that validates during admission, IIUC?

E.g. you could use SPIRE's attesting agent to validate the admission request for the cert and reject if out-of-band validations fail, or any other thing.

keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
@sftim
Copy link
Contributor

sftim commented Nov 4, 2023

Is this related to ClusterTrustBundles?

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Here's a bunch of feedback - a mix of style nits and opinions.

keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/kep.yaml Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
@aojea
Copy link
Member

aojea commented Nov 6, 2023

/cc @aojea

@ahmedtd
Copy link
Contributor Author

ahmedtd commented Nov 17, 2023

Is this related to ClusterTrustBundles?

They're intended to work together. It's not needed for the kube-apiserver client certificate use case, but if someone were to build a pod-to-pod mTLS solution on top of this, they would need to use both ClusterTrustBundle and PodCertificate projected volumes.

@aramase
Copy link
Member

aramase commented Jan 19, 2024

/cc


### PodIdentity Extension Node Restriction

The noderestriction admission plugin will check the consistency of PodIdentity extensions in CertificateSigningRequest objects created by `system:node:` identities. The creation request will only be admitted if the PodIdentity extension refers to a pod that is currently running on the node that requested the certificate.
Copy link

@bleggett bleggett Feb 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that this admission plugin is purely a reference impl, and people are free to write their own admission plugins which do more (or less).

...which also means the admission criteria is entirely implementation-dependent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The node restriction plugin is built in to kube-apiserver. It can probably be disabled, but I don't think it is on most distributions?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so the noderestriction admission plugin is the default "floor" and people can always add more strict/granular admission plugins on top. IIUC, that makes sense.

actually been scheduled onto that node.

<<[UNRESOLVED]>>
Is it necessary or desirable to forbid non-kubelet identities from creating CertificateSigningRequest objects that embed a PodIdentity extension?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I can write my own admission plugin and use that instead of the reference impl, I can implement whatever restrictions I like here, which makes it less important how strict or not the reference impl is.

My $0.02 is that the reference impl should restrict to the kubelet identity, at least initially.

Comment on lines 334 to 508
The kubelet supports only a few key generation strategies, each named by a string
that encapsulates both the key type and any parameters necessary (for example,
RSA modulus size). The intention is to offer a a tasting menu of reasonable key
choices, rather than offering a flexibility to a wide variety of parameters.
Key types supported by kubelet are "RSA2048", "RSA3072", "RSA4096", "ECDSAP256",
and "ECDSAP384". If no key type is specified, kubelet defaults to "ECDSAP256".

Copy link

@bleggett bleggett Feb 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can tell you from experience that no matter what algos and param defaults you pick, people are going to want to customize beyond those defaults pretty quickly for various compliance and internal reasons, and "shipping your own custom kubelet" is a bit of a lift to allow that.

In general I think it would be best to minimize the kubelet opinions and lack of flexibility around PKI algos and params.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that Kubelet has to generate the key, so fundamentally if we want to allow the user to exercise option X, then Kubelet needs to implement option X.

However, I'm not sure there really are any other choices to make for private keys. RSA lets you pick the modulus, which is specified here, and ECDSA lets you pick a curve, which is specified here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that Kubelet has to generate the key, so fundamentally if we want to allow the user to exercise option X, then Kubelet needs to implement option X.

Yes, that's a bit unfortunate. I do understand it's an unavoidable limitation of the goals/non-goals here tho.

keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
following DER-encoded structure:

```go
type podIdentityASN1 struct {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't even have to be kube-apiserver that validates during admission, IIUC?

E.g. you could use SPIRE's attesting agent to validate the admission request for the cert and reject if out-of-band validations fail, or any other thing.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 14, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 13, 2024
@soltysh
Copy link
Contributor

soltysh commented Oct 1, 2024

@ahmedtd please don't forget to creating a PRR file (see template in https://github.com/kubernetes/enhancements/blob/master/keps/prod-readiness/template/nnnn.yaml) and answering the questions, feel free to assign me for the PRR approval.

@ahmedtd
Copy link
Contributor Author

ahmedtd commented Oct 2, 2024

@ahmedtd please don't forget to creating a PRR file (see template in https://github.com/kubernetes/enhancements/blob/master/keps/prod-readiness/template/nnnn.yaml) and answering the questions, feel free to assign me for the PRR approval.

I added the yaml file to the PR. However, I don't see any questions in it. Are you referring to the PRR questions in the main KEP body?

@ahmedtd
Copy link
Contributor Author

ahmedtd commented Oct 2, 2024

Capturing some notes from discussion with Jordan:

  • For Alpha, we are unsure about how much we want existing things that understand client certs to understand these new certs. To keep our options open, we will put a random UID as the CN, and not encode groups. In the future, we might decide to encode a meaningful CN and groups.
  • New flag for pod client certificate CA, no sort of fallback logic.
  • On kube-apiserver, continue to use one flag for all client certificate CAs.
  • Use PEM in the PCR.

All of these points are now reflected in the text.

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few comments from PRR review

keps/sig-auth/4317-pod-certificates/README.md Show resolved Hide resolved
* Indicate failure in issuing the certificate by updating the status subresource
of the PodCertificateRequest, adding a "Failed" condition.

PodCertificateRequest validation logic will:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the validation described here is strictly about the apiserver-side validation, not the kubelet. The important distinction is that this kind of validation will happen only during create/update actions.
Below, there's a description of noderestriction admission plugin, which does a consistency check combining the PCR with an actual pod.

compromising a single node only results in compromises of workloads that have
actually been scheduled onto that node.

During alpha, the ability to store PodCertificate objects (and the node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An important note: feature gates are only for the development phase, once this feature reaches GA they will be removed.

keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
### Rollout, Upgrade and Rollback Planning

<!--
This section must be completed when targeting beta to a release.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below sections although not required and won't block alpha are worthy of your time to try to tackle some of the answers. I believe you've already pointed out some issues with scalability after node restart.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This holds, especially the rollout when you have api enablement + 3 various feature gates. Although I believe the overall recommendation will be to turn all at once.

projected:
sources:
- podCertificate:
signerName: "kubernetes.io/kube-apiserver-client-pod"
Copy link
Member

@stlaz stlaz Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's less about the usability, more about the maintenance.

You have to register multiple attributes for the RDNs to start with but:

  • You don't have to write your own ASN.1 parser because this is just DN parsing.
  • You don't have to port your parser into multiple languages
  • You can reuse the RDNs for other purposes and you can combine them at will

I think this might be worth mentioning in alternatives at least.

keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Show resolved Hide resolved

This extension, and the utility functions for manipulating it, are developed for
use by core Kubernetes components, although third parties could make use of
them. The extension should only be changed in backwards-compatible ways.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would a backward-compatible change look like in this case? As per RFC5280 - Section 4.2:

  A certificate-using system MUST reject the certificate if it encounters
   a critical extension it does not recognize or a critical extension
   that contains information that it cannot process.

From Kubernetes perspective, adding an optional field would be considered a backward-compatible change, but the way I understand the above definition blocks this change as that would add information that the older versions of the parsers cannot process.

It's very much possible my interpretation of the RFC is too pedantic. I wonder if there's anyone in the community that would be able to confirm/deny.

Copy link
Contributor Author

@ahmedtd ahmedtd Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, we don't need to mark the extension as critical. The only other content of the certificate will be some random bytes in the Subject.Name field, so it is unlikely that anything bad can happen if you present the cert to an application that doesn't understand the extension (or our custom RDNs, if we go that route).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To close on this, I think we will mark the extension as critical. If we need to evolve the contents of the extension in the future, we will need to do so taking into consideration the behavior of TLS libraries when parsing the extension.

@ahmedtd ahmedtd force-pushed the pod-certificates branch 2 times, most recently from a07b445 to ab42b03 Compare October 3, 2024 22:22
@ahmedtd ahmedtd changed the title KEP-4317: PodIdentity certificates, PodCertificate volumes, and in-cluster kube-apiserver client certificates KEP-4317: Pod Certificates Oct 3, 2024
responsibility for enforcing node restriction on PodCertificateRequests into
kube-apiserver, so that signer implementations do not need to consider it.

### Non-Goals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it's worth mentioning explicitly as a non-goal that this proposal does not cover signing serving certificates for pods. that seems obvious and covered in the pod-to-pod mTLS point, but implicitly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a non-goal for the built-in signer to issue serving certs (although we will probably mark the certs to allow server cert usage; it doesn't hurt anything).

However, it's an explicit goal for third-party signers to be able to use the PodCertificate projected volume machinery to be able to issue serving certificates to pods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a non-goal for the built-in signer to issue serving certs (although we will probably mark the certs to allow server cert usage; it doesn't hurt anything).

I would not expect us to mark these certs as anything but client certs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified in the text.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not expect us to mark these certs as anything but client certs.

Agreed. As discussed on slack, we will only set the client cert usages. This doesn't preclude users from using them as server certs; since they will need to write custom TLS validation code, they can just ignore the lack of the "server" usage bits.

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the PRR for alpha is ok, although I'd still encourage filling in the metrics and the scalability secions

/approve

For Alpha:
* The feature will be gated behind feature flags and alpha API
enablement.
* An e2e test will exercise the feature in order to prevent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't enable alpha feature gates in our regular e2es, so make sure to add them to appropriate test suite which does it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. We had to do the same thing for ClusterTrustBundles.

<!--
Additionally, for Alpha try to enumerate the core package you will be touching
to implement this enhancement and provide the current unit coverage for those
in the form of:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: but the above comment explicitly requires both the directory and coverage 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The coverage tab crashes in Chrome :/ I need to dig up a Firefox install

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can always just run go test -race -cover <package> 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Based on the above analysis, we may want to consider making kube-scheduler aware
of which kubelet versions support PodCertificate projected volumes (and which
key types are supported by version), and try to steer pods towards nodes that
can successfully mount them.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't do that kind of checks anywhere I'm aware of, so I'd advise against it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather make sure to provide sufficient information in the logs or through events which would allow a user to identify such a situation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed that paragraph.

### Rollout, Upgrade and Rollback Planning

<!--
This section must be completed when targeting beta to a release.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This holds, especially the rollout when you have api enablement + 3 various feature gates. Although I believe the overall recommendation will be to turn all at once.

@ahmedtd
Copy link
Contributor Author

ahmedtd commented Oct 8, 2024

@soltysh

This holds, especially the rollout when you have api enablement + 3 various feature gates. Although I believe the overall recommendation will be to turn all at once.

I'm sorry, can you elaborate? Are you saying that I should go ahead and fill this section out?

@ahmedtd ahmedtd force-pushed the pod-certificates branch 2 times, most recently from 0d9057d to 4a5849c Compare October 8, 2024 22:12
Copy link
Member

@enj enj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass.

responsibility for enforcing node restriction on PodCertificateRequests into
kube-apiserver, so that signer implementations do not need to consider it.

### Non-Goals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a non-goal for the built-in signer to issue serving certs (although we will probably mark the certs to allow server cert usage; it doesn't hurt anything).

I would not expect us to mark these certs as anything but client certs.

* A new TLS user-mapping capability in kube-apiserver to understand client
certificates that embed PodIdentity extensions.
* A new PodCertificateRequest signer
(`kubernetes.io/kube-apiserver-client-pod`), shipped in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but I don't think "we will adequately document the feature" is something we need to state in the KEP.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I mean, actually document the signer here, in the KEP itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is? In the section "API Server Pod Client Certificate Approver / Signer".

[ED25519 Specification](https://ed25519.cr.yp.to/) (as implemented by
the golang library crypto/ed25519.Sign).
* Details of the requesting Pod's identity:
* Namespace (indicated by the namespace the PodCertificateRequest is created in).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Namespace UID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, actually, I'm not quite sure here. The namespace is indicated by which namespace the PodCertificateRequest (and pod, and service account) are actually in

I'm not sure it makes sense to embed the UID of the namespace in the PodCertificateRequest, although we could have kubernetes.io/kube-apiserver-client-pod embed the UID in the issued certificate.

Comment on lines +191 to +197
* If the key is an RSA key, then the signature is over the ASCII bytes of the
pod UID, using RSASSA-PKCS1-V1_5-SIGN from RSA PKCS #1 v1.5 (as implemented
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need a timestamp or similar to prevent replay?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how important this is --- the primary security guarantees here come from ensuring that only the correct kubelet can request a certificate for a given pod. The proof of possession is just to prevent (implausible) accidents, as far as I can tell from https://www.rfc-editor.org/rfc/rfc4211#appendix-C

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And https://www.rfc-editor.org/rfc/rfc4211#section-4 only says that the proof of possession should sign over some representation of the identity being requested. It doesn't include a timestamp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay per my reading of that RFC, I think the pod UID is sufficient when combined with all the other node admission checks (i.e. we know that this exact pod is scheduled on this exact node with this exact SA in the same namespace).

If a different node tried to replay the PoP, it would fail because that pod isn't scheduled to it.

keps/sig-auth/4317-pod-certificates/README.md Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
keps/sig-auth/4317-pod-certificates/README.md Outdated Show resolved Hide resolved
authenticates to kube-apiserver using mTLS.

I add a new projected volume to my pod, containing the information that
client-go expects at a well-known path. I set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems really awkward to require this very manual process, especially with magic well known paths on disk in play.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, we can think about automatically injecting these as a replacement for the existing kube-apiserver-access volumes. I didn't want to cover that in Alpha because there's already enough work just landing the machinery without thinking about driving adoption.

However, I think there's a larger conversation to be had on this topic in SIG Auth. My personal thoughts are:

  • Automatically granting API access credentials by default is probably a mistake. Most workloads in a Kubernetes cluster do not actually need to talk to the Kubernetes API.
  • Mutating admission controllers (especially webhooks) cause a lot of problems. In general, workload authors expect the spec they write to be the spec that gets loaded into the cluster.
  • In a perfect world, I would make workload authors who want Kubernetes API access opt in by inserting a token (or pod certificate volume).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I am fine with this as-is, but lets discuss further in a future meeting.


- `<package>`: `<date>` - `<test coverage>`

##### Integration tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect we could manually run enough of the kcm/kubelet pieces during integration tests to exercise the various node restriction and new signer bits.

@soltysh
Copy link
Contributor

soltysh commented Oct 9, 2024

@soltysh

This holds, especially the rollout when you have api enablement + 3 various feature gates. Although I believe the overall recommendation will be to turn all at once.

I'm sorry, can you elaborate? Are you saying that I should go ahead and fill this section out?

I mean that it would be nice to see this section filled in, but it's not a hard requirement for alpha. Although usually going through that kind of exercise allows unlocking potential problems, which then leads to better designs of the feature.

Copy link
Member

@enj enj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor stuff, lgtm.

* It uses a CA specified by a new pair of private key / root certificate flags
(although implementations are free to set this flag to point at the CA as
another signer).
* It issues certificates that are valid for 24 hours, and tells Kubelet to begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if a client asks for a shorter TTL? What is the min TTL this signer supports?

Copy link
Contributor Author

@ahmedtd ahmedtd Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I will update so that it issues certs for min(maxExpirationSeconds, 24hours).

Edit: And we can rely on the new 1 hour minimum for maxExpirationSeconds.

Comment on lines +262 to +263
* If the creator of the PodCertificateRequest left `maxExpirationSeconds` at 0,
then kube-apiserver will default it to 24 hours.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow-up -> decide if we need to support both a max exp seconds per signer and a default per signer.

* Ensure that all fields in PodCertificateRequest.Status are immutable once the
request has been issued, denied, or failed.

The `maxExpirationSeconds` field will have the following logic:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably need a min for this field -> does not make sense to let clients ask for 1 second certs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Updated with a min of 1 hour. It should be safe to drop that down in the future.

@enj
Copy link
Member

enj commented Oct 9, 2024

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 9, 2024
@enj
Copy link
Member

enj commented Oct 9, 2024

@soltysh approved PRR here #4318 (review) but looks like the bot must have missed the GH event, tagging it manually.

@enj enj added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 9, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: ahmedtd, enj

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 448d36a into kubernetes:master Oct 10, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-review Categorizes an issue or PR as actively needing an API review. approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
Archived in project
Archived in project
Development

Successfully merging this pull request may close these issues.