Configure external CA #10941

Open
nicolas-goudry opened this issue Feb 22, 2024 · 11 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@nicolas-goudry
Contributor

I need to set up a cluster with Kubespray using an external CA.

Since this process is not documented, I searched and read this issue.

I managed to make this work by using the following custom role associated with a playbook:

defaults/main.yaml

use_external_ca: false
root_ca_cert: ""
root_ca_key: ""
root_ca_key_passphrase: ""
etcd_cert_cacert_file: "ca.pem"
etcd_cert_key_file: "ca-key.pem"

tasks/main.yaml

---
- name: external-ca | Ensure root CA
  fail:
    msg: "Missing root CA certificate and/or key (root_ca_cert / root_ca_key)"
  when: root_ca_cert is not defined or root_ca_cert|length == 0 or root_ca_key is not defined or root_ca_key|length == 0
  tags: external-ca

- name: external-ca | Install cryptography Python module
  pip:
    name:
      - cryptography
  register: result
  retries: 3
  delay: 10
  until: result is success

- name: external-ca > etcd | Set etcd cert dir
  set_fact:
    etcd_cert_dir: "{{ kube_cert_dir }}/etcd"
    etcd_cert_cacert_file: "ca.crt"
    etcd_cert_key_file: "ca.key"
  when: etcd_deployment_type == "kubeadm"
  tags: external-ca

- name: external-ca > etcd | Create certs directory
  file:
    path: "{{ etcd_cert_dir }}"
    state: directory

- name: external-ca > etcd | Generate private key
  community.crypto.openssl_privatekey:
    path: "{{ etcd_cert_dir }}/{{ etcd_cert_key_file }}"
    type: RSA
    size: "{{ certificates_key_size }}"
  tags: external-ca

- name: external-ca > etcd | Generate CSR
  community.crypto.openssl_csr_pipe:
    privatekey_path: "{{ etcd_cert_dir }}/{{ etcd_cert_key_file }}"
    common_name: etcd-ca
    use_common_name_for_san: false
    basic_constraints:
      - "CA:TRUE"
    basic_constraints_critical: true
  register: etcd_ca_csr
  tags: external-ca

- name: external-ca > etcd | Generate certificate (root CA without passphrase)
  community.crypto.x509_certificate:
    provider: ownca
    path: "{{ etcd_cert_dir }}/{{ etcd_cert_cacert_file }}"
    csr_content: "{{ etcd_ca_csr.csr }}"
    ownca_content: "{{ root_ca_cert }}"
    ownca_privatekey_content: "{{ root_ca_key }}"
  when: root_ca_key_passphrase is defined and root_ca_key_passphrase|length == 0
  tags: external-ca

- name: external-ca > etcd | Generate certificate (root CA with passphrase)
  community.crypto.x509_certificate:
    provider: ownca
    path: "{{ etcd_cert_dir }}/{{ etcd_cert_cacert_file }}"
    csr_content: "{{ etcd_ca_csr.csr }}"
    ownca_content: "{{ root_ca_cert }}"
    ownca_privatekey_content: "{{ root_ca_key }}"
    ownca_privatekey_passphrase: "{{ root_ca_key_passphrase }}"
  when: root_ca_key_passphrase is defined and root_ca_key_passphrase|length > 0
  tags: external-ca

- name: external-ca > kube-apiserver | Create certs directory
  file:
    path: "{{ kube_cert_dir }}"
    state: directory

- name: external-ca > kube-apiserver | Generate private key
  community.crypto.openssl_privatekey:
    path: "{{ kube_apiserver_client_key }}"
    type: RSA
    size: "{{ certificates_key_size }}"
  tags: external-ca

- name: external-ca > kube-apiserver | Generate CSR
  community.crypto.openssl_csr_pipe:
    privatekey_path: "{{ kube_apiserver_client_key }}"
    common_name: kubernetes
    use_common_name_for_san: true
    basic_constraints:
      - "CA:TRUE"
    basic_constraints_critical: true
    key_usage:
      - digitalSignature
      - keyEncipherment
      - keyCertSign
    key_usage_critical: true
  register: kube_ca_csr
  tags: external-ca

- name: external-ca > kube-apiserver | Generate certificate (root CA without passphrase)
  community.crypto.x509_certificate:
    provider: ownca
    path: "{{ kube_apiserver_client_cert }}"
    csr_content: "{{ kube_ca_csr.csr }}"
    ownca_content: "{{ root_ca_cert }}"
    ownca_privatekey_content: "{{ root_ca_key }}"
  when: root_ca_key_passphrase is defined and root_ca_key_passphrase|length == 0
  tags: external-ca

- name: external-ca > kube-apiserver | Generate certificate (root CA with passphrase)
  community.crypto.x509_certificate:
    provider: ownca
    path: "{{ kube_apiserver_client_cert }}"
    csr_content: "{{ kube_ca_csr.csr }}"
    ownca_content: "{{ root_ca_cert }}"
    ownca_privatekey_content: "{{ root_ca_key }}"
    ownca_privatekey_passphrase: "{{ root_ca_key_passphrase }}"
  when: root_ca_key_passphrase is defined and root_ca_key_passphrase|length > 0
  tags: external-ca

- name: external-ca > front-proxy | Generate private key
  community.crypto.openssl_privatekey:
    path: "{{ kube_cert_dir }}/front-proxy-ca.key"
    type: RSA
    size: "{{ certificates_key_size }}"
  tags: external-ca

- name: external-ca > front-proxy | Generate CSR
  community.crypto.openssl_csr_pipe:
    privatekey_path: "{{ kube_cert_dir }}/front-proxy-ca.key"
    common_name: front-proxy-ca
    use_common_name_for_san: true
    basic_constraints:
      - "CA:TRUE"
    basic_constraints_critical: true
    key_usage:
      - digitalSignature
      - keyEncipherment
      - keyCertSign
    key_usage_critical: true
  register: kube_front_proxy_ca_csr
  tags: external-ca

- name: external-ca > front-proxy | Generate certificate (root CA without passphrase)
  community.crypto.x509_certificate:
    provider: ownca
    path: "{{ kube_cert_dir }}/front-proxy-ca.crt"
    csr_content: "{{ kube_front_proxy_ca_csr.csr }}"
    ownca_content: "{{ root_ca_cert }}"
    ownca_privatekey_content: "{{ root_ca_key }}"
  when: root_ca_key_passphrase is defined and root_ca_key_passphrase|length == 0
  tags: external-ca

- name: external-ca > front-proxy | Generate certificate (root CA with passphrase)
  community.crypto.x509_certificate:
    provider: ownca
    path: "{{ kube_cert_dir }}/front-proxy-ca.crt"
    csr_content: "{{ kube_front_proxy_ca_csr.csr }}"
    ownca_content: "{{ root_ca_cert }}"
    ownca_privatekey_content: "{{ root_ca_key }}"
    ownca_privatekey_passphrase: "{{ root_ca_key_passphrase }}"
  when: root_ca_key_passphrase is defined and root_ca_key_passphrase|length > 0
  tags: external-ca

To better explain:

  • play only runs if use_external_ca is true
  • play fails if custom variables root_ca_cert and root_ca_key are missing
  • play installs cryptography module which is needed by some community.crypto modules
  • play takes into account that etcd certificates directory changes depending on the value of etcd_deployment_type
  • for etcd, kubernetes and front-proxy, play does the following:
    • generates an RSA private key
    • generates a CSR
    • generates a certificate signed by the root CA, using a passphrase if provided
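
For readers who prefer to see the three per-component steps outside Ansible, here is a rough openssl equivalent for the etcd CA. It is only a sketch: the throwaway "external" CA, the file names, the 30-day validity and the minimal config file are assumptions made for illustration, while the role itself relies on the community.crypto modules shown above.

```shell
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# Minimal OpenSSL config so the sketch does not depend on the system openssl.cnf.
cat > ca.cnf <<'EOF'
[req]
distinguished_name = dn
prompt = no
[dn]
CN = placeholder
[v3_ca]
basicConstraints = critical,CA:TRUE
keyUsage = critical,digitalSignature,keyEncipherment,keyCertSign
EOF

# Throwaway stand-in for the externally provided CA (root_ca_cert / root_ca_key).
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -config ca.cnf -extensions v3_ca -subj "/CN=example-external-ca" \
  -keyout root-ca.key -out root-ca.crt 2>/dev/null

# 1. Generate an RSA private key for the component CA.
openssl genrsa -out etcd-ca.key 2048 2>/dev/null

# 2. Generate a CSR for that key.
openssl req -new -key etcd-ca.key -config ca.cnf -subj "/CN=etcd-ca" -out etcd-ca.csr

# 3. Sign the CSR with the external CA, marking the result as a CA certificate.
openssl x509 -req -in etcd-ca.csr -days 30 \
  -CA root-ca.crt -CAkey root-ca.key -CAcreateserial \
  -extfile ca.cnf -extensions v3_ca -out etcd-ca.crt 2>/dev/null

# The new intermediate CA should verify against the external CA.
openssl verify -CAfile root-ca.crt etcd-ca.crt
```

If the external CA key is encrypted, `openssl x509` accepts `-passin` to supply the passphrase, which corresponds to the "with passphrase" variant of the tasks.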

Notes:

  • even with the intermediate CAs generated, the cert_management value must still be set to script in order to generate all the required certificates. Those certificates will be signed by the intermediate CAs generated by the role, since the certificate management scripts don’t try to regenerate the CAs if they already exist.
  • setting cert_management to none would require generating not only the CAs, but also all the certificates.

While this is working fine, I later found that communicating with the API server with curl would fail from any of the control plane hosts (didn’t try from node hosts):

$ curl https://127.0.0.1:6443/version
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

However, if I provide the --cacert flag pointing to /etc/kubernetes/ssl/ca.crt it works. So I added these tasks to the role:

- name: external-ca | add kube-apiserver CA to trusted CA dir
  copy:
    src: "{{ kube_apiserver_client_cert }}"
    dest: "{{ ca_cert_path }}"
    remote_src: true
    mode: "0640"
  register: kube_ca_cert

- name: external-ca | update ca-certificates
  command: update-ca-certificates
  when: kube_ca_cert.changed

This way, the kubernetes CA generated and signed by the root CA is trusted by the host, and the above curl command works.


The main issue is that I’m using Ansible’s kubernetes.core.k8s_info module to interact with the cluster and without adding the kubernetes CA to the host trusted CAs it fails with:

Max retries exceeded with url: /version (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get issuer certificate

However, if I let Kubespray generate all CAs and certificates, I do not have this issue. How is that possible? I checked without running my play and I don’t see the Kubernetes CA added to host trusted CAs so I don’t understand how it could work…


As a side note, I think it would be great to have some documentation about the process of using an external CA. If the maintainers think this would be a great addition, I would like to work on this, but I don’t know where to start so some guidance would be greatly appreciated.

Finally, I do think it would be great to include something like the role I wrote directly into Kubespray to allow users to set up their clusters with an external CA like I did. Again, I’m OK to work on this but don’t really know where to start.

I guess my two last points may need to live in other issues, you tell me.

@MrFreezeex
Member

MrFreezeex commented Feb 28, 2024

Hi @nicolas-goudry, maybe I didn't fully understand your context, but I am a bit confused about your motivation: why would you generate certificates outside of kubeadm/kubespray but still have kubespray handle the heavy work of generating certs with a new role?

I would understand if you did something along the lines of "here is an existing intermediate CA that integrates with my PKI, please kubespray use that". Although I still think this might be a disputed idea, since Kubernetes doesn't handle cert revocation, which means an isolated PKI for each Kubernetes cluster might still be the best choice...

As for trusting the Kubernetes PKI, AFAIK you shouldn't need this. Everywhere you have some kind of token or client cert to contact the Kubernetes API server, there is also the associated CA so that clients like kubectl can verify the chain of trust. So, for instance, the failing curl command you posted is pretty common in the Kubernetes world, even more so when you have a distinct PKI per Kubernetes cluster, which is also pretty common and even somewhat recommended IMO. So in theory this should work like the regular certs that kubespray/kubeadm generate; however, you need to place them at the correct places in the filesystem and make sure that any tools you bypass because you are handling the certificates yourself do not prevent Kubernetes from being functional. NGL this sounds kinda hard, and you would probably need to make small changes/conditions all over the place in kubespray.

So in a nutshell, my opinion is that this needs to be properly motivated to be integrated, as I fear it might add too much complexity to Kubespray for little benefit and might even encourage bad practices among users who see this and start using it... I hope you will not find this comment too harsh, as this already looks like a lot of work considering the amount of detail you put in your issue.

@nicolas-goudry
Contributor Author

nicolas-goudry commented Feb 29, 2024

Hi @MrFreezeex, first of all thanks for your thorough answer. Don’t worry, no hard feelings about your opinion, I believe we are here to discuss and debate in a sane way. Everyone has their own point of view and I respect that.

Regarding the motivation, in a nutshell I spin up clusters for my customers using Kubespray. Some of them have expressed the desire to use their own CA in order to sign their cluster’s certificates.

BTW I use the term “root CA” to describe the CA used to sign the intermediate cluster CAs, but it will likely be an intermediate CA generated by the customer. It would end up being a trust chain looking something like that:

root CA                         # customer-owned, not shared with me
└── intermediate CA             # customer-owned, shared with me and provided to custom role
    ├── kubernetes CA           # generated by custom role
    │   └── kubernetes certs    # generated by kubeadm (right?)
    ├── etcd-ca CA              # generated by custom role
    │   └── etcd certs          # generated by Kubespray (Bash script)
    └── front-proxy-ca CA       # generated by custom role
        └── front-proxy certs   # generated by kubeadm (right?)

TBH, I don’t understand myself why someone would need to do such a thing, for all the reasons you expressed, but the need is here so I must comply… If you strongly believe this doesn’t have its right place in Kubespray, I would totally understand that. In the end it’s only one user needs, and I would definitely be ok with my need being fulfilled by the custom role I crafted. I was just wondering if this would be a useful addition to Kubespray.

Now, about the SSL verification issue, I used curl to demonstrate the issue in a simple way, but I know the behavior is the same with a plain Kubespray cluster installation. The real issue is with the kubernetes Ansible module as it behaves differently with Kubespray/kubeadm certificates handling than with my custom role. For a reason that I can’t explain, the following doesn’t work when CA generation is handled by my role:

ansible -u <remote-user> -b --become-user=root -i <inventory> -m pip -a "name=kubernetes" <node>
ansible -u <remote-user> -b --become-user=root -i <inventory> -m kubernetes.core.k8s_info -a "kind=Pod namespace=kube-system" <node>

AFAICT, the module is using the default kubeconfig path, which is /root/.kube/config for the root user (since I’m using become). What doesn’t make sense is that if I use this kubeconfig with kubectl, it works seamlessly!

I compared the certificates with both methods and found out that with plain Kubespray (i.e. kubeadm) the kubernetes CA is self-signed. With my custom role, the kubernetes CA is signed by the “root CA” (which is expected since it’s the whole point of the role). I believe that the issue lies there, but even if I add the “root CA” (which is self-signed) to the trusted CAs of the control plane hosts, the error persists.

If I understand PKI correctly, this shouldn’t happen, since the root CA in the chain is trusted, all subsequent CAs/certs should be trusted as well… Or am I wrong about this?

I checked that the “root CA” is indeed trusted:

$ sudo trust list --filter=ca-anchors | grep -i external -A2 -B2
pkcs11:id=%AD%BD%98%7A%34%B4%26%F7%FA%C4%26%54%EF%03%BD%E0%24%CB%54%1A;type=cert
    type: certificate
    label: AddTrust External CA Root
    trust: anchor
    category: authority

In the end, I think this issue is beyond the scope of Kubespray. Feel free to close it if you want to. I’ll try to get some help on SO, linking back to here. I’d still appreciate it if you have any insights on this matter though 🙂

@MrFreezeex
Member

MrFreezeex commented Feb 29, 2024

> BTW I use the term “root CA” to describe the CA used to sign the intermediate cluster CAs, but it will likely be an intermediate CA generated by the customer.
> TBH, I don’t understand myself why someone would need to do such a thing, for all the reasons you expressed, but the need is here so I must comply… If you strongly believe this doesn’t have its right place in Kubespray, I would totally understand that. In the end it’s only one user needs, and I would definitely be ok with my need being fulfilled by the custom role I crafted. I was just wondering if this would be a useful addition to Kubespray.

Ah ok, this makes (a bit) more sense, thanks for the additional explanation. So I think it kinda depends on the impact on kubespray: if it is really only the changes that you sent plus possibly a few more things, I would tend to think it should be acceptable. I would put some disclaimers in a few places to suggest that people should try to avoid using this though...

About your issue, that is weird indeed. I would expect the kubernetes Ansible module to use the config that you are pointing out; maybe you could try overriding some of the parameters here, like validate_certs, as a means to check what's happening. FYI we mostly do not use kubernetes.core in kubespray, as we have our own equivalent as of right now, so I am not certain what it does by default tbh. Maybe you could also try to see what kubectl command is really executed under the hood.

Apart from that, from what I remember the PKI seems right to me, but don't bet on this: my knowledge of all the Kubernetes PKIs is not very fresh!

> If I understand PKI correctly, this shouldn’t happen, since the root CA in the chain is trusted, all subsequent CAs/certs should be trusted as well… Or am I wrong about this?

Not sure about your specific case, but AFAIK, to verify that a cert is trusted, the client must validate the full chain from that cert up to a trusted one (and thus know all the certs along the way), so some of your problems might be related to that.

@nicolas-goudry
Contributor Author

Ok, I’ll find some time to try and include this into Kubespray with as few changes as possible and massive warnings.


I did some tests today and I think I understand why the kubernetes.core module is not working in this case:

First of all, as a reminder, the cluster CA certificate ends up (no matter what) in the kubeconfig file under clusters[].cluster.certificate-authority-data.
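
To see which CA a given kubeconfig actually carries, the embedded certificate can be decoded and inspected. The two helper functions below are hypothetical names introduced for this sketch; they assume the kubeconfig embeds the CA as certificate-authority-data (not as a file path), that the first cluster entry is the one of interest, and that `base64 -d` (GNU coreutils) and openssl are available.

```shell
# Print the PEM certificate embedded in the first certificate-authority-data
# field of the kubeconfig given as $1.
kubeconfig_ca_pem() {
  awk '$1 == "certificate-authority-data:" { print $2; exit }' "$1" | base64 -d
}

# Succeed when that certificate's subject equals its issuer, which is what a
# self-signed CA (the kubeadm default) looks like. A CA signed by another CA
# has a different issuer. Comparing names is a heuristic, not a signature check.
kubeconfig_ca_is_self_issued() {
  pem="$(kubeconfig_ca_pem "$1")"
  subject="$(printf '%s\n' "$pem" | openssl x509 -noout -subject | sed 's/^subject= *//')"
  issuer="$(printf '%s\n' "$pem" | openssl x509 -noout -issuer | sed 's/^issuer= *//')"
  [ "$subject" = "$issuer" ]
}
```

For example, `kubeconfig_ca_is_self_issued /root/.kube/config && echo "self-issued CA" || echo "CA issued by another CA"` gives a quick hint about which of the two situations a host is in.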

When we let kubeadm generate its PKI, the kubernetes CA is self-signed. But since it is advertised to clients through the kubeconfig certificate-authority-data field, it is implicitly trusted even though it’s self-signed.

Now, when the kubernetes CA is signed by another CA, nothing changes in the kubeconfig file: only the (now intermediate) kubernetes CA is advertised through the certificate-authority-data field. This seems to be the issue. Clients now only get part of the trust chain and therefore may not trust the connection since they cannot verify the whole chain.

I believe (well, it’s more of a guess here) that kubectl is “lazy” and doesn’t check the whole chain but only looks at the end of the chain (kubernetes CA -> apiserver cert) to validate the connection. Other tools, like the kubernetes.core modules, seem to expect the whole chain, as specified in the documentation of the ca_cert parameter:

Path to a CA certificate used to authenticate with the API. The full certificate chain must be provided to avoid certificate validation errors.
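
The difference between the two behaviors can be reproduced locally with openssl. The sketch below builds a throwaway three-level chain (all names, file paths and validity periods are assumptions for illustration) and verifies the leaf three ways. As far as I understand, OpenSSL-based clients such as Python's ssl module behave like the first check by default, while the `-partial_chain` check approximates a client that accepts any CA it was explicitly given as a trust anchor, which would match the kubectl guess above.

```shell
set -eu
cd "$(mktemp -d)"

# Minimal config so the sketch does not depend on the system openssl.cnf.
cat > ca.cnf <<'EOF'
[req]
distinguished_name = dn
prompt = no
[dn]
CN = placeholder
[v3_ca]
basicConstraints = critical,CA:TRUE
keyUsage = critical,digitalSignature,keyCertSign
EOF

# Self-signed stand-in for the customer-provided CA.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 -config ca.cnf \
  -extensions v3_ca -subj "/CN=example-root-ca" \
  -keyout root.key -out root.crt 2>/dev/null

# Intermediate "kubernetes" CA signed by the root.
openssl req -new -newkey rsa:2048 -nodes -config ca.cnf -subj "/CN=kubernetes" \
  -keyout kube-ca.key -out kube-ca.csr 2>/dev/null
openssl x509 -req -in kube-ca.csr -days 30 -CA root.crt -CAkey root.key \
  -CAcreateserial -extfile ca.cnf -extensions v3_ca -out kube-ca.crt 2>/dev/null

# Leaf certificate (stand-in for the apiserver cert) signed by the intermediate.
openssl req -new -newkey rsa:2048 -nodes -config ca.cnf -subj "/CN=kube-apiserver" \
  -keyout apiserver.key -out apiserver.csr 2>/dev/null
openssl x509 -req -in apiserver.csr -days 30 -CA kube-ca.crt -CAkey kube-ca.key \
  -CAcreateserial -out apiserver.crt 2>/dev/null

check() { if openssl verify "$@" >/dev/null 2>&1; then echo ok; else echo failed; fi; }

# Only the intermediate is known, as with certificate-authority-data alone.
intermediate_only="$(check -CAfile kube-ca.crt apiserver.crt)"

# Intermediate plus root, i.e. the full chain bundle.
cat kube-ca.crt root.crt > bundle.crt
full_bundle="$(check -CAfile bundle.crt apiserver.crt)"

# Only the intermediate, but accepted as a trust anchor on its own.
partial_chain="$(check -partial_chain -CAfile kube-ca.crt apiserver.crt)"

echo "intermediate only: $intermediate_only"
echo "full bundle:       $full_bundle"
echo "partial chain:     $partial_chain"
```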

Therefore, in order to fix this, I either have to:

  • add the kubernetes CA to the host’s trusted CAs
  • provide kubernetes.core modules with the whole certificate chain bundle:

# On some host
cat /etc/kubernetes/ssl/ca.crt /wherever/is/located/customer/provided/ca/cert | sudo tee /etc/ssl/certs/kubernetes-ca-bundle.crt

# On controller
ansible -u <remote-user> -b --become-user=root -i <inventory> -m kubernetes.core.k8s_info -a "kind=Pod namespace=kube-system ca_cert=/etc/ssl/certs/kubernetes-ca-bundle.crt" <node>
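
One pitfall when concatenating PEM files with plain cat: if a file lacks a trailing newline, its END marker gets glued to the next BEGIN marker and the bundle may no longer parse. The helper below is a small sketch that guards against that; `make_ca_bundle` and `count_certs` are hypothetical names introduced here, not part of Kubespray or any library.

```shell
# Write all certificate files given after the first argument into the bundle
# named by the first argument, in the order given.
make_ca_bundle() {
  out="$1"
  shift
  : > "$out"
  for cert in "$@"; do
    # awk re-emits every line with a trailing newline, so a file without a
    # final newline cannot glue its END marker to the next BEGIN marker.
    awk '{ print }' "$cert" >> "$out"
  done
}

# Print how many certificates a PEM bundle contains.
count_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}
```

After building the bundle, `count_certs` on it should report one certificate per input file (two in the example above: the kubernetes CA and the customer-provided CA).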

I noticed that Kubespray uses its own module to interact with the cluster, I’ll give this a try later on. Just to satisfy my curiosity: why did you go down this path instead of using the “official” Ansible module for Kubernetes? Is it because the module didn’t exist yet when Kubespray needed it?

@MrFreezeex
Member

> I noticed that Kubespray uses its own module to interact with the cluster, I’ll give this a try later on. Just to satisfy my curiosity: why did you go down this path instead of using the “official” Ansible module for Kubernetes? Is it because the module didn’t exist yet when Kubespray needed it?

I was not around at that time, but I suspect yes, as it seems it was first introduced in 2015... There are more details on a potential shift to the "official" modules here FYI: #10696

@VannTen
Contributor

VannTen commented Apr 4, 2024

> I noticed that Kubespray uses its own module to interact with the cluster, I’ll give this a try later on. Just to satisfy my curiosity: why did you go down this path instead of using the “official” Ansible module for Kubernetes? Is it because the module didn’t exist yet when Kubespray needed it?

Probably that.
Also, kubernetes.core.k8s requires installing Python packages on the managed hosts, not only on the Ansible control node, and kubespray does not have the infrastructure to do that in a fine-grained way (yet, I'm working on it for #10701).

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 3, 2024
@nicolas-goudry
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 3, 2024
@k8s-triage-robot

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 1, 2024
@nicolas-goudry
Contributor Author

/remove-lifecycle stale
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 1, 2024
@rdalbuquerque

Hi @nicolas-goudry and @MrFreezeex !

I have a scenario where I need to create an on-premises, secondary, standby k8s cluster in a backup site; it would be a mirror of the primary on-premises k8s cluster.

There are a few solutions to help replicate/migrate workloads and objects from one cluster to another, but I would also like to switch between clusters only by switching the DNS entry of the apiserver.
That's why I think it would be beneficial in such a scenario for both clusters to share the same CA; this way I could switch between clusters just by changing the DNS entry.

Please let me know if there are better options to achieve my goal.

Thanks in advance!
