Provide proper certificates for kube-scheduler and kube-controller-manager #2244
@FrediWeber thank you for logging the ticket. We had a long discussion with a user on why we are not signing these for kubeadm, and you can read more about this here: IIUC, one undesired side effect is that if we start doing that, our HTTPS probes will fail, as the Pod probe API does not support signed certificates for HTTPS (only self-signed). We could work around that using a "command" probe that is cert/key aware, but this is difficult as the component images are "distroless" (no shell, no tools). So maybe one day we can support that if core k8s supports it properly.
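For context, the liveness probes that kubeadm generates for these static Pods are httpGet checks roughly like the sketch below (exact values vary by version); an httpGet probe with scheme HTTPS does not verify the serving certificate and cannot present a client certificate, which is the limitation referred to above.

```yaml
# Rough sketch of a kubeadm-style liveness probe for kube-scheduler; values are illustrative.
# HTTPS httpGet probes skip certificate verification and cannot pass client certs,
# which is why self-signed serving certificates "just work" here.
livenessProbe:
  httpGet:
    host: 127.0.0.1
    path: /healthz
    port: 10259
    scheme: HTTPS
  initialDelaySeconds: 10
  timeoutSeconds: 15
  failureThreshold: 8
```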
@neolit123 Thank you very much for your fast response and the clarifications. If I understand it correctly, the issue kubernetes/kubernetes#80063 is more about mapping the whole PKI dir into the container, external PKIs and shorter renewal intervals. I read a little bit about health checks with HTTPS in kubernetes/kubernetes#18226 and https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/.
I don't see any security implications in just mapping the signed certificates and corresponding keys for kube-scheduler and kube-controller-manager. The only downside would be that the certificate rotation would have to be managed. On the other hand, this is already the case for other certificates AFAIK.
I'm not so sure about this and I haven't tried it. My understanding is that if the server no longer has self-signed certificates, it would reject any client connections over HTTPS that do not pass authentication.
That is true. However, the discussion there was also about the fact that today users can customize their kube-scheduler and KCM deployments via kubeadm to enable the use of custom signed serving certificates for these components if they want their metrics and health checks to be accessible over "true" TLS (i.e. pass the flags and mount the certificates using extraArgs and extraVolumes under ClusterConfiguration). Given the requirement that kubeadm would then manage these extra certificates for renewal, I'm leaning towards -1 initially, but I would like to get feedback from others too.
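For reference, a minimal sketch of that kind of customization (the certificate file names and the host directory are illustrative; the certificates themselves would have to be created and renewed outside of kubeadm):

```yaml
# Sketch only: assumes serving certs already exist under /etc/kubernetes/component-certs
# on each control-plane node; kubeadm does not create or renew them.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    tls-cert-file: /etc/kubernetes/component-certs/kube-controller-manager.crt
    tls-private-key-file: /etc/kubernetes/component-certs/kube-controller-manager.key
  extraVolumes:
    - name: kcm-serving-certs
      hostPath: /etc/kubernetes/component-certs
      mountPath: /etc/kubernetes/component-certs
      readOnly: true
      pathType: DirectoryOrCreate
scheduler:
  extraArgs:
    tls-cert-file: /etc/kubernetes/component-certs/kube-scheduler.crt
    tls-private-key-file: /etc/kubernetes/component-certs/kube-scheduler.key
  extraVolumes:
    - name: scheduler-serving-certs
      hostPath: /etc/kubernetes/component-certs
      mountPath: /etc/kubernetes/component-certs
      readOnly: true
      pathType: DirectoryOrCreate
```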
Long term, I would love to see the ability to leverage the certificates API to do automated request and renewal of serving certificates for kube-scheduler and kube-controller-manager, similar to the work that is being done to enable this support for the kubelet (https://github.com/kubernetes/enhancements/blob/master/keps/sig-auth/20190607-certificates-api.md).
That was my understanding. Then again, we do serve the kube-apiserver on HTTPS and its probe does not have/pass certificates, so perhaps it would just work.
But again, the problem is that this is yet another set of certificates that kubeadm has to manage during renewal, and must consider in our "copy certificates" functionality for HA support. It is not a strict requirement, and kubeadm already supports it for users that want to do that using extraArgs. We have a similar case for the kubelet serving certificate, which is "self-signed". I'd say, at minimum, it would be worthy of an enhancement proposal (KEP).
I tried to follow the latest iterations of the CSR API closely, but I have not seen discussions around CSRs for the serving certificates of these components via the KCM CA. My guess would be that there might be some sort of a blocker for doing that, given a lot of planning went into v1 of the API.
Would it be okay with you if I started the process?
For a feature that is already possible via the kubeadm config/API, the benefits need to justify the maintenance complexity. To me it always seems better to first collect some support (+1s) on the idea before proceeding with the KEP...
What if kube-scheduler and kube-controller-manager managed their front-facing server certificates themselves with the certificates API?
Kubeadm wouldn't have to do anything, and there would still be the possibility to provide your own certificates if needed.
This could work, but I guess we would have to own the source code and container image for this new controller. BTW, does the kube-scheduler even support /metrics? For 1.18.0 it just reports:
KCM, on the other hand, reports what I expected to see:
I just double-checked it.
Would you care to log a ticket for that in kubernetes/kubernetes and tag it with
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
I was trying to set up metrics scraping with Prometheus and am pretty confused about the current state of things. Is there a recommended way to monitor kube-scheduler (S) and kube-controller-manager (CM) metrics? prometheus-operator/kube-prometheus#718 Prometheus is running on a node different from the master nodes and is expected to scrape S/CM metrics from multiple master nodes. Two issues:
Authentication already happens via a service account bearer token. I can create another issue, but first I'd prefer to get some feedback, as the current issue is related.
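As an illustration, the kind of scrape job being discussed might look roughly like the sketch below (target addresses, port and token path are placeholders); with self-signed serving certificates, skipping verification is currently the only option.

```yaml
# Sketch of a possible Prometheus scrape job; targets and paths are placeholders.
scrape_configs:
  - job_name: kube-scheduler
    scheme: https
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    tls_config:
      # With a CA-signed serving certificate this could become ca_file: <cluster CA>;
      # today the self-signed certificate forces skipping verification.
      insecure_skip_verify: true
    static_configs:
      - targets: ['10.1.1.1:10259', '10.1.1.2:10259']
```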
You can sign certificates for a user that is authorized to access the /metrics endpoints.
See https://kubernetes.io/docs/reference/access-authn-authz/rbac/ and the docs on creating certificates: you can then feed such a certificate to a TLS client that tries to access the endpoint. Alternatively, for the legacy behavior of insecure metrics, you can grant the user
EDIT: sorry, I missed that part. In that case there is likely a lack of authz.
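A minimal sketch of the RBAC side of that (the name "metrics-reader" is illustrative; the subject would be the identity in the signed client certificate or the service account used by the scraper):

```yaml
# Sketch: allow a dedicated identity to GET the non-resource /metrics endpoint.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader
rules:
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metrics-reader
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: metrics-reader
```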
I think you didn't understand me. To access the metrics endpoint, three things must happen:
You are talking about (3), but this is the only part that works. Parts (1) and (2) are broken.
And then propagate to configs via
That doesn't work well in my case. I have an interface facing the internet (e.g. 80.1.1.1, 80.1.1.2, ...) on the nodes. Binding to 0.0.0.0 also binds to 80.1.1.x, and the metrics become available over the internet. I can stop using kubeadm and manually fix the manifests to bind to 10.1.1.x (a different value on each node), but this is most likely going to break components talking to S/CM, because AFAIU it is not possible to bind to both 127.0.0.1 and 10.1.1.x, and because of (2).
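For concreteness, the per-node change being described would look roughly like this (sketch; 10.1.1.1 is a placeholder for the node's internal IP, and a single shared ClusterConfiguration cannot express a different value per control-plane node, which is part of the problem):

```yaml
# Sketch: bind the secure ports to the node's internal IP instead of 0.0.0.0.
# The value would have to differ on every control-plane node.
controllerManager:
  extraArgs:
    bind-address: 10.1.1.1
scheduler:
  extraArgs:
    bind-address: 10.1.1.1
```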
The service account token already has the
My current workaround is to apply the firewall rules on every node BEFORE applying the above steps. Replace 10.1.1.1 with the node IP:
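(The exact rules were not preserved here; a hypothetical sketch of the kind of iptables rules meant, assuming the default secure ports 10257/10259 and a single allowed scraping host, might look like the following.)

```sh
# Hypothetical sketch only; the original rules are not preserved in this thread.
# Allow scrapes of the secure ports (10257 KCM, 10259 scheduler) only from a trusted
# host and drop everything else that reaches the node IP.
NODE_IP=10.1.1.1    # replace with this node's IP
PROM_IP=10.1.1.100  # placeholder for the scraping host

iptables -A INPUT -p tcp -d "$NODE_IP" -s "$PROM_IP" --dport 10257 -j ACCEPT
iptables -A INPUT -p tcp -d "$NODE_IP" -s "$PROM_IP" --dport 10259 -j ACCEPT
iptables -A INPUT -p tcp -d "$NODE_IP" --dport 10257 -j DROP
iptables -A INPUT -p tcp -d "$NODE_IP" --dport 10259 -j DROP
```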
The issue is NOT specific to Prometheus. This is specific to
What is the point in providing a metrics endpoint if it cannot be accessed? I guess kubeadm doesn't do it on purpose, right?
Yes, because it feels like an extension and not something that all users would need. The discussion above had the following:
So technically you should be able to pass extra flags to the components and set them up.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
/remove-lifecycle stale
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
/remove-lifecycle rotten
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
FEATURE REQUEST
Versions
kubeadm version (use kubeadm version): 1.18.6
Environment:
- kubectl version: 1.18.6
- uname -a: 4.19.0-9
What happened?
Kubeadm disables the "insecure" ports of kube-scheduler and kube-controller-manager by setting the --port=0 flag. Therefore metrics have to be scraped over TLS. This is fine, but kubeadm doesn't seem to manage the certificates of kube-scheduler and kube-controller-manager. These components - if no certificate is provided - will create a self-signed certificate to serve requests. One could just disable certificate verification, but that would somewhat defeat the purpose of TLS.
What you expected to happen?
Kubeadm should create and manage certificates for the "secure" port of kube-scheduler and kube-controller-manager. These certificates should be signed by the CA that is created by kubeadm.
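As a rough illustration of the desired end state (file names are hypothetical; 10257 and 10259 are the default secure ports of KCM and the scheduler), scraping could then be verified against the cluster CA instead of requiring an insecure connection:

```sh
# Hypothetical sketch: with CA-signed serving certificates, the endpoint verifies cleanly
# against the kubeadm cluster CA, using a client certificate authorized for /metrics.
curl --cacert /etc/kubernetes/pki/ca.crt \
     --cert /etc/kubernetes/pki/metrics-client.crt \
     --key /etc/kubernetes/pki/metrics-client.key \
     https://127.0.0.1:10257/metrics

# Today the serving certificate is self-signed, so verification against the cluster CA fails
# and the fallback is -k/--insecure, which defeats the purpose of TLS.
curl -k https://127.0.0.1:10259/healthz
```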
How to reproduce it (as minimally and precisely as possible)?