"kubeadm certs" phase uses a hard-coded etcd CA certificate path even in external etcd mode #1276

Closed
seh opened this issue Nov 24, 2018 · 18 comments
Labels

  • area/security
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lifecycle/active: Indicates that an issue or PR is actively being worked on by a contributor.
  • priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
@seh

seh commented Nov 24, 2018

What keywords did you search in kubeadm issues before filing this one?

  • certificates
  • certs
  • etcd
  • external

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):

kubeadm version: &version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.0-beta.2", GitCommit:"a4ff09c41589c48547e04f85391aa5610ebe0e17", GitTreeState:"clean", BuildDate:"2018-11-23T00:53:06Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.0-beta.2", GitCommit:"a4ff09c41589c48547e04f85391aa5610ebe0e17", GitTreeState:"clean", BuildDate:"2018-11-23T00:55:34Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.0-beta.2", GitCommit:"a4ff09c41589c48547e04f85391aa5610ebe0e17", GitTreeState:"clean", BuildDate:"2018-11-23T00:47:47Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    AWS EC2
  • OS (e.g. from /etc/os-release):
ID=coreos
VERSION=1911.3.0
VERSION_ID=1911.3.0
BUILD_ID=2018-11-05-1815
PRETTY_NAME="Container Linux by CoreOS 1911.3.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
  • Kernel (e.g. uname -a):
    Linux ip-10-105-51-211 4.14.78-coreos #1 SMP Mon Nov 5 17:42:07 UTC 2018 x86_64 Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz GenuineIntel GNU/Linux

What happened?

I am running kubeadm in "external etcd mode," where I've placed my etcd CA certificate and API etcd client key and certificate on the machine on which I then run kubeadm init. In my ClusterConfiguration manifest within my configuration file, I have the following stanza:

etcd: 
  external: 
    caFile: "/etc/kubernetes/pki/etcd-ca.crt"
    keyFile: "/etc/kubernetes/pki/apiserver-etcd-client.key"
    certFile: "/etc/kubernetes/pki/apiserver-etcd-client.crt"
    endpoints: 
    - "https://etcd0.000-003.kubernetes.local:2379"
    - "https://etcd1.000-003.kubernetes.local:2379"
    - "https://etcd2.000-003.kubernetes.local:2379"

Note the path to the etcd CA certificate file: /etc/kubernetes/pki/etcd-ca.crt.

When I run kubeadm init, it fails as reported in #918:

[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/peer certificate authority generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation
error execution phase certs/apiserver-etcd-client: couldn't load CA certificate etcd-ca: couldn't load the certificate file /etc/kubernetes/pki/etcd/ca.crt: open /etc/kubernetes/pki/etcd/ca.crt: no such file or directory

What you expected to happen?

kubeadm init should have found the etcd CA certificate at the path specified in my configuration file: /etc/kubernetes/pki/etcd-ca.crt. Instead, it's demanding that I adhere to the hard-coded path /etc/kubernetes/pki/etcd/ca.crt.

How to reproduce it (as minimally and precisely as possible)?

Deny kubeadm init access to the etcd CA key file, but provide it the etcd CA certificate file and the corresponding API server etcd client key and certificate files. In the ClusterConfiguration manifest, populate the "etcd.external" mapping—crucially, the "caFile" field—with a non-default but otherwise valid path.

Run kubeadm init --config with the path to that configuration file, and observe it failing in the "certs" phase because it does not find the etcd CA certificate file where it expects it to be.

Anything else we need to know?

I diagnosed what I think is the cause of the problem in #918, and I'll reproduce that diagnosis here.

In cmd/kubeadm/app/cmd/phases/certs.go's newCertSubPhases function, we use the default certificate set regardless of whether etcd is external or local. That default set includes an entry for the API server etcd client triple (CA certificate, client key, and client certificate), and that entry (KubeadmCertEtcdAPIClient) nominates a CA certificate with a hard-coded base name, defined as etcd/ca.
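To make the failure concrete, here is a minimal Go sketch of how a fixed CA base name turns into the path seen in the error message. The certSpec type and the caCertPath function are hypothetical simplifications for illustration, not kubeadm's actual code; only the etcd/ca base name and the /etc/kubernetes/pki directory come from this report.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// certSpec is a hypothetical, trimmed-down stand-in for an entry in
// kubeadm's default certificate set.
type certSpec struct {
	Name       string // certificate name, e.g. "apiserver-etcd-client"
	CABaseName string // CA base name, relative to the certificate directory
}

// caCertPath builds the CA certificate path from the certificate directory
// and the fixed base name. Nothing from the etcd.external configuration
// is consulted, which is the behavior this issue describes.
func caCertPath(certDir string, spec certSpec) string {
	return filepath.Join(certDir, spec.CABaseName+".crt")
}

func main() {
	spec := certSpec{Name: "apiserver-etcd-client", CABaseName: "etcd/ca"}
	// On Linux this prints /etc/kubernetes/pki/etcd/ca.crt
	fmt.Println(caCertPath("/etc/kubernetes/pki", spec))
}
```

Whatever caFile says in the configuration, the lookup in this sketch can only ever land on etcd/ca.crt under the certificate directory.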

If we're using an externally hosted etcd, then when we go looking for its CA certificate, we should honor the configuration's ExternalEtcd.CAFile field. Doing that is not so simple, given the current data-driven approach in this procedure.

Perhaps the KubeadmCert type could introduce a "CANameFunc" field of type func(*kubeadmapi.InitConfiguration) (string) (or func(*kubeadmapi.InitConfiguration) (string, error)) that could return a path predicated on whether etcd is hosted locally or externally. Alternately, we could use some interface that delegates to KubeadmCert, but has an alternate implementation that uses an etcd CA certificate path read from configuration.
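A rough sketch of the first idea, under stated assumptions: the Cert, InitConfiguration, Etcd, and ExternalEtcd types below are hypothetical, trimmed-down stand-ins rather than kubeadm's real types, and etcdCAPath is an illustrative name. The point is only to show a per-certificate function that picks the CA path based on whether etcd is external.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// Hypothetical, simplified configuration types (not kubeadm's API types).
type ExternalEtcd struct{ CAFile string }
type Etcd struct{ External *ExternalEtcd }
type InitConfiguration struct {
	CertificatesDir string
	Etcd            Etcd
}

// Cert is a simplified certificate entry with the proposed CANameFunc hook.
type Cert struct {
	Name       string
	CAName     string // default CA base name, e.g. "etcd/ca"
	CANameFunc func(*InitConfiguration) (string, error)
}

// CAPath prefers CANameFunc when it is set and otherwise falls back to the
// fixed base name under the certificate directory.
func (c Cert) CAPath(cfg *InitConfiguration) (string, error) {
	if c.CANameFunc != nil {
		return c.CANameFunc(cfg)
	}
	return filepath.Join(cfg.CertificatesDir, c.CAName+".crt"), nil
}

// etcdCAPath honors the configured CA file when etcd is external.
func etcdCAPath(cfg *InitConfiguration) (string, error) {
	if ext := cfg.Etcd.External; ext != nil {
		if ext.CAFile == "" {
			return "", errors.New("external etcd configured without a CA file")
		}
		return ext.CAFile, nil
	}
	return filepath.Join(cfg.CertificatesDir, "etcd", "ca.crt"), nil
}

func main() {
	client := Cert{Name: "apiserver-etcd-client", CAName: "etcd/ca", CANameFunc: etcdCAPath}
	local := &InitConfiguration{CertificatesDir: "/etc/kubernetes/pki"}
	external := &InitConfiguration{
		CertificatesDir: "/etc/kubernetes/pki",
		Etcd:            Etcd{External: &ExternalEtcd{CAFile: "/etc/kubernetes/pki/etcd-ca.crt"}},
	}
	for _, cfg := range []*InitConfiguration{local, external} {
		path, err := client.CAPath(cfg)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(path)
	}
}
```

Returning an error alongside the path lets a misconfigured external etcd stanza fail with a clear message instead of silently falling back to the default location.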

While we're here, I'll note that the previous path to the etcd CA certificate file that went along with the API server etcd client key pair had been etcd-ca.crt, but now it's etcd/ca.crt. That change warrants a release note given this defect, but if we fix this problem, then the change in the default path wouldn't matter as much.

@seh seh changed the title "kubeadm certs" phase uses a hard-coded etcd CA path even in external etcd mode "kubeadm certs" phase uses a hard-coded etcd CA certificate path even in external etcd mode Nov 24, 2018
@neolit123
Member

i'm pretty sure we had the same report of etcd paths from the config not being respected, but i cannot find the issue - @chuckha and i were assigned at first.

this was problematic with upgrades too if i recall correctly.

/assign @fabriziopandini @liztio @chuckha

@neolit123 neolit123 added the kind/bug Categorizes issue or PR as related to a bug. label Nov 24, 2018
@neolit123 neolit123 added this to the v1.14 milestone Nov 24, 2018
@fabriziopandini
Member

@seh
I think that the fix here is to check for external etcd mode before loading certificates in the run function (no need to touch certlist)
@liztio opinions?

@seh
Author

seh commented Nov 25, 2018

I was so distracted by the error in front of me concerning the CA file path that I neglected to notice that the client key and certificate paths are also specified in the same ClusterConfiguration stanza, and the "certs" phase doesn't respect them either. It just so happens that the values I had specified matched the default values used by the "certs" phase, so we don't see any divergence.

As for ignoring this triple altogether, the question is whether running in "external etcd mode" means that the "certs" phase won't try to generate a client certificate, for lack of a CA key, or whether it merely means that etcd is not kubeadm's responsibility to run, but it remains willing to create the certificates needed by the control plane (in this case, the etcd client certificate).

Since the ExternalEtcd type has a field for a CA certificate but not one for a CA key, I think we can assume that "external etcd mode" means the former: there's no CA key available, so we can't do anything about creating a client certificate anyway. As part of the pre-flight checks, we already confirm that the client key and certificate allow a successfully authenticated connection to the etcd cluster. Is there anything else we need to check after that?
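The first interpretation can be sketched as a filter that runs before any certificate is loaded or generated. This is only an illustration under assumptions: the cert type and the partition function are hypothetical, and the rule "skip anything that is, or is signed by, etcd/ca" is one reading of the behavior discussed here, not kubeadm's actual implementation.

```go
package main

import "fmt"

// cert is a hypothetical, simplified certificate description.
type cert struct {
	Name   string
	CAName string // name of the CA expected to sign this certificate, if any
}

// partition splits certs into those the "certs" phase should handle and
// those it should leave alone. In external etcd mode there is no etcd CA
// key, so the etcd CA and everything it would sign are skipped.
func partition(certs []cert, externalEtcd bool) (generate, skip []cert) {
	for _, c := range certs {
		if externalEtcd && (c.Name == "etcd/ca" || c.CAName == "etcd/ca") {
			skip = append(skip, c)
			continue
		}
		generate = append(generate, c)
	}
	return generate, skip
}

func main() {
	certs := []cert{
		{Name: "front-proxy-ca"},
		{Name: "front-proxy-client", CAName: "front-proxy-ca"},
		{Name: "etcd/ca"},
		{Name: "etcd/peer", CAName: "etcd/ca"},
		{Name: "apiserver-etcd-client", CAName: "etcd/ca"},
	}
	generate, skip := partition(certs, true)
	for _, c := range skip {
		fmt.Printf("external etcd mode: skipping %s\n", c.Name)
	}
	for _, c := range generate {
		fmt.Printf("handling %s\n", c.Name)
	}
}
```

With a filter like this, the client key and certificate named in the configuration would only need to pass the existing pre-flight connection check, and the "certs" phase would never look for a CA file at all.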

@timothysc timothysc added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 26, 2018
@fabriziopandini
Member

I'm closing this because this part was refactored during v1.13.
If this is still an issue in v1.13, feel free to re-open
/close

@k8s-ci-robot
Contributor

@fabriziopandini: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@seh
Author

seh commented Jan 8, 2019

This is still an issue in version 1.13.1.

I don't have the ability to reopen a closed issue, @fabianofranz, but I hope you'll do so.

@timothysc timothysc added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jan 27, 2019
@Magikon

Magikon commented Feb 8, 2019

This is still an issue in version 1.13.3

@nasir03082409229

Has anyone solved this?

@yagonobre
Member

yagonobre commented Feb 12, 2019

I can't reproduce on 1.13.3

[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/peer certificate authority generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate authority generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key

Can you try with this config?

apiVersion: kubeadm.k8s.io/v1alpha3
kind: ClusterConfiguration
etcd:
  caFile: "/etc/kubernetes/pki/etcd-ca.crt"
  keyFile: "/etc/kubernetes/pki/apiserver-etcd-client.key"
  certFile: "/etc/kubernetes/pki/apiserver-etcd-client.crt"
  endpoints: 
  - "https://etcd0.000-003.kubernetes.local:2379"
  - "https://etcd1.000-003.kubernetes.local:2379"
  - "https://etcd2.000-003.kubernetes.local:2379"

@neolit123
Member

@seh @nasir03082409229
can you please verify if this is still present in latest 1.13?

@neolit123 neolit123 added priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Mar 8, 2019
@neolit123 neolit123 modified the milestones: v1.14, v1.15 Mar 11, 2019
@marc-sensenich

@neolit123 I've experienced this same behavior in 1.13.5 with both apiVersion kubeadm.k8s.io/v1alpha3 and kubeadm.k8s.io/v1beta1

@yagonobre
Member

Thanks for verifying it, @marc-sensenich
/lifecycle active
/assign

@k8s-ci-robot k8s-ci-robot added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label May 18, 2019
@neolit123
Member

we need verification for 1.14 and 1.15.
with the addition of the external etcd kinder workflows it might be easier to reproduce:
https://github.com/kubernetes/kubeadm/blob/master/kinder/ci/workflows/external-etcd-master.yaml

cc @ereslibre

@neolit123 neolit123 modified the milestones: v1.15, v1.16 Jun 10, 2019
@neolit123 neolit123 added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Jul 25, 2019
@fabriziopandini
Member

/lifecycle active

@k8s-ci-robot k8s-ci-robot added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Aug 1, 2019
@fabriziopandini
Member

@seh
fix in flight for master; for older versions of kubeadm, the workaround is to avoid the file names that kubeadm uses for local etcd (e.g. use apiserver-external-etcd-client instead of apiserver-etcd-client)

@seh
Author

seh commented Aug 1, 2019

Thank you! We worked around it by conforming to the expected name, but I look forward to less surprise and more freedom in the next release. I'll go take a look at your kubernetes/kubernetes#80867.

@rosti

rosti commented Aug 20, 2019

Since a fix has been merged, I'm closing this one.
@seh feel free to reopen the issue if the problem persists despite the fix.

/close

@k8s-ci-robot
Contributor

@rosti: Closing this issue.

