Unable to add new master/etcd node to cluster #3471
The feature of scaling ...
I'm facing the same issue with adding new masters. I'm using Kubespray v2.10.x and the reason it fails is that Kubespray does not update the apiserver certificates to add the new master to the SAN list. You can check your certificate with
... and the new master IP and hostname should be listed in the SAN field.
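A minimal sketch of that check, assuming a kubeadm-style certificate path (the exact command and path were elided above and vary by Kubespray version):
```
# Print the SAN entries of the apiserver certificate.
# /etc/kubernetes/ssl/apiserver.crt is an assumption; newer setups
# may keep it at /etc/kubernetes/pki/apiserver.crt instead.
openssl x509 -noout -text -in /etc/kubernetes/ssl/apiserver.crt \
  | grep -A1 "Subject Alternative Name"
```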
NOTE: The following works for v2.10.x; I never tested this in older versions of Kubespray. On your first master, recreate the apiserver certificate.
If you are doing this after you ended up with a broken master, be sure to run ...
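A minimal sketch of that recreation step, assuming the certificate directory and kubeadm config location Kubespray used around v2.10.x (the same kubeadm invocation appears later in this thread); treat the paths as assumptions:
```
# Run as root on the first master.
# Remove the old apiserver cert/key so kubeadm will regenerate them,
# then recreate them from the kubeadm config whose SAN list now
# includes the new master. Paths vary by Kubespray version.
cd /etc/kubernetes/ssl          # newer setups may use /etc/kubernetes/pki
rm apiserver.crt apiserver.key
kubeadm init phase certs apiserver --config /etc/kubernetes/kubeadm-config.yaml
```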
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Is it possible to add a master? Or replace a failed master with a new one?
You should be able to. In the past, we managed to replace all nodes in the cluster: master, etcd and workers. But... there are some missteps you need to be careful about along the way. After a lot of experiments and retries in our lab environment, we came up with a few guidelines.

Adding/replacing a master node

1) Recreate the apiserver certs manually to include the new master node in the cert SAN field. For some reason, Kubespray will not update the apiserver certificate. Edit the kubeadm config to include the new node, then use kubeadm to recreate the certs.
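As an illustration of where that edit lands, a hedged sketch assuming the kubeadm config that Kubespray drops on the first master (file name and field layout, e.g. apiServer.certSANs, vary by version):
```
# Show the certSANs list in the kubeadm config on the first master.
# The new master's hostname and IP need to be added here before
# regenerating the certificate. The path is an assumption.
grep -A10 "certSANs" /etc/kubernetes/kubeadm-config.yaml
```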
Check the certificate; the new host needs to be there.
2) Run the cluster.yml playbook with the new node in your inventory.
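A hedged example of that invocation, borrowing the inventory layout used elsewhere in this thread (inventory path and extra flags are assumptions that depend on your setup):
```
# Re-run the full playbook with the new master present in the inventory.
# The inventory path is an assumption; use your own hosts file.
ansible-playbook -i inventory/mycluster/hosts.yml -b cluster.yml
```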
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Can https://github.com/kubernetes-sigs/kubespray/blob/48a182844c9c3438e36c78cbc4518c962e0a9ab2/docs/recover-control-plane.md be applied for adding new master/etcd nodes? @qvicksilver
@yujunz Not sure, haven't really tried that use case. Also I'm a bit unsure of the state of that playbook. Haven't had time to add it to CI. But please do try.
The procedure to add/remove masters belongs in the readme, not hidden away in a comment in this issue.
To be sure everybody sees this: this went in with PR #5570, and you can now find it here: https://kubespray.io/#/docs/nodes
I think this line doesn't work anymore; there is no k8s_nginx-proxy_nginx-proxy pod.
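For context, a hedged sketch of the kind of check that line performs on a worker node; the container name comes from the comment above, and on clusters that no longer run it under that name (or don't use Docker) there is simply nothing to find, which matches the observation:
```
# Look for the local nginx proxy container that worker nodes use to reach
# the apiservers. On newer setups this container may not exist under
# this name, or may not exist at all.
docker ps --filter "name=k8s_nginx-proxy_nginx-proxy"
```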
Hello!
```
quersys@node1:/etc/kubernetes$ sudo kubeadm init phase certs apiserver --config kubeadm-config.yaml
```
What version of K8s are you using? It's been almost a year since I posted. Did something change in kubeadm since then? I would start by searching for official instructions on how to renew and recreate certs.
```
quersys@node1:/etc/kubernetes$ kubectl version
```
I have been trying to find information for about 6 hours :(
Those look like warnings. Most people seem to ignore them. Are you sure no error messages appear as well? Does it hang and never return? If that's the case, I'd wait for a timeout to hopefully get some actual error messages.
Yeah, I also get a timeout error with my node when I use cluster.yml and try to add it as a master node.
```
W0810 11:08:48.479307 31818 utils.go:26] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
W0810 11:08:48.479525 31818 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
```
Sounds like a connectivity problem, or something that leads to one. If you can provide any further logs and relevant messages, that would be helpful.
Thanks!!! I have my new node IP in apiserver.crt:
```
DNS:node1, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:localhost, DNS:node1, DNS:node3, DNS:lb-apiserver.kubernetes.local, DNS:node1.cluster.local, DNS:node3.cluster.local, IP Address:10.233.0.1, IP Address:172.26.1.225, IP Address:172.26.1.225, IP Address:10.233.0.1, IP Address:127.0.0.1, IP Address:172.26.1.225, IP Address:172.26.1.130
```
But when I run ansible-playbook -i inventory/quersyscluster/hosts.yml cluster.yml, I get a connection "timeout" problem.
Please post relevant log messages for more context. At this level, "connection timeout" is a broad error message.
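One low-effort way to gather that context, as a hedged suggestion (standard Ansible verbosity flags, not something prescribed earlier in this thread; the inventory path is the one from the previous comment):
```
# Re-run the playbook with high verbosity to see which task and host
# actually time out.
ansible-playbook -i inventory/quersyscluster/hosts.yml cluster.yml -vvv
```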
Hi, story: I tried to force master2 to be the first one, but when I do a join task on a new master (e.g. master4), it looks like kubeadm still wants to connect to master1 (6.0.1.57):
Present, e.g.: worker1, centos7.7. Expected: worker1, centos7.8. How did you manage to recreate the first master? @juliohm1978, maybe you can help? Thanks!
Current:
Added node3 to group etcd:
Now I have 3 master/etcd and 45 nodes. I've already referenced #1122 but couldn't fix it. I extended etcd successfully, but the master failed. It shows a kubectl error:
Unable to connect to the server: x509: certificate is valid for "new master ip"
And my extend command is:
```
ansible-playbook -i inventory/mycluster/host.ini cluster.yml -l master1,master2,master3,master4,master5
```
My Kubernetes cluster version is 1.9.3. How do I fix it?