
none upgrade flaky: waiting for k8s-app=kube-dns: timed out waiting for the condition #5166

Closed
tstromberg opened this issue Aug 21, 2019 · 5 comments


@tstromberg (Contributor) commented Aug 21, 2019

(NOTE: This is while testing #5032 - this may not occur at head)

On gLinux, starting with a minikube v1.3.0 cluster running kubernetes v1.10.x:

sudo /usr/local/bin/minikube start --vm-driver=none --kubernetes-version=v1.10.4

and then upgrading it to head:

sudo ./out/minikube start --vm-driver=none --kubernetes-version=v1.15.2

occasionally results in:

😄  minikube v1.3.1 on Debian rodete
👍  Upgrading from Kubernetes 1.10.4 to 1.15.2
🏃  Using the running none "minikube" VM ...
⌛  Waiting for the host to be provisioned ...
🐳  Preparing Kubernetes v1.15.2 on Docker 18.09.3 ...
🚜  Pulling images ...
🔄  Relaunching Kubernetes using kubeadm ...
🤹  Configuring local host environment ...
...
⌛  Waiting for: apiserver proxy etcd scheduler controller dns
💣  Wait failed: waiting for k8s-app=kube-dns: timed out waiting for the condition
@medyagh (Member) commented Aug 21, 2019

Seems to be a duplicate of #5161.

@tstromberg (Contributor, Author) commented

docker ps -a doesn't show a DNS container:

CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS                    PORTS                    NAMES
98fd73ee3afc        2c4adeb21b4f           "etcd --advertise-cl…"   11 minutes ago      Up 11 minutes                                      k8s_etcd_etcd-minikube_kube-system_5a44ab6ac57a427cee60b7249ced8d2a_0
78aa014a0cee        88fa9cb27bd2           "kube-scheduler --bi…"   11 minutes ago      Up 11 minutes                                      k8s_kube-scheduler_kube-scheduler-minikube_kube-system_d3369c9fdf84ba7b0b269f7d3411275e_0
a343123a2252        9f5df470155d           "kube-controller-man…"   11 minutes ago      Up 11 minutes                                      k8s_kube-controller-manager_kube-controller-manager-minikube_kube-system_1af2414dda933be25592f26ad4e175cd_0
abb47c418c34        k8s.gcr.io/pause:3.1   "/pause"                 11 minutes ago      Up 11 minutes                                      k8s_POD_etcd-minikube_kube-system_5a44ab6ac57a427cee60b7249ced8d2a_0
47564af42c11        34a53be6c9a7           "kube-apiserver --ad…"   11 minutes ago      Up 11 minutes                                      k8s_kube-apiserver_kube-apiserver-minikube_kube-system_5105bc549cff782f672b96c5beb38939_0
57b8921e5db8        k8s.gcr.io/pause:3.1   "/pause"                 11 minutes ago      Up 11 minutes                                      k8s_POD_kube-scheduler-minikube_kube-system_d3369c9fdf84ba7b0b269f7d3411275e_0
88a8e2d1e91d        k8s.gcr.io/pause:3.1   "/pause"                 11 minutes ago      Up 11 minutes                                      k8s_POD_kube-controller-manager-minikube_kube-system_1af2414dda933be25592f26ad4e175cd_0
5f0a17341bd8        k8s.gcr.io/pause:3.1   "/pause"                 11 minutes ago      Up 11 minutes                                      k8s_POD_kube-apiserver-minikube_kube-system_5105bc549cff782f672b96c5beb38939_0
d845d57f5ea5        119701e77cbc           "/opt/kube-addons.sh"    11 minutes ago      Up 11 minutes                                      k8s_kube-addon-manager_kube-addon-manager-minikube_kube-system_65a31d2b812b11a2035f37c8a742e46f_0
06e770f7bbae        k8s.gcr.io/pause:3.1   "/pause"                 11 minutes ago      Up 11 minutes                                      k8s_POD_kube-addon-manager-minikube_kube-system_65a31d2b812b11a2035f37c8a742e46f_0
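The missing DNS container can be confirmed mechanically rather than by eyeballing the table. A minimal sketch, assuming the standard container naming above (`check_dns` is a hypothetical helper; on the affected host you would feed it the real `docker ps` output):

```shell
# Hypothetical helper: flag when a list of container names contains
# no DNS container (CoreDNS or kube-dns).
# On the host: docker ps -a --format '{{.Names}}' | check_dns
check_dns() {
  grep -i -E 'coredns|kube-dns' || echo "no DNS container found"
}
```

Against the listing above, this prints "no DNS container found", matching the wait failure on k8s-app=kube-dns.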

The apiserver appears to be in rough shape:

E0821 22:31:09.338700       1 authentication.go:65] Unable to authenticate the request due to an error: [invalid bearer token, Token has been invalidated]
E0821 22:31:10.358560       1 authentication.go:65] Unable to authenticate the request due to an error: [invalid bearer token, Token has been invalidated]
E0821 22:31:10.361049       1 authentication.go:65] Unable to authenticate the request due to an error: [invalid bearer token, Token has been invalidated]
E0821 22:35:32.802968       1 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
I0821 22:37:23.522159       1 log.go:172] http: TLS handshake error from [::1]:51328: EOF
E0821 22:41:14.965414       1 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}

This suggests to me that components may be running which are configured for a different auth system; I'm not sure what that's about. kube-scheduler is also unhappy:

E0821 22:42:37.882301       1 scheduler.go:485] error selecting node for pod: no nodes available to schedule pods
E0821 22:42:37.882557       1 scheduler.go:485] error selecting node for pod: no nodes available to schedule pods
E0821 22:42:37.882614       1 scheduler.go:485] error selecting node for pod: no nodes available to schedule pods
E0821 22:44:07.882766       1 scheduler.go:485] error selecting node for pod: no nodes available to schedule pods
E0821 22:44:07.883550       1 scheduler.go:485] error selecting node for pod: no nodes available to schedule pods
E0821 22:44:07.883742       1 scheduler.go:485] error selecting node for pod: no nodes available to schedule pods

All of the binaries appear to be the correct version.

@tstromberg (Contributor, Author) commented

> Seems to be a duplicate of #5161.

It's possibly related, but #5161 needs more data before I'm confident of that. In this case, no amount of waiting is sufficient to fix the issue, so I suspect a kubeadm race condition here.

The kubelet and apiserver are apparently happy:

host: Running
kubelet: Running
apiserver: Running
kubectl: Correctly Configured: pointing to minikube-vm at 172.31.120.49

The apiserver, however, is not aware of any control-plane pods; this may actually be specific to #5032.

sudo /usr/local/bin/kubectl get po -A
NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE
kube-system   gvisor                                  0/1     Pending   0          21m
kube-system   kubernetes-dashboard-75bd7dd769-wmt9n   0/1     Pending   0          21m
kube-system   storage-provisioner                     0/1     Pending   0          21m
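Every listed pod stuck in Pending is consistent with the scheduler's "no nodes available" errors: the node apparently never re-registered after the upgrade. A small sketch for tallying Pending pods from the kubectl output above (`count_pending` is a hypothetical helper; on the host you would pipe in real output):

```shell
# Hypothetical helper: count pods in Pending state from
# `kubectl get po -A --no-headers` output (STATUS is column 4).
# On the host: sudo /usr/local/bin/kubectl get po -A --no-headers | count_pending
count_pending() {
  awk '$4 == "Pending" { n++ } END { print n+0 }'
}
```

A nonzero count here together with an empty `kubectl get nodes` would point at kubelet registration rather than at the individual pods.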

@tstromberg (Contributor, Author) commented

One possibility is that we are not properly transitioning the etcd data from /data/minikube to /data/minikube/data.
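If that's the cause, it should be visible on disk. A minimal sketch of the check, assuming the paths from the comment above and the fact that etcd keeps its state under a `member` directory inside its data dir (the root is parameterized here purely so the logic is testable):

```shell
# Hypothetical check for the suspected data-dir mismatch: the old
# layout kept etcd member data directly under /data/minikube, while
# the new configuration expects it under /data/minikube/data.
etcd_layout() {
  root="${1:-/data/minikube}"
  if [ -d "$root/data/member" ]; then
    echo "new layout: $root/data"
  elif [ -d "$root/member" ]; then
    echo "old layout: $root (needs migrating to $root/data)"
  else
    echo "no etcd member data under $root"
  fi
}
```

An "old layout" result after the upgrade would mean the new etcd starts with an empty data dir, which would explain the apiserver losing track of existing pods.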

@tstromberg (Contributor, Author) commented

Fixed in #5032.
