
Failed to start ContainerManager #1587

Closed
zar3bski opened this issue Sep 22, 2020 · 11 comments
Labels
kind/support Question with a workaround

Comments

@zar3bski

All my pods end up stuck in Pending:

kubectl get -n kube-system pods                                                                   
NAME                                         READY   STATUS    RESTARTS   AGE
calico-kube-controllers-847c8c99d-t4kqm      0/1     Pending   0          164m
calico-node-rmc8z                            0/1     Pending   0          164m
coredns-86f78bb79c-gr5bf                     0/1     Pending   0          138m
hostpath-provisioner-5c65fbdb4f-bb698        0/1     Pending   0          137m
metrics-server-8bbfb4bdb-vnlcl               0/1     Pending   0          136m
kubernetes-dashboard-7ffd448895-69nf7        0/1     Pending   0          136m
dashboard-metrics-scraper-6c4568dc68-xxh2c   0/1     Pending   0          136m

Because the default node does not seem to be available:

kubectl get events --all-namespaces | grep -i kubernetes-dashboard-7ffd448895-69nf7                                                
kube-system   10s         Warning   FailedScheduling          pod/kubernetes-dashboard-7ffd448895-69nf7        0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.

It seems stuck in a weird loop

kubectl describe nodes
.....
  Normal   Starting                 37s    kubelet, bifrost  Starting kubelet.
  Warning  InvalidDiskCapacity      37s    kubelet, bifrost  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  37s    kubelet, bifrost  Node bifrost status is now: NodeHasSufficientMemory
  Normal   Starting                 30s    kubelet, bifrost  Starting kubelet.
  Warning  InvalidDiskCapacity      30s    kubelet, bifrost  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientPID     30s    kubelet, bifrost  Node bifrost status is now: NodeHasSufficientPID
  Normal   NodeHasNoDiskPressure    30s    kubelet, bifrost  Node bifrost status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientMemory  30s    kubelet, bifrost  Node bifrost status is now: NodeHasSufficientMemory
  Normal   Starting                 24s    kubelet, bifrost  Starting kubelet.
.....

And I wonder whether it has something to do with what I found in journalctl:

microk8s.daemon-kubelet[1001446]: E0922 12:22:28.089018 1001446 kubelet.go:1765] skipping pod synchronization - container runtime status check may not have completed yet
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.091302 1001446 kubelet_node_status.go:70] Attempting to register node bifrost
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.102758 1001446 kubelet_node_status.go:108] Node bifrost was previously registered
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.102861 1001446 kubelet_node_status.go:73] Successfully registered node bifrost
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103699 1001446 cpu_manager.go:184] [cpumanager] starting with none policy
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103734 1001446 cpu_manager.go:185] [cpumanager] reconciling every 10s
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103766 1001446 state_mem.go:36] [cpumanager] initializing new in-memory state store
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.103994 1001446 state_mem.go:88] [cpumanager] updated default cpuset: ""
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.104003 1001446 state_mem.go:96] [cpumanager] updated cpuset assignments: "map[]"
microk8s.daemon-kubelet[1001446]: I0922 12:22:28.104012 1001446 policy_none.go:43] [cpumanager] none policy: Start
microk8s.daemon-kubelet[1001446]: F0922 12:22:28.104037 1001446 kubelet.go:1296] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/snap/microk8s/common/var/lib/kubelet": could not find device with major: 0, minor: 28 in cached partitions map
microk8s.daemon-kubelet[1001446]: goroutine 337 [running]:
microk8s.daemon-kubelet[1001446]: k8s.io/kubernetes/vendor/k8s.io/klog/v2.stacks(0xc000010001, 0xc00060d800, 0xfd, 0xfd)
microk8s.daemon-kubelet[1001446]:         /build/microk8s/parts/k8s-binaries/build/go/src/github.com/kubernetes/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:996 +0xb9
microk8s.daemon-kubelet[1001446]: k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).output(0x6ef8140, 0xc000000003, 0x0, 0x0, 0xc0005c0070, 0x6b48993, 0xa, 0x510, 0x0)
.....

How could I fix this? (If that's the reason why it fails)

inspection-report-20200922_135021.tar.gz

@WereCatf

@zar3bski This looks like the same problem I opened an issue about a couple of days back. At least in my case, it seems to be because I am using Btrfs.
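
One way to check whether the kubelet state directory sits on Btrfs (path taken from the kubelet log above; assumes GNU coreutils stat):

# print the filesystem type backing the kubelet state directory
stat -f -c %T /var/snap/microk8s/common/var/lib/kubelet
# "btrfs" here would point to the same root cause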

@zar3bski
Author

zar3bski commented Sep 22, 2020

@WereCatf quite a different story: it used to work on my Ubuntu 20.04. What changed is that I moved to a new flat where my public IP is an IPv6 address. @stgraber mentioned here something about IPv6 support, but I am not sure which conf file I should edit.

@WereCatf

@zar3bski But were you using Microk8s 1.19 on Ubuntu 20.04, or an earlier Microk8s? Microk8s 1.19 refuses to work for me, but 1.18 works fine.

@zar3bski
Author

Snap probably upgraded my cluster at some point. I'll give it a shot.

@zar3bski
Author

Thanks @ktsakalozos, that's probably it. Sorry for the noob question, but I am not quite used to snap yet. Where should I put the kubelet conf file suggested by #80633 so it is taken into account?

  • /snap/microk8s/${the_actual_version}/etc ?

@ktsakalozos
Member

@zar3bski It looks like this is a feature gate [1] setup. You will need to configure kubelet with the respective feature gate [2] (--feature-gates=....). The kubelet arguments are placed in /var/snap/microk8s/current/args/kubelet and after editing that file you will need to restart MicroK8s with microk8s.stop; microk8s.start;.

[1] https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
[2] https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
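
A minimal sketch of that edit (SomeFeatureGate=false is only a placeholder; the gate actually needed for this issue is named in the next comment):

# append the feature gate to the kubelet arguments
echo '--feature-gates=SomeFeatureGate=false' | sudo tee -a /var/snap/microk8s/current/args/kubelet

# restart MicroK8s so kubelet picks up the new argument
microk8s.stop
microk8s.start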

@zar3bski
Author

zar3bski commented Sep 23, 2020

Adding

--feature-gates="LocalStorageCapacityIsolation=false"

to /var/snap/microk8s/current/args/kubelet solved the issue. Many thanks @ktsakalozos !
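
For completeness, a quick way to confirm the node recovered after the restart (using the bundled kubectl):

# the node should now report Ready and the kube-system pods should leave Pending
microk8s.kubectl get nodes
microk8s.kubectl get pods -n kube-system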
What about you, @WereCatf ?

@WereCatf

@ktsakalozos @zar3bski Yes, adding that feature gate seems to work around the issue and MicroK8s seems to be running now. It's a rather ugly workaround, but it's better than nothing.

@BenTheElder

👋 Kubernetes feature gates are only available until the feature goes GA (or is removed); after that the feature gate is removed and the feature is simply on (or gone). LocalStorageCapacityIsolation is going GA in Kubernetes v1.25.0, but I've requested a kubelet option to support use cases like this, so there will be a kubelet option localStorageCapacityIsolation you can set instead (no idea how to do that with microk8s, sorry).
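
For reference, on v1.25+ that option lives in the kubelet configuration file rather than a feature gate; a rough sketch (how MicroK8s points kubelet at such a config file is not covered here):

# Kubernetes v1.25+: the feature gate is replaced by a kubelet config field
cat <<'EOF' > kubelet-config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
localStorageCapacityIsolation: false
EOF
# kubelet would then be started with --config=kubelet-config.yaml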
