Description
Opened on Feb 18, 2021
How to categorize this issue?
If multiple identifiers make sense, you can also state the commands multiple times, e.g.
/area networking
/kind bug
/priority normal
What happened:
This ticket originates from a Slack discussion.
When we provision a new cluster, pods on a node sporadically get stuck in the state `ContainerCreating`. Events saying `Pod sandbox changed, it will be killed and re-created.` are emitted over and over.
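For reference, this is roughly how the symptom shows up (a minimal sketch; `<node-name>`, `<pod-name>`, and `<namespace>` are placeholders, not values from the affected cluster):

```sh
# List pods stuck in ContainerCreating on the affected node
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name> | grep ContainerCreating

# The events of a stuck pod repeat the sandbox message over and over
kubectl describe pod <pod-name> -n <namespace> | grep -A 10 Events
```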
In the logs I can see the Calico CNI plugin getting installed:
time="2021-02-17T10:14:40Z" level=info msg="Running as a Kubernetes pod" source="install.go:140"
time="2021-02-17T10:14:40Z" level=info msg="Installed /host/opt/cni/bin/bandwidth"
time="2021-02-17T10:14:41Z" level=info msg="Installed /host/opt/cni/bin/calico"
time="2021-02-17T10:14:41Z" level=info msg="Installed /host/opt/cni/bin/calico-ipam"
time="2021-02-17T10:14:41Z" level=info msg="Installed /host/opt/cni/bin/flannel"
time="2021-02-17T10:14:41Z" level=info msg="Installed /host/opt/cni/bin/host-local"
time="2021-02-17T10:14:41Z" level=info msg="Installed /host/opt/cni/bin/install"
time="2021-02-17T10:14:41Z" level=info msg="Installed /host/opt/cni/bin/loopback"
time="2021-02-17T10:14:41Z" level=info msg="Installed /host/opt/cni/bin/portmap"
time="2021-02-17T10:14:41Z" level=info msg="Installed /host/opt/cni/bin/tuning"
time="2021-02-17T10:14:41Z" level=info msg="Wrote Calico CNI binaries to /host/opt/cni/bin\n"
time="2021-02-17T10:14:41Z" level=info msg="CNI plugin version: v3.17.1\n"
time="2021-02-17T10:14:41Z" level=info msg="/host/secondary-bin-dir is not writeable, skipping"
time="2021-02-17T10:14:41Z" level=info msg="Using CNI config template from CNI_NETWORK_CONFIG environment variable." source="install.go:319"
time="2021-02-17T10:14:41Z" level=info msg="Created /host/etc/cni/net.d/10-calico.conflist"
time="2021-02-17T10:14:41Z" level=info msg="Done configuring CNI. Sleep= false"
But according to the Slack discussion, something must have removed the CNI installation again later, causing the error to appear.
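To verify this theory on an affected node, one could check whether the files written by the install container above are still present. The paths are derived from the log, which writes under the host mounts `/host/etc/cni/net.d` and `/host/opt/cni/bin`; how to get a shell on the node is left open:

```sh
# On the affected node: the install log above wrote 10-calico.conflist and the
# Calico binaries; if something removed them, these listings will come up short
ls -l /etc/cni/net.d/
ls -l /opt/cni/bin/ | grep calico
```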
What you expected to happen:
A regularly provisioned node on which pods can run.
How to reproduce it (as minimally and precisely as possible):
Unfortunately, I do not know how to reproduce this. It is sporadic.
Anything else we need to know?:
Environment:
- Gardener version (if relevant):
- Extension version:
- Kubernetes version (use `kubectl version`): 1.17.14
- Cloud provider or hardware configuration: Azure
- Others: