
Pods using ephemeral volumes stuck terminating after kubelet restart #1027

Closed
shanecunningham opened this issue Aug 18, 2021 · 13 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@shanecunningham

shanecunningham commented Aug 18, 2021

/kind bug

What happened?

After a k8s in-place upgrade, pods with ephemeral volumes were stuck in the Terminating state after being deleted.

What you expected to happen?

Pods using ephemeral volumes to be deleted cleanly.

How to reproduce it (as minimally and precisely as possible)?

Deploy a pod using an ephemeral volume, restart kubelet on the host, then delete the pod.
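
For clarity, the command sequence is roughly the following (file and pod names are placeholders, not from an actual run):

kubectl apply -f ephemeral-deployment.yaml   # workload with a generic ephemeral volume
kubectl get pod -o wide                      # note which node the pod landed on
sudo systemctl restart kubelet               # run this on that worker node
kubectl delete pod <pod-name>                # the pod then hangs in Terminating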

Anything else we need to know?:

From what I can tell, the kubelet restart is what puts the pod/volume in a bad state which results in the pod getting stuck terminating.

After the kubelet restart, the logs show a mount failure: the volume is already mounted and in use by the pod.

Aug 02 11:36:20 ip-10-224-7-112 kubelet[8015]: I0802 11:36:20.689732    8015 server.go:1176] Started kubelet

...

Aug 02 11:36:32 ip-10-224-7-112 kubelet[8015]: E0802 11:36:32.113029    8015 csi_attacher.go:306] kubernetes.io/csi: attacher.MountDevice failed: rpc error: code = Internal desc = could not format "/dev/xvdba" and mount it at "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-29100722-3147-49bc-ab59-4b875c12c2bd/globalmount": mount failed: exit status 32
Aug 02 11:36:32 ip-10-224-7-112 kubelet[8015]: Mounting command: mount
Aug 02 11:36:32 ip-10-224-7-112 kubelet[8015]: Mounting arguments: -t ext4 -o defaults /dev/xvdba /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-29100722-3147-49bc-ab59-4b875c12c2bd/globalmount
Aug 02 11:36:32 ip-10-224-7-112 kubelet[8015]: Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-29100722-3147-49bc-ab59-4b875c12c2bd/globalmount: /dev/nvme1n1 already mounted on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-29100722-3147-49bc-ab59-4b875c12c2bd/globalmount.
Aug 02 11:36:32 ip-10-224-7-112 kubelet[8015]: E0802 11:36:32.113164    8015 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/ebs.csi.aws.com^vol-0638613c3667bdee9 podName: nodeName:}" failed. No retries permitted until 2021-08-02 11:36:32.613131544 +0000 UTC m=+17.330809961 (durationBeforeRetry 500ms). Error: "MountVolume.MountDevice failed for volume \"pvc-29100722-3147-49bc-ab59-4b875c12c2bd\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0638613c3667bdee9\") pod \"processor-685944896-cztrr\" (UID: \"f5d9ebe0-5856-4a6a-9a3a-1e12057597dd\") : rpc error: code = Internal desc = could not format \"/dev/xvdba\" and mount it at \"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-29100722-3147-49bc-ab59-4b875c12c2bd/globalmount\": mount failed: exit status 32\nMounting command: mount\nMounting arguments: -t ext4 -o defaults /dev/xvdba /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-29100722-3147-49bc-ab59-4b875c12c2bd/globalmount\nOutput: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-29100722-3147-49bc-ab59-4b875c12c2bd/globalmount: /dev/nvme1n1 already mounted on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-29100722-3147-49bc-ab59-4b875c12c2bd/globalmount.\n"
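
On the node itself, this state can be confirmed directly; for example (illustrative commands, not part of the original logs):

findmnt /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-29100722-3147-49bc-ab59-4b875c12c2bd/globalmount   # staging path is already a mount point
lsblk -f /dev/nvme1n1                                                                                        # filesystem already exists on the device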

This error repeats. A delete is then issued for the pod, and the VolumeAttachment is deleted.

Aug 10 08:51:49 ip-10-224-7-112 kubelet[8015]: E0810 08:51:49.658184    8015 csi_attacher.go:621] kubernetes.io/csi: VolumeAttachment [csi-4f4d36dcfd0ada333d033e6df3f385579869959554d96279c46e0c21bd5d08e4] has deletion timestamp, will not continue to wait for attachment
Aug 10 08:51:49 ip-10-224-7-112 kubelet[8015]: E0810 08:51:49.658263    8015 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/ebs.csi.aws.com^vol-0638613c3667bdee9 podName: nodeName:}" failed. No retries permitted until 2021-08-10 08:53:51.658235204 +0000 UTC m=+681456.375913631 (durationBeforeRetry 2m2s). Error: "MountVolume.WaitForAttach failed for volume \"pvc-29100722-3147-49bc-ab59-4b875c12c2bd\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0638613c3667bdee9\") pod \"processor-685944896-cztrr\" (UID: \"f5d9ebe0-5856-4a6a-9a3a-1e12057597dd\") : volume attachment is being deleted"

Since the VolumeAttachment has been deleted, GETs for it start to fail.

Aug 10 08:53:51 ip-10-224-7-112 kubelet[8015]: I0810 08:53:51.666840    8015 operation_generator.go:556] MountVolume.WaitForAttach entering for volume "pvc-29100722-3147-49bc-ab59-4b875c12c2bd" (UniqueName: "kubernetes.io/csi/ebs.csi.aws.com^vol-0638613c3667bdee9") pod "processor-685944896-cztrr" (UID: "f5d9ebe0-5856-4a6a-9a3a-1e12057597dd") DevicePath ""
Aug 10 08:53:51 ip-10-224-7-112 kubelet[8015]: E0810 08:53:51.669045    8015 csi_attacher.go:159] kubernetes.io/csi: attacher.WaitForAttach failed for volume [vol-0638613c3667bdee9] (will continue to try): volumeattachments.storage.k8s.io "csi-4f4d36dcfd0ada333d033e6df3f385579869959554d96279c46e0c21bd5d08e4" is forbidden: User "system:node:<worker_node>" cannot get resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: no relationship found between node '<worker_node>' and this object

The previous log line repeats until kubelet is restarted, which allows the volume and pod to be cleaned up. However, I found that if another pod with an ephemeral volume lands on this host, it runs into the same stuck-terminating issue and requires another kubelet restart.
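
From an admin context the same state is visible with something like the following (illustrative commands, object names taken from the logs above):

kubectl get volumeattachment | grep vol-0638613c3667bdee9                                      # the attachment object is gone
kubectl get pod processor-685944896-cztrr -o jsonpath='{.metadata.deletionTimestamp}{"\n"}'    # deletion timestamp is set, but the pod never goes away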

Environment

  • Kubernetes version (use kubectl version): 1.20.8
  • Driver version: v1.1.0
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Aug 18, 2021
@nirmalaagash
Contributor

@shanecunningham Thanks for bringing this up. I'm afraid I was not able to see any logs from csi_attacher.go or nestedpendingoperations.go. Can you please check whether you were using the in-tree driver by any chance? Are those kubelet logs, and can you explain where you collected them from?
Separately, I tried to reproduce the issue and the pod got stuck at Terminating after deletion. I then tried the same with the recent fix #1019, and the pod seems to delete properly. Please check whether the problem persists with that fix.

@shanecunningham
Author

shanecunningham commented Sep 8, 2021

@nirmalaagash Thank you for looking into this. My logs were all from kubelet. I'm not using the in-tree driver; I'm seeing this problem with both migrated volumes and new volumes created with the CSI driver.

I tested the newest release, v1.2.0, and fix #1019 seems to partially fix it. If I manually restart kubelet on a node with a pod using a volume, the pod still gets stuck in Terminating, and I see the same mount failed: exit status 32 errors as posted above. Resolving it still requires a kubelet restart. But after that kubelet restart, the pod can now be deleted as many times as needed, unlike before, where each pod delete required a kubelet restart.

So this still affects volumes after a process that restarts all kubelets, such as a Kubernetes version upgrade.

Can you replicate this or have any suggestions?

@nirmalaagash
Contributor

@shanecunningham Please help me reproduce this issue. Can you explain the steps in detail (the manifest you used to create the ephemeral volume, the StorageClass, the log level on each container, and how you collected those kubelet logs)?

@shanecunningham
Author

@nirmalaagash Since I posted this issue I've confirmed it's happening on all volumes, not just ephemeral ones. But I'll outline the steps for ephemeral volumes.

  1. Manifest used
kind: Deployment
apiVersion: apps/v1
metadata:
  name: ephemeral-1
  labels:
    app: ephemeral
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ephemeral
  template:
    metadata:
      labels:
        app: ephemeral
    spec:
      containers:
        - name: ephemeral-container
          image: busybox
          volumeMounts:
          - mountPath: "/scratch"
            name: ephemeral-volume
          command: [ "sleep", "1000000" ]
      volumes:
        - name: ephemeral-volume
          ephemeral:
            volumeClaimTemplate:
              metadata:
                labels:
                  type: ephemeral-volume
              spec:
                accessModes: [ "ReadWriteOnce" ]
                storageClassName: "gp3"
                resources:
                  requests:
                    storage: 1Gi
  2. StorageClass
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
  name: gp3
parameters:
  encrypted: "true"
  type: gp3
provisioner: ebs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
  3. SSH to the worker node running the pod
  4. Restart kubelet on the worker node: systemctl restart kubelet. This produces the mount exit status 32 errors in the kubelet logs.
  5. Delete the pod. The pod hangs in Terminating status here for me.
  6. Restart kubelet on the node again: systemctl restart kubelet
  7. The pod and volume can now be cleaned up

Log levels are the defaults as far as I know: v=2 for the driver containers and the default for kubelet. I collected the logs by SSHing to the worker node and using journalctl.
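
Roughly something like the following (the grep filter is only illustrative):

sudo journalctl -u kubelet --no-pager --since "1 hour ago" | grep -iE 'csi|mount'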

@nirmalaagash
Contributor

Thanks @shanecunningham, I'll look into it.

@nirmalaagash
Contributor

@shanecunningham Like you mentioned, I could see the issue of the pod stuck at Terminating with v1.2.0. Please use the release-1.2 branch, which has the recent aws-ebs-csi-driver v1.2.1 image tag. Below is a screenshot of the pod not getting stuck at Terminating using v1.2.1. Let me know if this works.
[screenshot]

@shanecunningham
Author

@nirmalaagash Yesterday I tried using the k8s.gcr.io/provider-aws/aws-ebs-csi-driver:v1.2.1 image by editing the daemonset and was still seeing pods get stuck. Let me try redeploying using the helm chart; it looks like it was just updated for 1.2.1, so maybe I missed a change somewhere else.

@nirmalaagash
Contributor

nirmalaagash commented Sep 10, 2021

@shanecunningham You should edit both the 'ebs-csi-controller' deployment and the 'ebs-csi-node' daemonset in the kube-system namespace for the changes to take effect. Alternatively, you can try it with the helm chart.
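
For example, something like this should switch both workloads to the v1.2.1 image (assuming the driver container in both is named ebs-plugin):

kubectl -n kube-system set image deployment/ebs-csi-controller ebs-plugin=k8s.gcr.io/provider-aws/aws-ebs-csi-driver:v1.2.1
kubectl -n kube-system set image daemonset/ebs-csi-node ebs-plugin=k8s.gcr.io/provider-aws/aws-ebs-csi-driver:v1.2.1
kubectl -n kube-system rollout status daemonset/ebs-csi-node   # wait for the node pods to roll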

@shanecunningham
Author

shanecunningham commented Sep 10, 2021

@nirmalaagash Ah, I did forget the controller. I updated both to v1.2.1 and I can still reproduce this. After a kubelet restart the mount errors are logged and the pod never terminates until another kubelet restart.

Sep 10 21:40:25 ip-10-00-00-00 kubelet[21787]: E0910 21:40:25.584515   21787 csi_attacher.go:306] kubernetes.io/csi: attacher.MountDevice failed: rpc error: code = Internal desc = could not format "/dev/xvdbb" and mount it at "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount": mount failed: exit status 32
Sep 10 21:40:25 ip-10-00-00-00 kubelet[21787]: Mounting arguments: -t ext4 -o defaults /dev/xvdbb /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount
Sep 10 21:40:25 ip-10-00-00-00 kubelet[21787]: Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount: /dev/nvme2n1 already mounted on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount.
Sep 10 21:40:25 ip-10-00-00-00 kubelet[21787]: E0910 21:40:25.584670   21787 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/ebs.csi.aws.com^vol-0000 podName: nodeName:}" failed. No retries permitted until 2021-09-10 21:42:27.584624345 +0000 UTC m=+513.123972916 (durationBeforeRetry 2m2s). Error: "MountVolume.MountDevice failed for volume \"pvc-e8ba0262-6018-456f-964f-1857bec1323e\" (UniqueName: \"kubernetes.io/csi/ebs.csi.aws.com^vol-0000\") pod \"ephemeral-1-7cf598487d-b6d58\" (UID: \"5a2b7eb1-78ac-452e-83fa-5a8d2acd9826\") : rpc error: code = Internal desc = could not format \"/dev/xvdbb\" and mount it at \"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount\": mount failed: exit status 32\nMounting command: mount\nMounting arguments: -t ext4 -o defaults /dev/xvdbb /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount\nOutput: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount: /dev/nvme2n1 already mounted on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount.\n"

These are the logs from the ebs-plugin container on this node.

E0910 21:40:25.584075       1 mount_linux.go:179] Mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/xvdbb /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount: /dev/nvme2n1 already mounted on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount.

E0910 21:40:25.584139       1 driver.go:119] GRPC error: rpc error: code = Internal desc = could not format "/dev/xvdbb" and mount it at "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount": mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/xvdbb /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount: /dev/nvme2n1 already mounted on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e8ba0262-6018-456f-964f-1857bec1323e/globalmount.
W0910 21:40:26.373823       1 node.go:166] NodeStageVolume: invalid partition config, will ignore. partition = 0
I0910 21:40:26.383149       1 mount_linux.go:353] `fsck` error fsck from util-linux 2.33.1
/dev/nvme1n1 is mounted.
e2fsck: Cannot continue, aborting.

Versions I'm running:

kubectl get ds ebs-csi-node -n kube-system -o json | jq '.spec.template.spec.containers[].image'

"k8s.gcr.io/provider-aws/aws-ebs-csi-driver:v1.2.1"
"k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.1.0"
"k8s.gcr.io/sig-storage/livenessprobe:v2.2.0"

kubectl get deploy ebs-csi-controller -n kube-system -o json | jq '.spec.template.spec.containers[].image'

"k8s.gcr.io/provider-aws/aws-ebs-csi-driver:v1.2.1"
"k8s.gcr.io/sig-storage/csi-provisioner:v2.1.1"
"k8s.gcr.io/sig-storage/csi-attacher:v3.1.0"
"k8s.gcr.io/sig-storage/csi-resizer:v1.0.0"
"k8s.gcr.io/sig-storage/livenessprobe:v2.2.0"

@shanecunningham
Author

This log line is repeating in the controller's ebs-plugin container.

I0910 22:12:54.859760       1 cloud.go:601] Waiting for volume "vol-02713a4c57083c998" state: actual=detaching, desired=detached
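
The attachment state can also be checked from the AWS side, for example (illustrative command, volume ID taken from the log line above):

aws ec2 describe-volumes --volume-ids vol-02713a4c57083c998 --query 'Volumes[0].Attachments[*].{State:State,Instance:InstanceId,Device:Device}'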

@nirmalaagash
Contributor

@shanecunningham From your logs, I can see that the problem is in NodeStageVolume. But I have a few things to clarify. I am not sure why there are two device names: /dev/nvme2n1 appears in both mount outputs, while the second-to-last line shows another device, /dev/nvme1n1. So I am not sure whether two devices are involved. I would be happy to discuss this further on the Slack channel under the same GitHub username.
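
One way to check which NVMe device corresponds to which EBS volume on the node is something like the following (a suggestion, not taken from the logs; on Nitro instances the EBS volume ID is exposed as the NVMe serial):

lsblk -o NAME,SERIAL,MOUNTPOINT   # SERIAL shows the EBS volume ID (without the dash) for each nvme device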

@nirmalaagash
Contributor

@shanecunningham I am still having trouble reproducing the issue. I enabled the feature gate GenericEphemeralVolume=true so that ephemeral volumes work on a v1.20.x Kubernetes cluster and tried the deployment, but I ran into the error below.

Failed to get Plugin from volumeSpec no volume plugin matched

Please provide the configuration used in the cluster, such as the feature gates that are enabled and any other cluster-specific configuration.
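
For reference, this is roughly how the gate was set on my test cluster (GenericEphemeralVolume is alpha in 1.20; the flag placement below is illustrative and assumes it is added to the control-plane components and the kubelets):

kube-apiserver ... --feature-gates=GenericEphemeralVolume=true
kube-controller-manager ... --feature-gates=GenericEphemeralVolume=true
kube-scheduler ... --feature-gates=GenericEphemeralVolume=true
kubelet ... --feature-gates=GenericEphemeralVolume=true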

@wongma7
Contributor

wongma7 commented Oct 8, 2021

This should be fixed by #1082, which just merged. The fix will be backported and released in v1.3.x.

@wongma7 wongma7 closed this as completed Oct 8, 2021