
Disk attachment/mounting problems, all pods with PVCs stuck in ContainerCreating #1278

Closed

Description

What happened:
Pods with PVCs are stuck in ContainerCreating state, due to a problem with attachment/mounting.

I am using a VMSS-backed Kubernetes 1.14.6 cluster in westus (aksEngineVersion: v0.40.2-aks). Following a crash of the Kafka pods (using the Confluent Helm charts v5.3.1; see configuration below, under Environment), 2 of the 3 brokers got stuck in the ContainerCreating state. The dashboard seems to show that all the PVCs are failing to mount because of one volume that has not been detached properly:

kafka-cp-kafka
Unable to mount volumes for pod "kafka-cp-kafka-0_default(kafkapod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "default"/"kafka-cp-kafka-0". list of unmounted volumes=[datadir-0]. list of unattached volumes=[datadir-0 jmx-config default-token-xxxcc]

kafka-cp-zookeeper
AttachVolume.Attach failed for volume "pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "kafka-cp-zookeeper-0_default(zookepod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "default"/"kafka-cp-zookeeper-0". list of unmounted volumes=[datadir datalogdir]. list of unattached volumes=[datadir datalogdir jmx-config default-token-xxxcc]

es-data-efk-logging-cluster-default
AttachVolume.Attach failed for volume "pvc-eslogdpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-eslogdpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "es-data-efk-logging-cluster-default-0_logging(eslogdpod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "logging"/"es-data-efk-logging-cluster-default-0". list of unmounted volumes=[es-data]. list of unattached volumes=[es-data default-token-xxxdd]

es-master-efk-logging-cluster-default
AttachVolume.Attach failed for volume "pvc-eslogmpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-eslogmpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "es-master-efk-logging-cluster-default-0_logging(eslogmpod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "logging"/"es-master-efk-logging-cluster-default-0". list of unmounted volumes=[es-data]. list of unattached volumes=[es-data default-token-xxxdd]

prometheus-prom-prometheus-operator-prometheus
AttachVolume.Attach failed for volume "pvc-promppvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-promppvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "prometheus-prom-prometheus-operator-prometheus-0_monitoring(promppod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "monitoring"/"prometheus-prom-prometheus-operator-prometheus-0". list of unmounted volumes=[prometheus-prom-prometheus-operator-prometheus-db]. list of unattached volumes=[prometheus-prom-prometheus-operator-prometheus-db config config-out prometheus-prom-prometheus-operator-prometheus-rulefiles-0 prom-prometheus-operator-prometheus-token-xxxee]

alertmanager-prom-prometheus-operator-alertmanager
AttachVolume.Attach failed for volume "pvc-promapvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-promapvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "alertmanager-prom-prometheus-operator-alertmanager-0_monitoring(promapod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "monitoring"/"alertmanager-prom-prometheus-operator-alertmanager-0". list of unmounted volumes=[alertmanager-prom-prometheus-operator-alertmanager-db]. list of unattached volumes=[alertmanager-prom-prometheus-operator-alertmanager-db config-volume prom-prometheus-operator-alertmanager-token-xxxff]

Running kubectl get pvc --all-namespaces shows all the PVCs in Bound state (full JSON from the Dashboard below, under Environment):

NAMESPACE    NAME                                                                                                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
default      datadir-0-kafka-cp-kafka-0                                                                                   Bound    pvc-kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx  200Gi      RWO            default        3d16h
default      datadir-0-kafka-cp-kafka-1                                                                                   Bound    pvc-kafkad02-guid-xxxx-xxxx-xxxxxxxxxxxx  200Gi      RWO            default        3d16h
default      datadir-0-kafka-cp-kafka-2                                                                                   Bound    pvc-kafkad03-guid-xxxx-xxxx-xxxxxxxxxxxx  200Gi      RWO            default        3d16h
default      datadir-kafka-cp-zookeeper-0                                                                                 Bound    pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datadir-kafka-cp-zookeeper-1                                                                                 Bound    pvc-zooked02-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datadir-kafka-cp-zookeeper-2                                                                                 Bound    pvc-zooked03-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datalogdir-kafka-cp-zookeeper-0                                                                              Bound    pvc-zookel01-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datalogdir-kafka-cp-zookeeper-1                                                                              Bound    pvc-zookel02-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datalogdir-kafka-cp-zookeeper-2                                                                              Bound    pvc-zookel03-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
logging      es-data-es-data-efk-logging-cluster-default-0                                                                Bound    pvc-eslogdpvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
logging      es-data-es-master-efk-logging-cluster-default-0                                                              Bound    pvc-eslogmpvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
monitoring   alertmanager-prom-prometheus-operator-alertmanager-db-alertmanager-prom-prometheus-operator-alertmanager-0   Bound    pvc-promapvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
monitoring   prom-grafana                                                                                                 Bound    pvc-grafad01-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
monitoring   prometheus-prom-prometheus-operator-prometheus-db-prometheus-prom-prometheus-operator-prometheus-0           Bound    pvc-promppvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
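To confirm which Azure disk sits behind each stuck claim, the following mapping should work, assuming the in-tree kubernetes.io/azure-disk provisioner (which the storage-provisioner annotation further below indicates); a sketch using one of my claim names:

# PVC -> PV name
kubectl get pvc datadir-0-kafka-cp-kafka-0 --namespace default --output jsonpath='{.spec.volumeName}'

# PV -> underlying Azure disk URI
kubectl get pv pvc-kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx --output jsonpath='{.spec.azureDisk.diskURI}'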

I tried scaling the Kafka StatefulSet down to 0, waiting a long while, and then scaling back up to 3, but the pods didn't recover.
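Roughly (StatefulSet name as it appears in the pod names above):

kubectl scale statefulset kafka-cp-kafka --namespace default --replicas=0
# ...waited a long while...
kubectl scale statefulset kafka-cp-kafka --namespace default --replicas=3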

Then I tried scaling all Deployments and StatefulSets down to 0 and doing a same-version upgrade of the K8S cluster. Unfortunately, because of a problem (reported here) with the VMAccessForLinux extension I had installed on the VMSS (following this guide to update SSH credentials on the nodes), the upgrade failed 2.5 hours later, and the cluster remained in a Failed state. Now all of the pods with PVCs are stuck in ContainerCreating. I successfully added a second nodepool, but pods placed on the new nodes still reported the same error, so I removed the second nodepool and scaled the first nodepool down to 1. I then tried to reboot the node, both from the Azure portal and from within an SSH connection; both attempts failed because of the issue with the extension. Finally, I tried to gradually scale down all StatefulSets (I had to uninstall the prometheus-operator Helm release, since it insisted on scaling the alertmanager StatefulSet back up) and re-enable only the logging StatefulSets, as they are smaller. It didn't help.
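In case it helps anyone else stuck behind the same extension, this is roughly the az CLI sequence I would expect to remove it (the node resource group is a placeholder; a sketch only, I have not verified that it unblocks the upgrade):

# Remove the offending extension from the scale set model
az vmss extension delete --resource-group <node-resource-group> --vmss-name aks-nodepool1-00011122-vmss --name VMAccessForLinux

# Roll the model change out to the instances
az vmss update-instances --resource-group <node-resource-group> --name aks-nodepool1-00011122-vmss --instance-ids '*'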

After taking down all StatefulSets, running kubectl get nodes --output json | jq '.items[].status.volumesInUse' returns null.
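Since the error message says to "delete/detach the disk explicitly", a manual detach via the az CLI might be a workaround. A sketch only; the LUN has to be looked up first, and I don't know whether doing this behind AKS's back is safe:

# List the data disks (with LUNs) still attached to each VMSS instance
az vmss list-instances --resource-group <node-resource-group> --name aks-nodepool1-00011122-vmss --query '[].{id:instanceId, disks:storageProfile.dataDisks[].{name:name, lun:lun}}'

# Explicitly detach the stuck disk from the instance it is wedged on (instance 0 per the errors above)
az vmss disk detach --resource-group <node-resource-group> --vmss-name aks-nodepool1-00011122-vmss --instance-id 0 --lun <lun-of-stuck-disk>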

What you expected to happen:
Pods with PVCs should start normally, and if mounting fails, it should eventually (and somewhat quickly) retry and succeed.

How to reproduce it (as minimally and precisely as possible):

I have no idea. This happens randomly.
Up to now we have worked around it by removing our PVCs (sketch below), but I don't want to do this any more; I need a real solution.
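For completeness, the workaround amounts to the following, which is destructive:

# DESTRUCTIVE: drops the claim and, under the default reclaim policy, the underlying disk and its data
kubectl delete pvc datadir-0-kafka-cp-kafka-0 --namespace default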

Anything else we need to know?:

This is similar to the following issues reported on Kubernetes and AKS. All of them have been closed, but none with a real solution, AFAIK.

I replaced the GUIDs to anonymize the logs, but kept them distinct, so the same placeholder always refers to the same object.

Environment:

  • Kubernetes version (use kubectl version): VMSS-backed westus-located K8S (1.14.6; aksEngineVersion : v0.40.2-aks)
  • Size of cluster (how many worker nodes are in the cluster?) 1 nodepool with 3 Standard_DS3_v2 instances.
  • General description of workloads in the cluster (e.g. HTTP microservices, Java app, Ruby on Rails, machine learning, etc.) Kafka, several dotnet core HTTP microservices, logging (FluentBit + ElasticSearch + Kibana stack), monitoring (prometheus + grafana).
  • Others:
cp-kafka:
  enabled: true
  brokers: 3
  persistence:
    enabled: true
    size: 200Gi
    storageClass: ~
    disksPerBroker: 1

  configurationOverrides:
    "auto.create.topics.enable": "true"
    "num.partitions": "10"
    "log.retention.bytes": "180000000000"
    • Kafka PVC (kubectl get pvc xxx --output json):
{
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {
        "annotations": {
            "pv.kubernetes.io/bind-completed": "yes",
            "pv.kubernetes.io/bound-by-controller": "yes",
            "volume.beta.kubernetes.io/storage-provisioner": "kubernetes.io/azure-disk"
        },
        "creationTimestamp": "2019-10-13T12:00:00Z",
        "finalizers": [
            "kubernetes.io/pvc-protection"
        ],
        "labels": {
            "app": "cp-kafka",
            "release": "kafka"
        },
        "name": "datadir-0-kafka-cp-kafka-0",
        "namespace": "default",
        "resourceVersion": "3241128",
        "selfLink": "/api/v1/namespaces/default/persistentvolumeclaims/datadir-0-kafka-cp-kafka-0",
        "uid": "kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx"
    },
    "spec": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "resources": {
            "requests": {
                "storage": "200Gi"
            }
        },
        "storageClassName": "default",
        "volumeMode": "Filesystem",
        "volumeName": "pvc-kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx"
    },
    "status": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "capacity": {
            "storage": "200Gi"
        },
        "phase": "Bound"
    }
}