
Disk attachment/mounting problems, all pods with PVCs stuck in ContainerCreating #1278

Closed

Description

What happened:
Pods with PVCs are stuck in ContainerCreating state, due to a problem with attachment/mounting.

I am using a VMSS-backed Kubernetes 1.14.6 cluster in westus (aksEngineVersion: v0.40.2-aks). Following a crash of the Kafka pods (using the Confluent Helm charts v5.3.1; see configuration below, under Environment), 2 of the 3 brokers got stuck in the ContainerCreating state. The dashboard seems to show that all the PVCs are failing to mount because of one volume that has not been detached properly:

kafka-cp-kafka
Unable to mount volumes for pod "kafka-cp-kafka-0_default(kafkapod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "default"/"kafka-cp-kafka-0". list of unmounted volumes=[datadir-0]. list of unattached volumes=[datadir-0 jmx-config default-token-xxxcc]

kafka-cp-zookeeper
AttachVolume.Attach failed for volume "pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "kafka-cp-zookeeper-0_default(zookepod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "default"/"kafka-cp-zookeeper-0". list of unmounted volumes=[datadir datalogdir]. list of unattached volumes=[datadir datalogdir jmx-config default-token-xxxcc]

es-data-efk-logging-cluster-default
AttachVolume.Attach failed for volume "pvc-eslogdpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-eslogdpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "es-data-efk-logging-cluster-default-0_logging(eslogdpod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "logging"/"es-data-efk-logging-cluster-default-0". list of unmounted volumes=[es-data]. list of unattached volumes=[es-data default-token-xxxdd]

es-master-efk-logging-cluster-default
AttachVolume.Attach failed for volume "pvc-eslogmpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-eslogmpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "es-master-efk-logging-cluster-default-0_logging(eslogmpod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "logging"/"es-master-efk-logging-cluster-default-0". list of unmounted volumes=[es-data]. list of unattached volumes=[es-data default-token-xxxdd]

prometheus-prom-prometheus-operator-prometheus
AttachVolume.Attach failed for volume "pvc-promppvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-promppvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "prometheus-prom-prometheus-operator-prometheus-0_monitoring(promppod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "monitoring"/"prometheus-prom-prometheus-operator-prometheus-0". list of unmounted volumes=[prometheus-prom-prometheus-operator-prometheus-db]. list of unattached volumes=[prometheus-prom-prometheus-operator-prometheus-db config config-out prometheus-prom-prometheus-operator-prometheus-rulefiles-0 prom-prometheus-operator-prometheus-token-xxxee]

alertmanager-prom-prometheus-operator-alertmanager
AttachVolume.Attach failed for volume "pvc-promapvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-promapvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "alertmanager-prom-prometheus-operator-alertmanager-0_monitoring(promapod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "monitoring"/"alertmanager-prom-prometheus-operator-alertmanager-0". list of unmounted volumes=[alertmanager-prom-prometheus-operator-alertmanager-db]. list of unattached volumes=[alertmanager-prom-prometheus-operator-alertmanager-db config-volume prom-prometheus-operator-alertmanager-token-xxxff]

Running kubectl get pvc --all-namespaces shows all the PVCs in Bound state (full JSON from the Dashboard below, under Environment):

NAMESPACE    NAME                                                                                                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
default      datadir-0-kafka-cp-kafka-0                                                                                   Bound    pvc-kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx  200Gi      RWO            default        3d16h
default      datadir-0-kafka-cp-kafka-1                                                                                   Bound    pvc-kafkad02-guid-xxxx-xxxx-xxxxxxxxxxxx  200Gi      RWO            default        3d16h
default      datadir-0-kafka-cp-kafka-2                                                                                   Bound    pvc-kafkad03-guid-xxxx-xxxx-xxxxxxxxxxxx  200Gi      RWO            default        3d16h
default      datadir-kafka-cp-zookeeper-0                                                                                 Bound    pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datadir-kafka-cp-zookeeper-1                                                                                 Bound    pvc-zooked02-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datadir-kafka-cp-zookeeper-2                                                                                 Bound    pvc-zooked03-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datalogdir-kafka-cp-zookeeper-0                                                                              Bound    pvc-zookel01-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datalogdir-kafka-cp-zookeeper-1                                                                              Bound    pvc-zookel02-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
default      datalogdir-kafka-cp-zookeeper-2                                                                              Bound    pvc-zookel03-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        3d16h
logging      es-data-es-data-efk-logging-cluster-default-0                                                                Bound    pvc-eslogdpvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
logging      es-data-es-master-efk-logging-cluster-default-0                                                              Bound    pvc-eslogmpvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
monitoring   alertmanager-prom-prometheus-operator-alertmanager-db-alertmanager-prom-prometheus-operator-alertmanager-0   Bound    pvc-promapvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
monitoring   prom-grafana                                                                                                 Bound    pvc-grafad01-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
monitoring   prometheus-prom-prometheus-operator-prometheus-db-prometheus-prom-prometheus-operator-prometheus-0           Bound    pvc-promppvc-guid-xxxx-xxxx-xxxxxxxxxxxx  10Gi       RWO            default        10d
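To confirm which Azure disk sits behind each stuck claim, the following mapping should work, assuming the in-tree kubernetes.io/azure-disk provisioner (which the storage-provisioner annotation further below indicates); a sketch using one of my claim names:

# PVC -> PV name
kubectl get pvc datadir-0-kafka-cp-kafka-0 --namespace default --output jsonpath='{.spec.volumeName}'

# PV -> underlying Azure disk URI
kubectl get pv pvc-kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx --output jsonpath='{.spec.azureDisk.diskURI}'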

I tried scaling the Kafka StatefulSet down to 0, waiting a long while, and then scaling back up to 3, but the pods didn't recover.
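Roughly (StatefulSet name as it appears in the pod names above):

kubectl scale statefulset kafka-cp-kafka --namespace default --replicas=0
# ...waited a long while...
kubectl scale statefulset kafka-cp-kafka --namespace default --replicas=3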

Then I tried scaling all Deployments and StatefulSets down to 0 and doing a same-version upgrade of the K8S cluster. Unfortunately, because of a problem (reported here) with the VMAccessForLinux extension I had installed on the VMSS (following this guide to update SSH credentials on the nodes), the upgrade failed 2.5 hours later, and the cluster remained in a Failed state. Now all of the pods with PVCs are stuck in ContainerCreating. I successfully added a second nodepool, but pods placed on the new nodes still reported the same error, so I removed the second nodepool and scaled the first nodepool down to 1. I then tried to reboot the node, both from the Azure portal and from within an SSH connection; both attempts failed because of the issue with the extension. Finally, I tried to gradually scale down all StatefulSets (I had to uninstall the prometheus-operator Helm release, since it insisted on scaling the alertmanager StatefulSet back up) and re-enable only the logging StatefulSets, as they are smaller. It didn't help.
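In case it helps anyone else stuck behind the same extension, this is roughly the az CLI sequence I would expect to remove it (the node resource group is a placeholder; a sketch only, I have not verified that it unblocks the upgrade):

# Remove the offending extension from the scale set model
az vmss extension delete --resource-group <node-resource-group> --vmss-name aks-nodepool1-00011122-vmss --name VMAccessForLinux

# Roll the model change out to the instances
az vmss update-instances --resource-group <node-resource-group> --name aks-nodepool1-00011122-vmss --instance-ids '*'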

After taking down all StatefulSets, running kubectl get nodes --output json | jq '.items[].status.volumesInUse' returns null.
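Since the error message says to "delete/detach the disk explicitly", a manual detach via the az CLI might be a workaround. A sketch only; the LUN has to be looked up first, and I don't know whether doing this behind AKS's back is safe:

# List the data disks (with LUNs) still attached to each VMSS instance
az vmss list-instances --resource-group <node-resource-group> --name aks-nodepool1-00011122-vmss --query '[].{id:instanceId, disks:storageProfile.dataDisks[].{name:name, lun:lun}}'

# Explicitly detach the stuck disk from the instance it is wedged on (instance 0 per the errors above)
az vmss disk detach --resource-group <node-resource-group> --vmss-name aks-nodepool1-00011122-vmss --instance-id 0 --lun <lun-of-stuck-disk>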

What you expected to happen:
Pods with PVCs should start normally, and if mounting fails, it should eventually (and somewhat quickly) retry and succeed.

How to reproduce it (as minimally and precisely as possible):

I have no idea. This happens randomly.
Up to now we have worked around it by removing our PVCs (sketch below), but I don't want to do this any more; I need a real solution.
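For completeness, the workaround amounts to the following, which is destructive:

# DESTRUCTIVE: drops the claim and, under the default reclaim policy, the underlying disk and its data
kubectl delete pvc datadir-0-kafka-cp-kafka-0 --namespace default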

Anything else we need to know?:

This is similar to the following issues reported on Kubernetes and AKS. All of them have been closed, but none with a real solution, AFAIK.

I replaced the GUIDs to anonymize the logs, but kept them distinct, so the same placeholder always refers to the same object.

Environment:

  • Kubernetes version (use kubectl version): VMSS-backed westus-located K8S (1.14.6; aksEngineVersion : v0.40.2-aks)
  • Size of cluster (how many worker nodes are in the cluster?) 1 nodepool with 3 Standard_DS3_v2 instances.
  • General description of workloads in the cluster (e.g. HTTP microservices, Java app, Ruby on Rails, machine learning, etc.) Kafka, several dotnet core HTTP microservices, logging (FluentBit + ElasticSearch + Kibana stack), monitoring (prometheus + grafana).
  • Others:
cp-kafka:
  enabled: true
  brokers: 3
  persistence:
    enabled: true
    size: 200Gi
    storageClass: ~
    disksPerBroker: 1

  configurationOverrides:
    "auto.create.topics.enable": "true"
    "num.partitions": "10"
    "log.retention.bytes": "180000000000"
    • Kafka PVC (kubectl get pvc xxx --output json):
{
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {
        "annotations": {
            "pv.kubernetes.io/bind-completed": "yes",
            "pv.kubernetes.io/bound-by-controller": "yes",
            "volume.beta.kubernetes.io/storage-provisioner": "kubernetes.io/azure-disk"
        },
        "creationTimestamp": "2019-10-13T12:00:00Z",
        "finalizers": [
            "kubernetes.io/pvc-protection"
        ],
        "labels": {
            "app": "cp-kafka",
            "release": "kafka"
        },
        "name": "datadir-0-kafka-cp-kafka-0",
        "namespace": "default",
        "resourceVersion": "3241128",
        "selfLink": "/api/v1/namespaces/default/persistentvolumeclaims/datadir-0-kafka-cp-kafka-0",
        "uid": "kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx"
    },
    "spec": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "resources": {
            "requests": {
                "storage": "200Gi"
            }
        },
        "storageClassName": "default",
        "volumeMode": "Filesystem",
        "volumeName": "pvc-kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx"
    },
    "status": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "capacity": {
            "storage": "200Gi"
        },
        "phase": "Bound"
    }
}