What happened:
Pods with PVCs are stuck in the ContainerCreating state due to a problem with volume attachment/mounting.
I am using a VMSS-backed, westus-located K8S cluster (1.14.6; aksEngineVersion: v0.40.2-aks). Following a crash of the Kafka pods (using Confluent helm charts v5.3.1; see configuration below, under Environment), 2 of the 3 got stuck in the ContainerCreating state. The dashboard seems to show that all the PVCs are failing to mount because of one volume that was not detached properly:
kafka-cp-kafka
Unable to mount volumes for pod "kafka-cp-kafka-0_default(kafkapod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "default"/"kafka-cp-kafka-0". list of unmounted volumes=[datadir-0]. list of unattached volumes=[datadir-0 jmx-config default-token-xxxcc]
kafka-cp-zookeeper
AttachVolume.Attach failed for volume "pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "kafka-cp-zookeeper-0_default(zookepod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "default"/"kafka-cp-zookeeper-0". list of unmounted volumes=[datadir datalogdir]. list of unattached volumes=[datadir datalogdir jmx-config default-token-xxxcc]
es-data-efk-logging-cluster-default
AttachVolume.Attach failed for volume "pvc-eslogdpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-eslogdpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "es-data-efk-logging-cluster-default-0_logging(eslogdpod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "logging"/"es-data-efk-logging-cluster-default-0". list of unmounted volumes=[es-data]. list of unattached volumes=[es-data default-token-xxxdd]
es-master-efk-logging-cluster-default
AttachVolume.Attach failed for volume "pvc-eslogmpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-eslogmpvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "es-master-efk-logging-cluster-default-0_logging(eslogmpod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "logging"/"es-master-efk-logging-cluster-default-0". list of unmounted volumes=[es-data]. list of unattached volumes=[es-data default-token-xxxdd]
prometheus-prom-prometheus-operator-prometheus
AttachVolume.Attach failed for volume "pvc-promppvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-promppvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "prometheus-prom-prometheus-operator-prometheus-0_monitoring(promppod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "monitoring"/"prometheus-prom-prometheus-operator-prometheus-0". list of unmounted volumes=[prometheus-prom-prometheus-operator-prometheus-db]. list of unattached volumes=[prometheus-prom-prometheus-operator-prometheus-db config config-out prometheus-prom-prometheus-operator-prometheus-rulefiles-0 prom-prometheus-operator-prometheus-token-xxxee]
alertmanager-prom-prometheus-operator-alertmanager
AttachVolume.Attach failed for volume "pvc-promapvc-guid-xxxx-xxxx-xxxxxxxxxxxx" : Attach volume "kubernetes-dynamic-pvc-promapvc-guid-xxxx-xxxx-xxxxxxxxxxxx" to instance "aks-nodepool1-00011122-vmss000000" failed with compute.VirtualMachineScaleSetVMsClient#Update: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="AttachDiskWhileBeingDetached" Message="Cannot attach data disk 'diskfail-guid-xxxx-xxxx-xxxxxxxxxxxx' to VM 'aks-nodepool1-00011122-vmss_0' because the disk is currently being detached or the last detach operation failed. Please wait until the disk is completely detached and then try again or delete/detach the disk explicitly again."
Unable to mount volumes for pod "alertmanager-prom-prometheus-operator-alertmanager-0_monitoring(promapod-guid-xxxx-xxxx-xxxxxxxxxxxx)": timeout expired waiting for volumes to attach or mount for pod "monitoring"/"alertmanager-prom-prometheus-operator-alertmanager-0". list of unmounted volumes=[alertmanager-prom-prometheus-operator-alertmanager-db]. list of unattached volumes=[alertmanager-prom-prometheus-operator-alertmanager-db config-volume prom-prometheus-operator-alertmanager-token-xxxff]
Running `kubectl get pvc` shows the PVCs in the Bound state (full YAML/JSON from the Dashboard below, under Environment):
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
default datadir-0-kafka-cp-kafka-0 Bound pvc-kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx 200Gi RWO default 3d16h
default datadir-0-kafka-cp-kafka-1 Bound pvc-kafkad02-guid-xxxx-xxxx-xxxxxxxxxxxx 200Gi RWO default 3d16h
default datadir-0-kafka-cp-kafka-2 Bound pvc-kafkad03-guid-xxxx-xxxx-xxxxxxxxxxxx 200Gi RWO default 3d16h
default datadir-kafka-cp-zookeeper-0 Bound pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 3d16h
default datadir-kafka-cp-zookeeper-1 Bound pvc-zooked02-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 3d16h
default datadir-kafka-cp-zookeeper-2 Bound pvc-zooked03-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 3d16h
default datalogdir-kafka-cp-zookeeper-0 Bound pvc-zookel01-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 3d16h
default datalogdir-kafka-cp-zookeeper-1 Bound pvc-zookel02-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 3d16h
default datalogdir-kafka-cp-zookeeper-2 Bound pvc-zookel03-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 3d16h
logging es-data-es-data-efk-logging-cluster-default-0 Bound pvc-eslogdpvc-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 10d
logging es-data-es-master-efk-logging-cluster-default-0 Bound pvc-eslogmpvc-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 10d
monitoring alertmanager-prom-prometheus-operator-alertmanager-db-alertmanager-prom-prometheus-operator-alertmanager-0 Bound pvc-promapvc-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 10d
monitoring prom-grafana Bound pvc-grafad01-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 10d
monitoring prometheus-prom-prometheus-operator-prometheus-db-prometheus-prom-prometheus-operator-prometheus-0 Bound pvc-promppvc-guid-xxxx-xxxx-xxxxxxxxxxxx 10Gi RWO default 10d
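The attach errors above name the disk as kubernetes-dynamic-&lt;PV name&gt;, so the Azure-side state of the stuck disk can be checked directly; a sketch, with a placeholder for the cluster's managed (MC_*) resource group:

```
# Check whether Azure still considers the disk attached (resource group is a placeholder)
az disk show \
  --resource-group MC_myResourceGroup_myCluster_westus \
  --name kubernetes-dynamic-pvc-zookepvc-guid-xxxx-xxxx-xxxxxxxxxxxx \
  --query '{state: diskState, managedBy: managedBy}'
```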
I tried scaling the Kafka StatefulSet down to 0, waiting a long while, and then scaling back to 3, but the pods didn't recover.
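For reference, the scale-down/scale-up attempt was roughly the following (StatefulSet name as created by the chart):

```
kubectl scale statefulset kafka-cp-kafka --replicas=0 -n default
# ...wait for the pods to terminate and, in theory, for the disks to detach...
kubectl scale statefulset kafka-cp-kafka --replicas=3 -n default
```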
Then I tried to scale all Deployments and StatefulSets down to 0 and do a same-version upgrade of the K8S cluster. Unfortunately, because of a problem (reported here) with the VMAccessForLinux extension I had installed on the VMSS (following this guide to update SSH credentials on the nodes), the upgrade failed 2.5 hours later and the cluster remained in a Failed state. Now all of the pods with PVCs got stuck in ContainerCreating. I successfully added a second nodepool, but pods placed on the new nodes still reported the same error, so I removed the second nodepool and scaled the first nodepool down to 1. I then tried to reboot the node, both from the Azure portal and from within an SSH connection; both attempts failed because of the issue with the extension. I then tried to gradually scale down all StatefulSets (I had to uninstall the prometheus-operator Helm release, since it insisted on scaling the alertmanager StatefulSet back up) and enable only the logging StatefulSets, as they are smaller. It didn't help.
After taking down all StatefulSets, running `kubectl get nodes --output json | jq '.items[].status.volumesInUse'` returns `null`.
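The node's `.status.volumesAttached` can be checked the same way, and the Azure side can be compared against it; a sketch, with placeholder resource-group and VMSS names based on the node name above:

```
# What the attach/detach controller thinks is attached
kubectl get nodes --output json | jq '.items[].status.volumesAttached'

# What Azure thinks is attached to each VMSS instance (resource group is a placeholder)
az vmss list-instances \
  --resource-group MC_myResourceGroup_myCluster_westus \
  --name aks-nodepool1-00011122-vmss \
  --query '[].{instance: name, dataDisks: storageProfile.dataDisks[].name}'
```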
What you expected to happen:
Pods with PVCs should start normally, and if mounting fails, the attach/mount should be retried and eventually (reasonably quickly) succeed.
How to reproduce it (as minimally and precisely as possible):
I have no idea. This happens randomly.
Up to now, we have worked around it by removing our PVCs, but I don't want to do this any more; I need a proper solution.
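A less destructive workaround, which the error message itself hints at, would be to detach the stuck disk explicitly; a sketch with the Azure CLI, assuming the managed resource group and the disk's LUN are looked up first:

```
# Explicitly detach the data disk from the VMSS instance named in the error
# (resource group and LUN are placeholders, e.g. found via `az vmss list-instances`)
az vmss disk detach \
  --resource-group MC_myResourceGroup_myCluster_westus \
  --vmss-name aks-nodepool1-00011122-vmss \
  --instance-id 0 \
  --lun 2
```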
Anything else we need to know?:
This is similar to the following issues, reported on Kubernetes and AKS. All of them have been closed, but none with a real solution AFAIK.
- Timeout expired waiting for volumes to attach kubernetes/kubernetes#67014
- Unable to mount volumes for pod: timeout expired waiting for volumes to attach or mount kubernetes/kubernetes#65500
- Persistent Volume failing to mount - timeout - AKS kubernetes/kubernetes#75548
- Trouble attaching volume #884
- the disk is currently being detached or the last detach operation failed #615
- 'timeout expired waiting for volumes to attach/mount for pod when cluster' when node-vm-size is Standard_B1s #166
I replaced the GUIDs to anonymize the logs, but kept distinct GUIDs distinct.
Environment:
- Kubernetes version (use `kubectl version`): VMSS-backed, westus-located K8S (1.14.6; aksEngineVersion: v0.40.2-aks)
- Size of cluster (how many worker nodes are in the cluster?): 1 nodepool with 3 Standard_DS3_v2 instances.
- General description of workloads in the cluster (e.g. HTTP microservices, Java app, Ruby on Rails, machine learning, etc.): Kafka, several .NET Core HTTP microservices, logging (Fluent Bit + Elasticsearch + Kibana stack), monitoring (Prometheus + Grafana).
- Others:
- Kafka helm configuration (using Confluent helm charts v5.3.1):
cp-kafka:
  enabled: true
  brokers: 3
  persistence:
    enabled: true
    size: 200Gi
    storageClass: ~
    disksPerBroker: 1
  configurationOverrides:
    "auto.create.topics.enable": "true"
    "num.partitions": "10"
    "log.retention.bytes": "180000000000"
- Kafka PVC YAML (`kubectl get pvc xxx --output json`):
{
  "apiVersion": "v1",
  "kind": "PersistentVolumeClaim",
  "metadata": {
    "annotations": {
      "pv.kubernetes.io/bind-completed": "yes",
      "pv.kubernetes.io/bound-by-controller": "yes",
      "volume.beta.kubernetes.io/storage-provisioner": "kubernetes.io/azure-disk"
    },
    "creationTimestamp": "2019-10-13T12:00:00Z",
    "finalizers": [
      "kubernetes.io/pvc-protection"
    ],
    "labels": {
      "app": "cp-kafka",
      "release": "kafka"
    },
    "name": "datadir-0-kafka-cp-kafka-0",
    "namespace": "default",
    "resourceVersion": "3241128",
    "selfLink": "/api/v1/namespaces/default/persistentvolumeclaims/datadir-0-kafka-cp-kafka-0",
    "uid": "kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx"
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "resources": {
      "requests": {
        "storage": "200Gi"
      }
    },
    "storageClassName": "default",
    "volumeMode": "Filesystem",
    "volumeName": "pvc-kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx"
  },
  "status": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "capacity": {
      "storage": "200Gi"
    },
    "phase": "Bound"
  }
}
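For completeness, the PV named in `.spec.volumeName` carries the actual Azure disk reference (this cluster uses the in-tree azure-disk provisioner, per the storage-provisioner annotation above); a sketch:

```
# Resolve the PVC's volume to the underlying Azure managed disk
kubectl get pv pvc-kafkad01-guid-xxxx-xxxx-xxxxxxxxxxxx \
  --output jsonpath='{.spec.azureDisk.diskName}{"\n"}{.spec.azureDisk.diskURI}{"\n"}'
```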