This step is not applicable if you are using the managed CSI driver on AKS.
- find csi driver controller pod
There could be multiple controller pods, but only one of them is the leader. If the logs you collected are not helpful, get the logs from the leader controller pod.
kubectl get po -o wide -n kube-system | grep csi-blob-controller
NAME                                   READY   STATUS    RESTARTS   AGE   IP            NODE
csi-blob-controller-56bfddd689-dh5tk   4/4     Running   0          35s   10.240.0.19   k8s-agentpool-22533604-0
csi-blob-controller-56bfddd689-sl4ll   4/4     Running   0          35s   10.240.0.23   k8s-agentpool-22533604-1
- get pod description and logs
kubectl describe pod csi-blob-controller-56bfddd689-dh5tk -n kube-system > csi-blob-controller-description.log
kubectl logs csi-blob-controller-56bfddd689-dh5tk -c blob -n kube-system > csi-blob-controller.log
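Since it is not obvious from the pod list which controller pod is the leader, one option is to collect the description and logs of every controller pod in one pass. The following is a minimal sketch; the function names `pod_names` and `collect_controller_logs` are hypothetical helpers, not part of the driver.

```shell
# pod_names PREFIX: read `kubectl get po` output on stdin and print the
# names (first column) of the pods whose name starts with PREFIX.
pod_names() {
    awk -v prefix="$1" 'index($1, prefix) == 1 { print $1 }'
}

# collect_controller_logs: save the description and the `blob` container
# logs of every csi-blob-controller pod into the current directory.
# (hypothetical helper; output file names are an assumption)
collect_controller_logs() {
    kubectl get po -n kube-system --no-headers | pod_names csi-blob-controller |
    while read -r pod; do
        kubectl describe pod "$pod" -n kube-system > "$pod-description.log"
        kubectl logs "$pod" -c blob -n kube-system > "$pod.log"
    done
}
```

Run `collect_controller_logs` from an empty directory, then inspect the `.log` file of each pod.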
- locate the csi driver node pod that performs the actual volume mount/unmount (the one running on the same node as the workload pod)
kubectl get po -o wide -n kube-system | grep csi-blob-node
NAME                  READY   STATUS    RESTARTS   AGE    IP            NODE
csi-blob-node-cvgbs   3/3     Running   0          7m4s   10.240.0.35   k8s-agentpool-22533604-1
csi-blob-node-dr4s4   3/3     Running   0          7m4s   10.240.0.4    k8s-agentpool-22533604-0
- get pod description and logs
kubectl describe pod csi-blob-node-cvgbs -n kube-system > csi-blob-node-description.log
kubectl logs csi-blob-node-cvgbs -c blob -n kube-system > csi-blob-node.log
note: to watch logs in real time from multiple csi-blob-node DaemonSet pods simultaneously, run the command:
kubectl logs daemonset/csi-blob-node -c blob -n kube-system -f
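To find out which csi-blob-node pod handles the mount for a particular workload, you can look up the node the workload pod is scheduled on and then list the driver pod on that node. A minimal sketch, where `node_driver_pod` is a hypothetical helper name and the workload namespace defaulting to `default` is an assumption:

```shell
# node_driver_pod POD [NAMESPACE]: print the csi-blob-node pod that runs
# on the same node as the given workload pod (hypothetical helper).
node_driver_pod() {
    # node on which the workload pod is scheduled
    node=$(kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.spec.nodeName}')
    # pods in kube-system on that node, filtered to the driver DaemonSet pods
    kubectl get po -n kube-system -o name --field-selector "spec.nodeName=$node" |
        grep '^pod/csi-blob-node-'
}
```

For example, `node_driver_pod my-app my-namespace` prints a name such as `pod/csi-blob-node-cvgbs`, which can then be passed to `kubectl describe` and `kubectl logs`.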
get blobfuse-proxy logs on the node
journalctl -u blobfuse-proxy -l
note: if there are no logs for blobfuse-proxy, check the status of the blobfuse-proxy service by running the command:
systemctl status blobfuse-proxy
- check blobfuse mount inside driver
kubectl exec -it csi-blob-node-9vl9t -c blob -n kube-system -- mount | grep blobfuse
blobfuse on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-efce16db-bf15-4634-b82b-068385019d7c/globalmount type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
blobfuse on /var/lib/kubelet/pods/e73d0984-a253-4203-9e8c-9237ae5c55d5/volumes/kubernetes.io~csi/pvc-efce16db-bf15-4634-b82b-068385019d7c/mount type fuse (rw,relatime,user_id=0,group_id=0,allow_other)
- check nfs mount inside driver
kubectl exec -it csi-blob-node-9vl9t -n kube-system -c blob -- mount | grep nfs
accountname.file.core.windows.net:/accountname/pvcn-46c357b2-333b-4c42-8a7f-2133023d6c48 on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-46c357b2-333b-4c42-8a7f-2133023d6c48/globalmount type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.244.0.6,local_lock=none,addr=20.150.29.168)
accountname.file.core.windows.net:/accountname/pvcn-46c357b2-333b-4c42-8a7f-2133023d6c48 on /var/lib/kubelet/pods/7994e352-a4ee-4750-8cb4-db4fcf48543e/volumes/kubernetes.io~csi/pvc-46c357b2-333b-4c42-8a7f-2133023d6c48/mount type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.244.0.6,local_lock=none,addr=20.150.29.168)
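In the sample outputs above, a mounted volume shows up twice: once under `globalmount` and once under the pod's `volumes` directory. When many volumes are mounted, it can help to filter the `mount` output down to a single PV. A minimal sketch, assuming the `X on PATH type FSTYPE (...)` line format shown above; `pv_mounts` is a hypothetical helper name:

```shell
# pv_mounts PV_NAME: read `mount` output on stdin and print the mount
# points (third field) whose path contains /PV_NAME/ as a path component.
pv_mounts() {
    awk -v pv="$1" 'index($3, "/" pv "/") { print $3 }'
}
```

For example: `kubectl exec csi-blob-node-9vl9t -c blob -n kube-system -- mount | pv_mounts pvc-efce16db-bf15-4634-b82b-068385019d7c`. If only the `globalmount` path is printed, the volume was staged on the node but not published to the pod.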
- update controller deployment
kubectl edit deployment csi-blob-controller -n kube-system
- update node daemonset
kubectl edit ds csi-blob-node -n kube-system
change the following fields in the container spec, e.g.
image: mcr.microsoft.com/k8s/csi/blob-csi:v1.4.0
imagePullPolicy: Always
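As a non-interactive alternative to `kubectl edit`, the image can also be changed with `kubectl set image`. This is a sketch under the assumption that the driver container is named `blob` in both workloads (as in the `-c blob` commands above); `update_driver_image` is a hypothetical helper name. Note that this changes only the image, not `imagePullPolicy`.

```shell
# update_driver_image TAG: point the `blob` container of the controller
# deployment and the node daemonset at the given image tag.
update_driver_image() {
    image="mcr.microsoft.com/k8s/csi/blob-csi:$1"
    kubectl set image deployment/csi-blob-controller "blob=$image" -n kube-system
    kubectl set image daemonset/csi-blob-node "blob=$image" -n kube-system
}
```

For example, `update_driver_image v1.4.0`.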
blobfuse2 -v
blobfuse2 version 2.1.2
mount | grep blobfuse | uniq
- Troubleshooting blobfuse mount failure on the agent node
- collect logs
/var/log/messages
If there is a blobfuse mount failure, refer to the blobfuse driver troubleshooting guide.
- collect logs
- blobfuse
To check if blobfuse mount would work on the agent node, run the following commands to verify that the storage account name, key, and container name are correct. If any of these are incorrect, the blobfuse mount will fail:
mkdir test
export AZURE_STORAGE_ACCOUNT=
export AZURE_STORAGE_ACCESS_KEY=
# only for sovereign cloud
# export AZURE_STORAGE_BLOB_ENDPOINT=accountname.blob.core.chinacloudapi.cn
blobfuse2 test --container-name=CONTAINER-NAME --tmp-path=/tmp/blobfuse -o allow_other --file-cache-timeout-in-seconds=120
You can find more detailed information about environment variables at https://github.com/Azure/azure-storage-fuse#environment-variables.
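Because an empty account name or key makes the manual mount fail, it can save time to verify that the variables are set before running blobfuse2. A minimal sketch that covers only the two variables exported above; `missing_blobfuse_env` is a hypothetical helper name:

```shell
# missing_blobfuse_env: print the names of required variables that are
# unset or empty; return 0 only if none are missing.
missing_blobfuse_env() {
    missing=""
    [ -n "${AZURE_STORAGE_ACCOUNT:-}" ] || missing="$missing AZURE_STORAGE_ACCOUNT"
    [ -n "${AZURE_STORAGE_ACCESS_KEY:-}" ] || missing="$missing AZURE_STORAGE_ACCESS_KEY"
    # strip the leading space before printing
    echo "${missing# }"
    [ -z "$missing" ]
}
```

For example, `missing_blobfuse_env || echo "set the variables listed above first"`. This checks only that the values are non-empty, not that they are correct.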
- NFSv3
mkdir /tmp/test
mount -v -t nfs -o sec=sys,vers=3,nolock accountname.blob.core.windows.net:/accountname/container-name /tmp/test
Get client-side logs on the AKS Linux node if there is a mount error
# get ama-logs pod which is running on the AKS Linux node
kubectl get po -n kube-system -o wide | grep ama-logs
# get blobfuse2 logs
kubectl -n kube-system cp ama-logs-xxxx:/var/log/blobfuse2.log /tmp/blobfuse2.log
Get client-side logs on the Linux node if there is a mount error
kubectl debug node/node-name --image=nginx
# get blobfuse2 logs
kubectl cp node-debugger-node-name-xxxx:/host/var/log/blobfuse2.log /tmp/blobfuse2.log
# after the logs have been collected, you can delete the debug pod
kubectl delete po node-debugger-node-name-xxxx
Supported from v1.22.2.
About the aznfs mount helper: https://github.com/Azure/AZNFS-mount/
Check mount point information
kubectl debug node/node-name --image=nginx
findmnt -t nfs
The SOURCE of the mount point should start with an IP address rather than a domain name, e.g. 10.161.100.100:/nfs02a796c105814dbebc4e/pvc-ca149059-6872-4d6f-a806-48402648110c.
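The check on SOURCE can be scripted. The sketch below tests whether a SOURCE value starts with an IPv4 address followed by `:/`; it is a format check only and does not validate the address itself. `source_uses_ip` is a hypothetical helper name.

```shell
# source_uses_ip SOURCE: succeed if SOURCE looks like "<IPv4>:/<path>",
# fail if it starts with anything else (for example a domain name).
source_uses_ip() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}:/'
}
```

On the node it could be combined with findmnt, for example: `findmnt -t nfs -n -o SOURCE | while read -r s; do source_uses_ip "$s" || echo "not an IP: $s"; done`.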
Get client-side logs on Linux node
kubectl debug node/node-name --image=nginx
cat /opt/microsoft/aznfs/data/aznfs.log
If the IP was migrated successfully, you should find logs like:
IP for nfsxxxxx.blob.core.windows.net changed [1.2.3.4 -> 5.6.7.8].
Updating mountmap entry [nfsxxxxx.blob.core.windows.net 10.161.100.100 1.2.3.4 -> nfsxxxxx.blob.core.windows.net 10.161.100.100 5.6.7.8]
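To pick the IP changes out of a long aznfs.log, the `IP for ... changed [old -> new]` lines can be extracted with sed. This sketch assumes the log lines have exactly the format of the sample above; `ip_changes` is a hypothetical helper name.

```shell
# ip_changes: read aznfs.log content on stdin and print one line per
# detected IP change in the form "<host> <old-ip> <new-ip>".
ip_changes() {
    sed -n 's/.*IP for \([^ ]*\) changed \[\([^ ]*\) -> \([^]]*\)].*/\1 \2 \3/p'
}
```

For example, `ip_changes < /opt/microsoft/aznfs/data/aznfs.log`. No output means no line in that format was found.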