# Describe the bug

When using both `topologyConstrainedPools` and `volumeNamePrefix`, the RBD image fails to mount, claiming "image not found". The image was created, but the `csi-rbdplugin` pod looked it up using the default `csi-vol-` prefix instead of the prefix I specified in my storageClass.
# Environment details

- Image/version of Ceph CSI driver: v3.8.0
- Helm chart version: Rook v1.11.4
- Kernel version: 5.15.0-69-generic
- Mounter used for mounting PVC (for CephFS it's `fuse` or `kernel`; for RBD it's `krbd` or `rbd-nbd`): the Rook default, I believe `krbd`
- Kubernetes cluster version: 1.26.1
- Ceph cluster version: v17.2.6
# Steps to reproduce

Steps to reproduce the behavior:

Create a storageClass using `topologyConstrainedPools` and a custom `volumeNamePrefix`; I don't know whether this is specific to my cluster or generic. I also had `stripeCount`, `stripeUnit`, `encrypted`, and `encryptionKMSID` specified, in case that matters.

These are the parameters I used, skipping anything that isn't at least slightly specific to my case (a sketch of the complete StorageClass follows the snippet):
```yaml
imageFeatures: layering,exclusive-lock,object-map,fast-diff
stripeCount: "2"
stripeUnit: "262144"
encrypted: "true"
encryptionKMSID: pvc-encrypt-metadata-global
topologyConstrainedPools: |
  [ {
    "poolName":"nonresilient-nvme-beef1",
    "domainSegments":[
      {"domainLabel":"hostname","value":"beef1"}
    ]
  }, {
    "poolName":"nonresilient-nvme-beef3",
    "domainSegments":[
      {"domainLabel":"hostname","value":"beef3"}
    ]
  }, {
    "poolName":"nonresilient-nvme-pork1",
    "domainSegments":[
      {"domainLabel":"hostname","value":"pork1"}
    ]
  }, {
    "poolName":"nonresilient-nvme-pork2",
    "domainSegments":[
      {"domainLabel":"hostname","value":"pork2"}
    ]
  }, {
    "poolName":"nonresilient-nvme-valendar",
    "domainSegments":[
      {"domainLabel":"hostname","value":"valendar"}
    ]
  }, {
    "poolName":"nonresilient-nvme-calgarius",
    "domainSegments":[
      {"domainLabel":"hostname","value":"calgarius"}
    ]
  } ]
```
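For completeness, here is a minimal sketch of how those parameters sit in the full StorageClass. This is not my exact manifest: the metadata name, clusterID, and secret names are placeholders in the style of the stock Rook examples, and `volumeNamePrefix: local-encrypt` is shown explicitly because that is the prefix the created image actually got (see the image name in the results below):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nonresilient-nvme            # placeholder name
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  # placeholder clusterID/secret values in the style of the stock Rook examples
  clusterID: rook-ceph
  csi.storage.k8s.io/fstype: ext4
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  # the custom prefix that triggers the bug; inferred from the created image name below
  volumeNamePrefix: local-encrypt
  imageFeatures: layering,exclusive-lock,object-map,fast-diff
  stripeCount: "2"
  stripeUnit: "262144"
  encrypted: "true"
  encryptionKMSID: pvc-encrypt-metadata-global
  # trimmed to one pool here; the full list is in the snippet above
  topologyConstrainedPools: |
    [ { "poolName":"nonresilient-nvme-beef1",
        "domainSegments":[ {"domainLabel":"hostname","value":"beef1"} ] } ]
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```

With `WaitForFirstConsumer`, a PVC plus a pod consuming it is enough to trigger provisioning and hit the failed mount.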
# Actual results

The pod trying to mount had this status in `kubectl describe`:

```
Normal   SuccessfulAttachVolume  2m27s                attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-d35df4e1-1565-4617-bc55-3511b71c8dff"
Warning  FailedMount             24s                  kubelet                  Unable to attach or mount volumes: unmounted volumes=[pgdata], unattached volumes=[pgdata dshm kube-api-access-prq6w]: timed out waiting for the condition
Warning  FailedMount             10s (x9 over 2m18s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-d35df4e1-1565-4617-bc55-3511b71c8dff" : rpc error: code = Internal desc = error generating volume 0001-0009-roo
```
The following messages were repeated in the `csi-rbdplugin` logs:

```
csi-rbdplugin-l4ql6 csi-rbdplugin E0421 17:40:41.241131 2742545 rbd_journal.go:688] ID: 411 Req-ID: 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5 failed to get image id nonresilient-nvme-calgarius/csi-vol-00f94865-97fd-4d5c-bac3-63c6b65cd1a5: image not found: RBD image not found
csi-rbdplugin-l4ql6 csi-rbdplugin E0421 17:40:41.241284 2742545 nodeserver.go:194] ID: 411 Req-ID: 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5 error generating volume 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5: image not found: RBD image not found
csi-rbdplugin-l4ql6 csi-rbdplugin E0421 17:40:41.241368 2742545 utils.go:210] ID: 411 Req-ID: 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5 GRPC error: rpc error: code = Internal desc = error generating volume 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5: image not found: RBD image not found
csi-rbdplugin-l4ql6 csi-rbdplugin E0421 17:41:05.960441 2742545 rbd_journal.go:688] ID: 415 Req-ID: 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5 failed to get image id nonresilient-nvme-calgarius/csi-vol-00f94865-97fd-4d5c-bac3-63c6b65cd1a5: image not found: RBD image not found
csi-rbdplugin-l4ql6 csi-rbdplugin E0421 17:41:05.960581 2742545 nodeserver.go:194] ID: 415 Req-ID: 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5 error generating volume 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5: image not found: RBD image not found
csi-rbdplugin-l4ql6 csi-rbdplugin E0421 17:41:05.960655 2742545 utils.go:210] ID: 415 Req-ID: 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5 GRPC error: rpc error: code = Internal desc = error generating volume 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5: image not found: RBD image not found
```
The image `local-encrypt00f94865-97fd-4d5c-bac3-63c6b65cd1a5` was correctly created, but the logs show the plugin trying to get the image ID for `csi-vol-00f94865-97fd-4d5c-bac3-63c6b65cd1a5`. The PersistentVolume was correctly created in Kubernetes and correctly references `imageName=local-encrypt00f94865-97fd-4d5c-bac3-63c6b65cd1a5`.

- Image name: `local-encrypt00f94865-97fd-4d5c-bac3-63c6b65cd1a5`
- Name used for lookup: `csi-vol-00f94865-97fd-4d5c-bac3-63c6b65cd1a5`
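For reference, this is roughly the relevant part of the PV spec (trimmed from `kubectl get pv ... -o yaml`; the `volumeHandle` matches the Req-ID in the plugin logs above, and attributes I didn't record are omitted):

```yaml
spec:
  csi:
    driver: rook-ceph.rbd.csi.ceph.com
    volumeHandle: 0001-0009-rook-ceph-0000000000000057-00f94865-97fd-4d5c-bac3-63c6b65cd1a5
    volumeAttributes:
      imageName: local-encrypt00f94865-97fd-4d5c-bac3-63c6b65cd1a5
      pool: nonresilient-nvme-calgarius
      # (other attributes omitted)
```

So the mismatch seems to be on the node side: the lookup regenerates the image name from the `volumeHandle` with the default `csi-vol-` prefix instead of using the `imageName` the PV already carries.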
# Additional context

I've been trying to set up topology-specific non-resilient pools so that data is stored on the same node as the pod using it, for cases where I don't need replication and just want relatively fast storage (I already gave all my spare storage to Ceph =]).