[BUG] VolumeSnapshot takes very long to become readyToUse #4555

Open

Description

Describe the bug

When we create a new VolumeSnapshot in our clusters, it takes at least 8 minutes before its readyToUse field is set to true. The snapshot appears in the Azure portal immediately after creation, so it looks like only the status update of the Kubernetes resource is slow.

This causes problems with Velero (our backup solution), because it waits until a VolumeSnapshot becomes readyToUse before it proceeds with the next backup. With a large number of volumes, the backups then fail.
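
To make the impact concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the snapshots are waited on strictly one after another, as described above, and uses the 8m24s (504 s) delay we observed. The volume count of 20 is a hypothetical example, not a number from our clusters.

```python
def sequential_wait_seconds(volume_count: int, per_snapshot_seconds: float) -> float:
    """Total time spent waiting if each snapshot is awaited one after another."""
    return volume_count * per_snapshot_seconds


# 504 s is the observed delay (8m24s); 20 volumes is a hypothetical example.
total = sequential_wait_seconds(20, 504)
print(f"{total / 60:.0f} minutes spent waiting for readyToUse")
```

Even a moderate number of volumes adds up to hours of pure waiting under these assumptions.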

We have been seeing this issue since the beginning of September.

To Reproduce

  1. Create a new VolumeSnapshot:
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: test-snapshot
spec:
  volumeSnapshotClassName: csi-azuredisk-vsc
  source:
    persistentVolumeClaimName: pgdata-alarm-postgres-db-0
  2. Watch the created VolumeSnapshot: kubectl get volumesnapshots.snapshot.storage.k8s.io -w
NAME            READYTOUSE   SOURCEPVC                    SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS       SNAPSHOTCONTENT   CREATIONTIME   AGE
test-snapshot                pgdata-alarm-postgres-db-0                                         csi-azuredisk-vsc                                    0s
test-snapshot                pgdata-alarm-postgres-db-0                                         csi-azuredisk-vsc                                    0s
test-snapshot   false        pgdata-alarm-postgres-db-0                                         csi-azuredisk-vsc   snapcontent-ae19d3d9-aafd-42d8-bee6-6159fc2648cb                  0s
test-snapshot   true         pgdata-alarm-postgres-db-0                           1Gi           csi-azuredisk-vsc   snapcontent-ae19d3d9-aafd-42d8-bee6-6159fc2648cb   8m23s          8m24s
  3. Watch the created VolumeSnapshotContent: kubectl get volumesnapshotcontents.snapshot.storage.k8s.io snapcontent-ae19d3d9-aafd-42d8-bee6-6159fc2648cb -w
NAME                                               READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER               VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT   VOLUMESNAPSHOTNAMESPACE   AGE
snapcontent-ae19d3d9-aafd-42d8-bee6-6159fc2648cb                              Retain           disk.csi.azure.com   csi-azuredisk-vsc     test-snapshot    alarm                     29s
snapcontent-ae19d3d9-aafd-42d8-bee6-6159fc2648cb   true         1073741824    Retain           disk.csi.azure.com   csi-azuredisk-vsc     test-snapshot    alarm                     8m24s
snapcontent-ae19d3d9-aafd-42d8-bee6-6159fc2648cb   true         1073741824    Retain           disk.csi.azure.com   csi-azuredisk-vsc     test-snapshot    alarm                     8m24s
  4. The VolumeSnapshot and VolumeSnapshotContent are set to readyToUse only after more than 8 minutes (8m24s in the output above).
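
To measure the delay without watching the output by hand, something like the following Python sketch could be used. It polls the snapshot with `kubectl get ... -o json` and compares `metadata.creationTimestamp` (when the Kubernetes object was created) with `status.creationTime` (when the storage system reports the snapshot was taken). The function names, polling interval, and timeout are our own assumptions, not part of any tool, and it assumes `kubectl` is on the PATH and configured for the cluster.

```python
import json
import subprocess
import time
from datetime import datetime, timezone


def parse_k8s_time(value: str) -> datetime:
    """Parse a Kubernetes timestamp such as 2024-09-10T08:00:01Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def snapshot_timing(snapshot: dict, observed_at: datetime) -> dict:
    """Summarize timing for one VolumeSnapshot object (already parsed from JSON)."""
    created = parse_k8s_time(snapshot["metadata"]["creationTimestamp"])
    status = snapshot.get("status") or {}
    taken = status.get("creationTime")  # absent until the driver reports it
    return {
        "ready": bool(status.get("readyToUse")),
        "seconds_until_snapshot_taken": (
            (parse_k8s_time(taken) - created).total_seconds() if taken else None
        ),
        "seconds_since_object_created": (observed_at - created).total_seconds(),
    }


def wait_until_ready(name: str, namespace: str,
                     poll_seconds: float = 5.0,
                     timeout_seconds: float = 1200.0) -> dict:
    """Poll the VolumeSnapshot until readyToUse is true, then return its timing."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        raw = subprocess.run(
            ["kubectl", "get", "volumesnapshot", name, "-n", namespace, "-o", "json"],
            check=True, capture_output=True, text=True,
        ).stdout
        timing = snapshot_timing(json.loads(raw), datetime.now(timezone.utc))
        if timing["ready"]:
            return timing
        time.sleep(poll_seconds)
    raise TimeoutError(f"{namespace}/{name} not readyToUse after {timeout_seconds}s")
```

Calling `wait_until_ready("test-snapshot", "alarm")` right after creating the snapshot returns both numbers. A small `seconds_until_snapshot_taken` combined with a large `seconds_since_object_created` would match what we see: the snapshot exists in Azure almost immediately, but the Kubernetes status lags behind by minutes.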

Expected behavior

The VolumeSnapshot should be marked readyToUse shortly after the snapshot has been created in Azure, so that Velero can complete the backups.

Environment (please complete the following information):

  • CLI Version v1.31.0
  • Kubernetes version v1.29.6 and v1.30.3
