
Get error when creating volume snapshot #300

Closed
igorgonibm opened this issue Apr 21, 2020 · 21 comments
Assignees
xing-yang
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@igorgonibm

When creating a snapshot, I get the following messages in the csi-snapshotter logs:
The full log, the snapshot content describe output, and the relevant YAMLs are attached.

I0421 09:23:09.254741 1 snapshot_controller.go:606] setAnnVolumeSnapshotBeingCreated: set annotation [snapshot.storage.kubernetes.io/volumesnapshot-being-created:yes] on content [snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762].
I0421 09:23:09.259302 1 snapshot_controller.go:179] updateContentStatusWithEvent[snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762]
I0421 09:23:09.263085 1 snapshot_controller.go:200] updating VolumeSnapshotContent[snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762] error status failed Operation cannot be fulfilled on volumesnapshotcontents.snapshot.storage.k8s.io "snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762": the object has been modified; please apply your changes to the latest version and try again
E0421 09:23:09.263110 1 snapshot_controller.go:139] createSnapshot [create-snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762]: error occurred in createSnapshotWrapper: failed to add VolumeSnapshotBeingCreated annotation on the content snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762: "snapshot controller failed to update snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762 on API server: Operation cannot be fulfilled on volumesnapshotcontents.snapshot.storage.k8s.io "snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762": the object has been modified; please apply your changes to the latest version and try again"
E0421 09:23:09.263208 1 goroutinemap.go:150] Operation for "create-snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762" failed. No retries permitted until 2020-04-21 09:23:09.763165038 +0000 UTC m=+6.819361135 (durationBeforeRetry 500ms). Error: "failed to add VolumeSnapshotBeingCreated annotation on the content snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762: "snapshot controller failed to update snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762 on API server: Operation cannot be fulfilled on volumesnapshotcontents.snapshot.storage.k8s.io \"snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762\": the object has been modified; please apply your changes to the latest version and try again""
yamls_and_log.zip
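
For context, the annotation the controller fails to set lives on the VolumeSnapshotContent object. A trimmed sketch of what that object looks like mid-creation, reconstructed from the log lines above (the driver name and volumeSnapshotRef are placeholders, not from this issue):

```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotContent
metadata:
  name: snapcontent-3f3c2bc8-2305-4b26-95f6-0e641e8e3762
  annotations:
    # set by the csi-snapshotter while the CSI CreateSnapshot call is in flight
    snapshot.storage.kubernetes.io/volumesnapshot-being-created: "yes"
spec:
  deletionPolicy: Delete                 # assumption
  driver: example.csi.vendor.com         # placeholder driver name
  volumeSnapshotRef:                     # points back at the bound VolumeSnapshot
    name: my-snapshot                    # placeholder
    namespace: default                   # placeholder
```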

@xing-yang
Collaborator

I see the following error message in your logs. It looks like you have not created the secret, but that secret is referenced in the snapshot class.

E0421 09:29:03.988544       1 snapshot_controller.go:535] Failed to get credentials for snapshot snapcontent-c62126d0-15af-4c45-a950-99e056a12604: error getting secret secret-67a in namespace kube-system: secrets "secret-67a" not found
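
For reference, a snapshot class that needs credentials points at the secret through the CSI secret parameters, and that secret must actually exist. A minimal sketch, assuming hypothetical contents for secret-67a and a placeholder driver name (the csi.storage.k8s.io/snapshotter-secret-* keys are the standard external-snapshotter parameters):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: secret-67a            # must match the name referenced by the class
  namespace: kube-system      # must match the referenced namespace
stringData:
  username: admin             # placeholder backend credentials
  password: changeme
---
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: snapclass
driver: example.csi.vendor.com   # placeholder driver name
deletionPolicy: Delete
parameters:
  csi.storage.k8s.io/snapshotter-secret-name: secret-67a
  csi.storage.k8s.io/snapshotter-secret-namespace: kube-system
```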

@igorgonibm
Author

igorgonibm commented Apr 21, 2020

@xing-yang I used a wrong secret at first but then fixed it to use the right one.
This error happened after the fix.
(As I mentioned, the snapshot was created on the storage backend; if the secret were wrong, that would not have happened.)

@igorgonibm
Author

igorgonibm commented Apr 22, 2020

@xing-yang I forgot to mention that I am on the team that develops this CSI driver, and I am adding snapshot support. Any idea what could be wrong? It's really blocking us from proceeding. Thank you!

@xing-yang
Collaborator

xing-yang commented Apr 22, 2020

@igorgonibm what image versions of the snapshot-controller and the csi-snapshotter sidecar are you using?

Can you provide the output of "kubectl describe volumesnapshot"? The error messages you posted are failures to update objects in the API server, but the retries succeeded later: the VolumeSnapshotContent has its status updated and ReadyToUse is set to true. I don't know what the status of the VolumeSnapshot is. Can you set the log level to 5 and reproduce the logs? I can't find any messages about updating the VolumeSnapshot status in the existing logs.
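
For anyone following along: the log level here is the klog verbosity flag on the csi-snapshotter container. A minimal sketch of where it goes, assuming a typical sidecar deployment (image tag and socket path are placeholders):

```yaml
# Fragment of the CSI driver's pod template; only the args matter here.
containers:
  - name: csi-snapshotter
    image: quay.io/k8scsi/csi-snapshotter:v2.1.0   # placeholder tag
    args:
      - "--csi-address=$(ADDRESS)"
      - "--v=5"   # verbosity 5 includes the status-update messages
    env:
      - name: ADDRESS
        value: /csi/csi.sock   # placeholder socket path
```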

@igorgonibm
Author

igorgonibm commented Apr 22, 2020

I created a new snapshot and attached the relevant YAMLs: describe output for the class, the snapshot, and the snapshot content, plus the snapshotter log.
The log level is set to 5.
Thank you.
logs.zip

@xing-yang
Collaborator

According to "kubectl describe volumesnapshot", the snapshot was created successfully.

Name:         snap20
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  snapshot.storage.k8s.io/v1beta1
Kind:         VolumeSnapshot
Metadata:
  Creation Timestamp:  2020-04-22T14:07:52Z
  Finalizers:
    snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  Generation:        1
  Resource Version:  2052306
  Self Link:         /apis/snapshot.storage.k8s.io/v1beta1/namespaces/default/volumesnapshots/snap20
  UID:               b52df980-4158-4af1-b0ad-bffa7d1a2937
Spec:
  Source:
    Persistent Volume Claim Name:  pvc-snapshot
  Volume Snapshot Class Name:      snapclass
Status:
  Bound Volume Snapshot Content Name:  snapcontent-b52df980-4158-4af1-b0ad-bffa7d1a2937
  Creation Time:                       2020-04-22T14:08:03Z
  Ready To Use:                        true
  Restore Size:                        1Gi
Events:                                <none>

Ready To Use is true. Creation time, restore size, and bound volume snapshot content name are all set in the VolumeSnapshot status.
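
For anyone reproducing this, the object above corresponds to a manifest like the following, reconstructed from the describe output (all names come from this thread):

```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: snap20
  namespace: default
spec:
  volumeSnapshotClassName: snapclass
  source:
    persistentVolumeClaimName: pvc-snapshot
```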

@igorgonibm
Author

@xing-yang so why is there an error message in the log? It is related to the current snapshot content.

E0422 14:08:03.234487 1 snapshot_controller_base.go:250] could not sync content "snapcontent-b52df980-4158-4af1-b0ad-bffa7d1a2937": failed to remove VolumeSnapshotBeingCreated annotation from the content snapcontent-b52df980-4158-4af1-b0ad-bffa7d1a2937: "snapshot controller failed to update snapcontent-b52df980-4158-4af1-b0ad-bffa7d1a2937 on API server: Operation cannot be fulfilled on volumesnapshotcontents.snapshot.storage.k8s.io "snapcontent-b52df980-4158-4af1-b0ad-bffa7d1a2937": the object has been modified; please apply your changes to the latest version and try again"

@xing-yang
Collaborator

This is because the VolumeSnapshotBeingCreated annotation was already removed earlier. This syncContent call had an older version of the object, so it failed to remove the annotation again, but it is in fact already removed.

I'll take a look to see how to avoid this error.

@xing-yang
Collaborator

/assign @xing-yang

@igorgonibm
Author

> This is because the VolumeSnapshotBeingCreated annotation was already removed earlier. This syncContent call had an older version of the object, so it failed to remove the annotation again, but it is in fact already removed.
>
> I'll take a look to see how to avoid this error.

Thank you. Will wait for your answer.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 25, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 24, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

> Rotten issues close after 30d of inactivity.
> Reopen the issue with /reopen.
> Mark the issue as fresh with /remove-lifecycle rotten.
>
> Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bhargavkeshav

Hi @xing-yang,

I am facing a similar issue. Was this issue fixed?

E0927 04:04:34.862225       1 snapshot_controller_base.go:405] could not sync content "snapcontent-1d5d9c1a-8824-4087-b2a3-a824d00fb60c": snapshot controller failed to update snapcontent-1d5d9c1a-8824-4087-b2a3-a824d00fb60c on API server: Operation cannot be fulfilled on volumesnapshotcontents.snapshot.storage.k8s.io "snapcontent-1d5d9c1a-8824-4087-b2a3-a824d00fb60c": the object has been modified; please apply your changes to the latest version and try again
I0927 04:04:34.862272       1 snapshot_controller_base.go:272] Failed to sync content "snapcontent-1d5d9c1a-8824-4087-b2a3-a824d00fb60c", will retry again: snapshot controller failed to update snapcontent-1d5d9c1a-8824-4087-b2a3-a824d00fb60c on API server: Operation cannot be fulfilled on volumesnapshotcontents.snapshot.storage.k8s.io "snapcontent-1d5d9c1a-8824-4087-b2a3-a824d00fb60c": the object has been modified; please apply your changes to the latest version and try again
E0927 04:04:35.857893       1 snapshot_controller_base.go:379] could not sync snapshot "mission-control/velero-mission-control-data-mission-control-2-ztc2t": snapshot controller failed to update velero-mission-control-data-mission-control-2-ztc2t on API server: Operation cannot be fulfilled on volumesnapshots.snapshot.storage.k8s.io "velero-mission-control-data-mission-control-2-ztc2t": StorageError: invalid object, Code: 4, Key: /registry/snapshot.storage.k8s.io/volumesnapshots/mission-control/velero-mission-control-data-mission-control-2-ztc2t, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 42725590-e58f-40a0-b3c2-5b84f4dff2a0, UID in object meta: 
I0927 04:04:35.857944       1 snapshot_controller_base.go:197] Failed to sync snapshot "mission-control/velero-mission-control-data-mission-control-2-ztc2t", will retry again: snapshot controller failed to update velero-mission-control-data-mission-control-2-ztc2t on API server: Operation cannot be fulfilled on volumesnapshots.snapshot.storage.k8s.io "velero-mission-control-data-mission-control-2-ztc2t": StorageError: invalid object, Code: 4, Key: /registry/snapshot.storage.k8s.io/volumesnapshots/mission-control/velero-mission-control-data-mission-control-2-ztc2t, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 42725590-e58f-40a0-b3c2-5b84f4dff2a0, UID in object meta: 
E0927 04:04:36.056089       1 snapshot_controller_base.go:379] could not sync snapshot "distribution/velero-redis-data-distribution-1-w76fr": snapshot controller failed to update velero-redis-data-distribution-1-w76fr on API server: Operation cannot be fulfilled on volumesnapshots.snapshot.storage.k8s.io "velero-redis-data-distribution-1-w76fr": StorageError: invalid object, Code: 4, Key: /registry/snapshot.storage.k8s.io/volumesnapshots/distribution/velero-redis-data-distribution-1-w76fr, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: b9ce0d3a-f1c8-448d-a40c-a1ba6a1366c1, UID in object meta: 
I0927 04:04:36.056139       1 snapshot_controller_base.go:197] Failed to sync snapshot "distribution/velero-redis-data-distribution-1-w76fr", will retry again: snapshot controller failed to update velero-redis-data-distribution-1-w76fr on API server: Operation cannot be fulfilled on volumesnapshots.snapshot.storage.k8s.io "velero-redis-data-distribution-1-w76fr": StorageError: invalid object, Code: 4, Key: /registry/snapshot.storage.k8s.io/volumesnapshots/distribution/velero-redis-data-distribution-1-w76fr, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: b9ce0d3a-f1c8-448d-a40c-a1ba6a1366c1, UID in object meta: 
E0927 04:04:36.455871       1 snapshot_controller_base.go:379] could not sync snapshot "artifactory-ha/velero-volume-artifactory-ha-artifactory-ha-primary-1-2mlsh": snapshot controller failed to update velero-volume-artifactory-ha-artifactory-ha-primary-1-2mlsh on API server: Operation cannot be fulfilled on volumesnapshots.snapshot.storage.k8s.io "velero-volume-artifactory-ha-artifactory-ha-primary-1-2mlsh": StorageError: invalid object, Code: 4, Key: /registry/snapshot.storage.k8s.io/volumesnapshots/artifactory-ha/velero-volume-artifactory-ha-artifactory-ha-primary-1-2mlsh, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 1d5d9c1a-8824-4087-b2a3-a824d00fb60c, UID in object meta: 
I0927 04:04:36.455925       1 snapshot_controller_base.go:197] Failed to sync snapshot "artifactory-ha/velero-volume-artifactory-ha-artifactory-ha-primary-1-2mlsh", will retry again: snapshot controller failed to update velero-volume-artifactory-ha-artifactory-ha-primary-1-2mlsh on API server: Operation cannot be fulfilled on volumesnapshots.snapshot.storage.k8s.io "velero-volume-artifactory-ha-artifactory-ha-primary-1-2mlsh": StorageError: invalid object, Code: 4, Key: /registry/snapshot.storage.k8s.io/volumesnapshots/artifactory-ha/velero-volume-artifactory-ha-artifactory-ha-primary-1-2mlsh, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 1d5d9c1a-8824-4087-b2a3-a824d00fb60c, UID in object meta:

@Ahmad-Faizan

@bhargavkeshav were you able to fix the issue?
I am facing the same issue and have similar logs.

@Ahmad-Faizan

This seems to be a related issue: kubernetes-sigs/controller-runtime#1881

@RamyAllam

I can see the same issue as well on GKE v1.25.8-gke.1000

# kubectl get volumesnapshot/test01-cqlzf -n mysql01-db-8moti --cluster gke_CLUSTER_CONTEXT -o yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  creationTimestamp: "2023-07-18T15:37:44Z"
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
  - snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  generation: 1
  name: test01-cqlzf
  namespace: mysql01-db-8moti
  resourceVersion: "225004131"
  uid: bd15ce02-8aef-4c3c-8d45-0acb92a5800c
spec:
  source:
    persistentVolumeClaimName: data-mysql01-db-8moti-mysql-0
  volumeSnapshotClassName: pd-snapshot
status:
  boundVolumeSnapshotContentName: snapcontent-bd15ce02-8aef-4c3c-8d45-0acb92a5800c
  creationTime: "2023-07-18T15:37:46Z"
  readyToUse: true
  restoreSize: 5Gi

Error

Failed to create snapshot: failed to add VolumeSnapshotBeingCreated annotation on the content snapcontent-bd15ce02-8aef-4c3c-8d45-0acb92a5800c: "snapshot controller failed to update snapcontent-bd15ce02-8aef-4c3c-8d45-0acb92a5800c on API server: Operation cannot be fulfilled on volumesnapshotcontents.snapshot.storage.k8s.io \"snapcontent-bd15ce02-8aef-4c3c-8d45-0acb92a5800c\": the object has been modified; please apply your changes to the latest version and try again"

@luckymrwang

Hi @xing-yang

I am facing a similar issue. Was this issue fixed?

@xing-yang
Collaborator

There's a WIP PR that may fix this problem: #876

@luckymrwang

> There's a WIP PR that may fix this problem: #876

ok
