On transient failure in velero csi plugin, the volumesnapshot is getting deleted without updating the object store #8116

Open
@soumyapattnaik

Description

What steps did you take and what happened:
In the finalizing phase today, the Velero CSI plugin does a get on the volumesnapshot. If that get fails because of a transient error such as a TLS handshake timeout, the plugin deletes the volumesnapshot and the volumesnapshotcontent.

https://github.com/vmware-tanzu/velero-plugin-for-csi/blob/e8f7af4b65f0ed6c69d340aefe2257dc25cd013f/internal/backup/volumesnapshot_action.go#L104

After the delete, the backup controller re-uploads the backup tarball.

func buildFinalTarball(tr *tar.Reader, tw *tar.Writer, updateFiles map[string]FileForArchive) error {

But it does not update the CSI-related artifacts in the object store.

As a result, there is a mismatch between what is recorded in the object store and what was actually backed up.

This has led to another issue in Velero: #7979

What did you expect to happen:

The expectation is that if the snapshot is cleaned up, the corresponding entry is also removed from the object store. Also, for transient errors, Velero should have a retry mechanism that retries at least the get operation instead of failing the operation up front.

The following information will help us better understand what's going on:

If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle, and attach to this issue, more options please refer to velero debug --help

If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)

  • kubectl logs deployment/velero -n velero
  • velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
  • velero backup logs <backupname>
  • velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
  • velero restore logs <restorename>

Anything else you would like to add:

Environment:

  • Velero version (use velero version):
  • Velero features (use velero client config get features):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
