When using Kubernetes and nydus, container fails to start with the error “target snapshot already exists” #1527

Open
cl0udee opened this issue Dec 21, 2023 · 7 comments

Comments

@cl0udee

cl0udee commented Dec 21, 2023

Additional Information

The following information is very important in order to help us to help you. Omission of the following details may delay your support request or receive no attention at all.

Version of nydus being used (nydusd --version)

Version: 	v2.1.1
Git Commit: 	2fd7070bf7c08ba8667a375ecf5ab4ca3963a184
Build Time: 	2022-11-06T11:14:20.450697142Z
Profile: 	release
Rustc: 		rustc 1.61.0 (fe5b13d68 2022-05-18)

Version of nydus-snapshotter being used (containerd-nydus-grpc --version)

Version:     v0.4.0
Revision:    1e18acbf9d39588d39d0276a423e33ebeeb3462b
Go version:  go1.18.6
Build time:  2022-11-30T11:40:06

Kernel information (uname -r)

4.19.90-2102.2.0.0062.ctl2.x86_64

GNU/Linux Distribution, if applicable (cat /etc/os-release)

command result: cat /etc/os-release

containerd-nydus-grpc command line used, if applicable (ps aux | grep containerd-nydus-grpc)

/usr/bin/containerd-nydus-grpc --config-path /etc/nydus/nydusd-config.json --address /run/containerd/containerd-nydus-grpc.sock --nydusd-path /usr/bin/nydusd --nydusimg-path /usr/bin/nydus-image --log-to-stdout

client command line used, if applicable (such as: nerdctl, docker, kubectl, ctr)

kubectl apply -f test-pod.yaml

Screenshots (if applicable)

Details about issue

When I use nerdctl, for example:

nerdctl --snapshotter nydus run --rm -it centos:v1 bash

the container is created and runs normally.
However, when I switch to Kubernetes, pod creation fails with the following error:

Warning  FailedCreatePodSandBox  <invalid>  kubelet  Failed to create pod sandbox: rpc error: code = AlreadyExists desc = failed to get sandbox image "xxx/pause:3.6": failed to pull image "xxx/pause:3.6": failed to pull and unpack image "xxx/pause:3.6": unable to prepare extraction snapshot: target snapshot "sha256:xxx": already exists
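For anyone triaging a similar event, it can help to pull the snapshot key out of the message before querying containerd. The following is only a minimal sketch: `extract_snapshot_key` is a hypothetical helper, not part of any tool, and the digest is the redacted placeholder from this report.

```shell
# Print the quoted value that follows 'target snapshot' in an error message.
# extract_snapshot_key is a hypothetical helper written for this sketch.
extract_snapshot_key() {
  printf '%s\n' "$1" | sed -n 's/.*target snapshot "\([^"]*\)".*/\1/p'
}

# Example with the (redacted) message from this report:
msg='unable to prepare extraction snapshot: target snapshot "sha256:xxx": already exists'
key="$(extract_snapshot_key "$msg")"
echo "$key"   # prints sha256:xxx

# On an affected node the key could then be looked up (not run here):
#   ctr -n k8s.io snapshots --snapshotter nydus info "$key"
```

The lookup in the last comment assumes the snapshotter is registered under the name `nydus`, as in the nerdctl command above.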

I suspect the issue is related to the pause image, because nerdctl, which does not use a pause image, starts the container successfully. I have also tried:

ctr -n k8s.io content fetch $pause-image-name

but that did not help.

This is confusing, especially because I was able to launch pods with Kubernetes on this setup a while ago. After some time had passed, pods could no longer start.

@imeoer
Collaborator

imeoer commented Dec 21, 2023

Could you try ctr images delete xxx/pause:3.6 --sync?

@imeoer
Collaborator

imeoer commented Dec 21, 2023

> Could you try ctr images delete xxx/pause:3.6 --sync?

Need to ensure the pod using the image has been deleted first.

@cl0udee
Author

cl0udee commented Dec 21, 2023

> Could you try ctr images delete xxx/pause:3.6 --sync?
>
> Need to ensure the pod using the image has been deleted first.

I have already deleted all the pause images, but I still get the same error: target snapshot "sha256:xxx": already exists

@kinderyj

Is this a work in progress (WIP)? I'm experiencing the same issue in kata 3.2.0. @imeoer

@imeoer
Collaborator

imeoer commented Jun 18, 2024

This appears to be an inconsistency in the containerd snapshot metadata. Try the following commands:

ctr -n k8s.io content ls | grep sha256:xxx
ctr -n k8s.io content rm $blob_id

But we still don't have a way to reproduce it.
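The two commands above can be chained so that the blob IDs do not have to be copied by hand. This is a rough sketch only: `matching_blobs` is a hypothetical helper, and it assumes the digest is the first column of the `ctr content ls` output. Like the `grep`, it matches the key anywhere in a row, including the labels column.

```shell
# Print the digest (first column) of every `ctr content ls` row that
# mentions the given key anywhere on the line.
# matching_blobs is a hypothetical helper written for this sketch.
matching_blobs() {
  awk -v key="$1" 'index($0, key) { print $1 }'
}

# Possible use on an affected node (not run here; sha256:xxx is the
# redacted digest from the error message):
#   ctr -n k8s.io content ls | matching_blobs sha256:xxx |
#     while read -r blob_id; do
#       ctr -n k8s.io content rm "$blob_id"
#     done
```

Reviewing the output of `matching_blobs` before piping it into `content rm` is probably wise, since the removal is not reversible.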

@YuxiJin-tobeyjin

Steps to reproduce:

  1. Configure containerd to use nydus as its snapshotter
  2. Start nydus-snapshotter.service
  3. Run some pods to make sure everything is fine
  4. Delete '/var/lib/containerd-nydus'
  5. Run crictl rmi {{pause image}}
  6. Run crictl pull {{pause image}}, which fails with this error

@imeoer any suggestions on how to recover?
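Steps 4 to 6 of the list above could be scripted roughly as follows, for use on a disposable test node only. This is a sketch under assumptions: `run` and `reproduce` are hypothetical helpers, the image reference is the redacted placeholder from the report, and by default the commands are only printed rather than executed.

```shell
# Print each command by default; execute it only when DRY_RUN=0 is set.
# run and reproduce are hypothetical helpers written for this sketch.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reproduce() {
  # $1 is the pause image reference.
  run rm -rf /var/lib/containerd-nydus   # step 4: removes the nydus root directory
  run crictl rmi "$1"                    # step 5
  run crictl pull "$1"                   # step 6: reported to fail with "already exists"
}

# PAUSE_IMAGE is a placeholder; the report redacts the real reference.
reproduce "${PAUSE_IMAGE:-xxx/pause:3.6}"
```

Setting `DRY_RUN=0` runs the destructive `rm -rf` for real, so the default is deliberately the printing mode.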

@imeoer
Collaborator

imeoer commented Oct 15, 2024

@YuxiJin-tobeyjin The nydus root directory deleted in step 4 contains a snapshot metadata file, so deleting it causes a snapshot inconsistency between the containerd metadata and the nydus snapshotter.

The solution appears to be deleting the containerd containers, images, and the entire containerd root directory. Alternatively, we could consider providing a tool to clean up the inconsistent snapshot entries in the containerd metadata.
