
K3S can't pull multi-arch images due to "failed to unpack image on snapshotter overlayfs" #1278

Closed
kamilgregorczyk opened this issue Jan 7, 2020 · 22 comments
Labels
kind/question (no code change, just asking/answering a question) · status/stale
Milestone

Comments

@kamilgregorczyk

For some reason, K3s seems to be failing to pull the image that I built with docker buildx build ...: https://hub.docker.com/repository/docker/uniqe15/event-sourced-bank/tags?page=1

I'm using k3s (k3s version v1.0.0 (18bd921)) with containerd.
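
For reference, a multi-arch build and push with buildx looks roughly like this (a sketch; the builder name and exact platform list here are assumptions, not the command I ran):

# create and select a builder that can produce multi-platform manifest lists
docker buildx create --name multiarch --use
# build for several platforms and push the resulting manifest list in one step
docker buildx build --platform linux/amd64,linux/arm64,linux/arm/v7 -t uniqe15/event-sourced-bank:latest --push .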

➜  event-sourced-bank3 git:(master) ✗ kubectl --insecure-skip-tls-verify describe pod event-sourced-bank-77f5c8cc65-f66lj

Name:         event-sourced-bank-77f5c8cc65-f66lj
Namespace:    default
Priority:     0
Node:         worker1/192.168.0.202
Start Time:   Tue, 07 Jan 2020 10:55:45 +0000
Labels:       app=event-sourced-bank
              pod-template-hash=77f5c8cc65
Annotations:  <none>
Status:       Pending
IP:           10.42.1.76
IPs:
  IP:           10.42.1.76
Controlled By:  ReplicaSet/event-sourced-bank-77f5c8cc65
Containers:
  event-sourced-bank:
    Container ID:   
    Image:          uniqe15/event-sourced-bank:latest
    Image ID:       
    Port:           8000/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rb494 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-rb494:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-rb494
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  <unknown>             default-scheduler  Successfully assigned default/event-sourced-bank-77f5c8cc65-f66lj to worker1
  Normal   Pulling    3m40s (x4 over 5m6s)  kubelet, worker1   Pulling image "uniqe15/event-sourced-bank:latest"
  Warning  Failed     3m39s (x4 over 5m5s)  kubelet, worker1   Failed to pull image "uniqe15/event-sourced-bank:latest": rpc error: code = NotFound desc = failed to pull and unpack image "docker.io/uniqe15/event-sourced-bank:latest": failed to unpack image on snapshotter overlayfs: no match for platform in manifest sha256:de324984a3ba9bde1a2bc5230ca8754a2d3e055b301a2301bfd9a8115a6822a5: not found
  Warning  Failed     3m39s (x4 over 5m5s)  kubelet, worker1   Error: ErrImagePull
  Normal   BackOff    3m24s (x6 over 5m4s)  kubelet, worker1   Back-off pulling image "uniqe15/event-sourced-bank:latest"
  Warning  Failed     3m13s (x7 over 5m4s)  kubelet, worker1   Error: ImagePullBackOff
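
The "no match for platform in manifest" part means containerd fetched the manifest list for the tag but found no entry matching this node's OS/architecture. A way to check which platforms a pushed tag actually contains (a sketch using standard Docker tooling, run from any machine with Docker):

# list the platform entries in the manifest list behind the tag
docker buildx imagetools inspect docker.io/uniqe15/event-sourced-bank:latest
# or, with the classic manifest CLI (may need the experimental CLI enabled)
docker manifest inspect docker.io/uniqe15/event-sourced-bank:latest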
@kamilgregorczyk
Author

It works only when I build for just one arch (e.g. armv8).

@dweomer
Contributor

dweomer commented Jan 14, 2020

@kamilgregorczyk uniqe15/event-sourced-bank doesn't seem to be a multi-arch image:

[Screenshot: Docker Hub tags for uniqe15/event-sourced-bank]


Is the log you provided from an arm64 node? If not, it looks to be working as expected.
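
One quick way to answer that from kubectl (a sketch; worker1 is the node name from the describe output above):

# print the architecture the kubelet on worker1 reports
kubectl get node worker1 -o jsonpath='{.status.nodeInfo.architecture}{"\n"}'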

@kamilgregorczyk
Author

Ah, I changed it to see if arm-only would work; I will rebuild it in 1-2 hours.

@kamilgregorczyk
Author

@dweomer done

@davidnuzik added the [zube]: To Triage and kind/question labels on Jan 14, 2020
@davidnuzik added this to the Backlog milestone on Jan 14, 2020
@dweomer
Contributor

dweomer commented Jan 14, 2020

@kamilgregorczyk these worked for me on a k3OS amd64 VM built from master (k3s v1.0.1):

  • sudo ctr image pull docker.io/uniqe15/event-sourced-bank:latest
  • sudo ctr image pull --platform linux/arm64 docker.io/uniqe15/event-sourced-bank:latest

I also rebuilt against k3s v1.0.0 with the same results.


Have you tried again on both platforms and verified that things are still broken? If your pod is still failing, please run k3s ctr image pull docker.io/uniqe15/event-sourced-bank:latest and share the results.
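
If the node is armv7 rather than arm64, pulling with an explicit platform can also confirm whether that architecture is present in the manifest (a sketch; same --platform flag as above):

# ask containerd for a specific platform; a missing entry fails with "no match for platform"
sudo k3s ctr image pull --platform linux/arm/v7 docker.io/uniqe15/event-sourced-bank:latest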

@kamilgregorczyk
Author

I'm reworking my cluster, as I had a Raspberry Pi image which limited the cluster to armv7 only. I'm now redoing it on Ubuntu Server to get arm64, and will try again once it's up and running.

@hasheddan

@kamilgregorczyk any updates here? I have encountered the same issue using a multi-arch image. Seems this was an underlying issue with containerd that appears to have been fixed here: containerd/containerd#3484

@kamilgregorczyk
Author

I created two separate images for different archs

@dalekurt

@kamilgregorczyk I'm also interested in your solution if you have one. I am running K3s on a Raspberry Pi cluster, and I'm trying to deploy the rook/ceph Docker image with the same results.

Model: Raspberry Pi 3 Model B Plus Rev 1.3
Processor model: ARMv7 Processor rev 4 (v7l)
OS: Raspbian Buster
K3S version: v1.17.0+k3s.1
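
A quick way to see which architecture each node actually reports to Kubernetes, since that is what the image manifest has to match (a sketch):

# list every node with the architecture its kubelet reports (e.g. arm, arm64, amd64)
kubectl get nodes -o custom-columns=NAME:.metadata.name,ARCH:.status.nodeInfo.architecture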

@cloudeyes

cloudeyes commented Mar 21, 2020

I have the same issue. I was working on a multi-arch image cloudeyes/hello-node and got this error.


@David-Igou

David-Igou commented Apr 1, 2020

Experiencing this issue on Raspberry Pi 3 B+'s running k3s. Deployed fine on my 4's.

The image I'm experiencing the failure with is cattle-node-agent

@kamilgregorczyk
Author

I believe the 3B+ has no arm64 support, so you need an armv7 image for the 3B+.
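
Whatever the board itself can do, what containerd matches against is the architecture the running OS reports; a quick check directly on the node (a sketch):

# 32-bit Raspbian reports armv7l, a 64-bit OS reports aarch64
uname -m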

@dalekurt

Experiencing this issue on Raspberry Pi 3 B+'s running k3s. Deployed fine on my 4's.

The image I'm experiencing the failure with is cattle-node-agent

I recently bought Raspberry Pi 4's and now have K3s running on them; however, I'm experiencing the same issue.

Failed to pull image "rook/ceph:v1.0.6": rpc error: code = NotFound desc = failed to pull and unpack image "docker.io/rook/ceph:v1.0.6": failed to unpack image on snapshotter overlayfs: no match for platform in manifest sha256:6c4475a7dc2c123f3f150100824e05d39270b5297ad0594abfe61b492cf5aff7: not found

@onedr0p
Contributor

onedr0p commented Apr 10, 2020

@dalekurt what OS are you using on the Pi? Raspbian will not work. That rook/ceph image is amd64/arm64 only.

Edit: You can read my short guide on installing Ubuntu 19.04 on RPis here, or use any other arm64 OS like k3os.
https://github.com/onedr0p/k3s-gitops-arm/blob/master/docs/ubuntu.md

However, you are going to have a bad time getting rook/ceph working on arm; it currently will not work.

@brandond
Member

That means there's no docker.io/rook/ceph:v1.0.6 image for your architecture.

@dalekurt

@onedr0p You hit the nail on the head with that, I reinstalled the OS (Ubuntu 64-bit) and I’m in a much better position now. I eventually figured it out while troubleshooting. Thank you

@anirtek

anirtek commented Jul 8, 2020

I am also having the same issue while applying a YAML deployment that works absolutely fine on a kubeadm cluster. My Docker image is hosted at https://quay.io/repository/cloudian/hap-spark-tf.

But I am getting this error:
Failed to pull image "quay.io/cloudian/hap-spark-tf:0.0.1": rpc error: code = Unknown desc = failed to pull and unpack image "quay.io/cloudian/hap-spark-tf:0.0.1": failed to copy: httpReaderSeeker: failed open: failed to do request: Get https://quay.io/v2/cloudian/hap-spark-tf/blobs/sha256:6001bcf3ad9585c9fd3abe302e0f4974f6c9b0f95ae8b34ca234a582452361c0: dial tcp: lookup quay.io: Try again

I am using CentOS 7.6 and a 3-node k3s cluster.

@brandond
Member

brandond commented Jul 8, 2020

@anirtek you appear to have a different issue - your error suggests some sort of DNS problem.

lookup quay.io: Try again
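
Since image pulls are performed by containerd on the host, the place to test resolution is the node itself rather than a pod (a sketch; the resolvers in play are whatever the node's /etc/resolv.conf contains):

# check which resolvers the node (and therefore containerd) will use
cat /etc/resolv.conf
# try to resolve the registry hostname from the node
nslookup quay.io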

@stale

stale bot commented Jul 30, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

The stale bot added the status/stale label on Jul 30, 2021
The stale bot closed this as completed on Aug 14, 2021
@vitobotta

I am also having the same issue while applying a YAML deployment that works absolutely fine on a kubeadm cluster. My Docker image is hosted at https://quay.io/repository/cloudian/hap-spark-tf.

But I am getting this error:
Failed to pull image "quay.io/cloudian/hap-spark-tf:0.0.1": rpc error: code = Unknown desc = failed to pull and unpack image "quay.io/cloudian/hap-spark-tf:0.0.1": failed to copy: httpReaderSeeker: failed open: failed to do request: Get https://quay.io/v2/cloudian/hap-spark-tf/blobs/sha256:6001bcf3ad9585c9fd3abe302e0f4974f6c9b0f95ae8b34ca234a582452361c0: dial tcp: lookup quay.io: Try again

I am using CentOS 7.6 and a 3-node k3s cluster.

Hi, I am having this issue now and Google took me here. How did you solve it? (Sorry to the others for asking here, but I'm kind of desperate.)

@Oats87
Member

Oats87 commented Aug 15, 2021

@vitobotta if your actual issue is the one reported, you are having what seems to be DNS issues.

Feel free to open a new issue with your exact error if your DNS setup is not your issue.

@vitobotta

@vitobotta if your actual issue is the one reported, you are having what seems to be DNS issues.

Feel free to open a new issue with your exact error if your DNS setup is not your issue.

Hi! It was indeed DNS! I was checking the DNS resolution inside the cluster, but k3s/containerd uses the system's resolvers when pulling images, right? It turns out Hetzner's resolvers are not very reliable; I replaced them with Cloudflare's and all is good again. Thanks, and sorry for adding this question here!
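
For anyone landing here with the same problem, a minimal sketch of that swap, assuming the node uses systemd-resolved (the drop-in path and the choice of Cloudflare's 1.1.1.1/1.0.0.1 are assumptions, not something confirmed in this thread):

# assumed: the node runs systemd-resolved; add a drop-in pointing it at Cloudflare's resolvers
sudo mkdir -p /etc/systemd/resolved.conf.d
printf '[Resolve]\nDNS=1.1.1.1 1.0.0.1\n' | sudo tee /etc/systemd/resolved.conf.d/dns.conf
sudo systemctl restart systemd-resolved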
