kubelet cannot pull images when using ECR containerProxy asset repository #16762

Open
elliotdobson opened this issue Aug 20, 2024 · 12 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@elliotdobson
Contributor

/kind bug

1. What kops version are you running? The command kops version, will display
this information.

Client version: 1.29.2 (git-v1.29.2)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

Server Version: v1.29.7

3. What cloud provider are you using?
AWS

4. What commands did you run? What is the simplest way to reproduce this issue?
We are configuring a local image asset repository; however, we are running into an issue when trying to update the cluster. We have configured all the ECR private repositories as required.

  1. Enable assets.containerProxy in the Cluster spec
  2. Copy the image assets: kops get assets --copy
  3. Update the cluster: kops update cluster
  4. Rolling update the first control-plane node: kops rolling-update
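
For anyone trying to reproduce this, the steps above could be scripted roughly as follows. This is only a sketch: the `--yes` flags, the `run` helper, and the instance group name are assumptions that are not in the original report, and kops needs `KOPS_STATE_STORE` set in the environment. By default it only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch of the reproduction steps. CLUSTER comes from the manifest below;
# INSTANCE_GROUP is a hypothetical name, substitute your own.
CLUSTER="${CLUSTER:-k8s.example.com}"
INSTANCE_GROUP="${INSTANCE_GROUP:-control-plane-ap-southeast-2a}"

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

reproduce() {
  # Step 1 (setting spec.assets.containerProxy) is done beforehand by editing the Cluster spec.
  run kops get assets --name "$CLUSTER" --copy
  run kops update cluster --name "$CLUSTER" --yes
  run kops rolling-update cluster --name "$CLUSTER" --instance-group "$INSTANCE_GROUP" --yes
}
```

Calling `reproduce` with the default `DRY_RUN=1` prints the three kops commands prefixed with `+` so they can be reviewed before anything touches the cluster.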

5. What happened after the commands executed?
The new node fails to join the cluster and cluster validation fails.

Upon SSH'ing into the new node and checking the logs via journalctl -u kubelet.service we see that kubelet is unable to pull images from ECR:

Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: I0820 05:41:26.005263    3267 util.go:30] "No sandbox for pod can be found. Need to start a new one" pod="kube-system/etcd-manager-events-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: I0820 05:41:26.025018    3267 util.go:30] "No sandbox for pod can be found. Need to start a new one" pod="kube-system/etcd-manager-main-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: I0820 05:41:26.053492    3267 util.go:30] "No sandbox for pod can be found. Need to start a new one" pod="kube-system/kube-apiserver-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.077935    3267 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.078072    3267 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/etcd-manager-events-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.078174    3267 kuberuntime_manager.go:1182] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/etcd-manager-events-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.078784    3267 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"etcd-manager-events-i-0b9bd5da0497cb428_kube-system(404a8513399b8f50110f57caceb4dff4)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"etcd-manager-events-i-0b9bd5da0497cb428_kube-system(404a8513399b8f50110f57caceb4dff4)\\\": rpc error: code = Unknown desc = failed to get sandbox image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull and unpack image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to resolve reference \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials\"" pod="kube-system/etcd-manager-events-i-0b9bd5da0497cb428" podUID="404a8513399b8f50110f57caceb4dff4"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.084358    3267 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.084580    3267 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/etcd-manager-main-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.084647    3267 kuberuntime_manager.go:1182] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/etcd-manager-main-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.084885    3267 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"etcd-manager-main-i-0b9bd5da0497cb428_kube-system(e6690dd45219e272fbb9992ff07f37b9)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"etcd-manager-main-i-0b9bd5da0497cb428_kube-system(e6690dd45219e272fbb9992ff07f37b9)\\\": rpc error: code = Unknown desc = failed to get sandbox image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull and unpack image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to resolve reference \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials\"" pod="kube-system/etcd-manager-main-i-0b9bd5da0497cb428" podUID="e6690dd45219e272fbb9992ff07f37b9"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.086957    3267 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.087058    3267 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/kube-apiserver-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.087396    3267 kuberuntime_manager.go:1182] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/kube-apiserver-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.087529    3267 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-apiserver-i-0b9bd5da0497cb428_kube-system(bce2dccacc4b8ea6a460dcd7760b01ed)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-apiserver-i-0b9bd5da0497cb428_kube-system(bce2dccacc4b8ea6a460dcd7760b01ed)\\\": rpc error: code = Unknown desc = failed to get sandbox image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull and unpack image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to resolve reference \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials\"" pod="kube-system/kube-apiserver-i-0b9bd5da0497cb428" podUID="bce2dccacc4b8ea6a460dcd7760b01ed"
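
Since the log above is mostly the same error repeated, a small filter can help summarise which pods are affected. A minimal sketch, assuming the kubelet log format shown above (the function name `failing_pods` is made up for illustration):

```shell
#!/bin/sh
# Read kubelet log lines on stdin and print the distinct pod="..." fields of
# lines that failed with the ECR auth error seen above.
failing_pods() {
  grep 'no basic auth credentials' | grep -o 'pod="[^"]*"' | sort -u
}
```

On the node this would be used as `journalctl -u kubelet.service | failing_pods`. Lines without a `pod=` field (such as the `remote_runtime.go` ones) are simply skipped.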

6. What did you expect to happen?
kubelet is able to pull images successfully from ECR.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: k8s.example.com
spec:
  api:
    loadBalancer:
      class: Classic
      idleTimeoutSeconds: 3600
      type: Internal
  assets:
    containerProxy: 123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example
    fileRepository: https://example-k8s-assets.s3.ap-southeast-2.amazonaws.com/kops
  authentication: {}
  authorization:
    rbac: {}
  certManager:
    enabled: true
  channel: stable
  cloudProvider: aws
...
  containerRuntime: containerd
...
  iam:
    allowContainerRegistry: true
    legacy: false
    useServiceAccountExternalPermissions: true
  kubeAPIServer:
    authorizationMode: Node,RBAC
...
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
    evictionHard: memory.available<10%,nodefs.available<10%,nodefs.inodesFree<5%,imagefs.available<10%,imagefs.inodesFree<5%
  kubernetesVersion: 1.29.7
  networking:
    calico:
      crossSubnet: true
      mtu: 8912
  nodeTerminationHandler:
    enabled: false
  nonMasqueradeCIDR: 100.64.0.0/10
  podIdentityWebhook:
    enabled: true
  serviceAccountIssuerDiscovery:
    discoveryStore: s3://example-k8s-oidc
    enableAWSOIDCProvider: true
...

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else do we need to know?
The kubelet log shows that the image credential provider flags are being passed:

Aug 20 05:41:25 i-0b9bd5da0497cb428 kubelet[3267]: I0820 05:41:25.120555    3267 flags.go:64] FLAG: --image-credential-provider-bin-dir="/usr/local/bin"
Aug 20 05:41:25 i-0b9bd5da0497cb428 kubelet[3267]: I0820 05:41:25.120563    3267 flags.go:64] FLAG: --image-credential-provider-config="/var/lib/kubelet/credential-provider.conf"
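
To pull just these flags out of the kubelet startup log, something like the following can be used. It is a sketch that assumes the `FLAG: --name="value"` format shown above and a grep that supports `-o`; the function name is illustrative only.

```shell
#!/bin/sh
# Read kubelet log lines on stdin and print only the
# --image-credential-provider-* flags with their values.
credential_provider_flags() {
  grep -o -e '--image-credential-provider-[a-z-]*="[^"]*"'
}
```

Usage on the node: `journalctl -u kubelet.service | credential_provider_flags`.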

The ecr-credential-provider binary exists at the location passed to kubelet:

ubuntu@i-0b9bd5da0497cb428:~$ ls -l /usr/local/bin/
total 252944
-rwxr-xr-x 1 root root  55331649 Aug 20 05:39 crictl
-rwxr-xr-x 1 root root  15863808 Aug 20 05:39 ecr-credential-provider
-rwxr-xr-x 1 root root  50225304 Aug 20 05:39 kubectl
-rwxr-xr-x 1 root root 112570628 Aug 20 05:39 kubelet
-rwxr-xr-x 1 root root  25010176 Aug 20 05:39 nerdctl

The credential provider config exists at the location passed to kubelet (and looks valid):

ubuntu@i-0b9bd5da0497cb428:~$ cat /var/lib/kubelet/credential-provider.conf
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
  - name: ecr-credential-provider
    matchImages:
      - "*.dkr.ecr.*.amazonaws.com"
      - "*.dkr.ecr.*.amazonaws.com.cn"
      - "*.dkr.ecr-fips.*.amazonaws.com"
      - "*.dkr.ecr.us-iso-east-1.c2s.ic.gov"
      - "*.dkr.ecr.us-isob-east-1.sc2s.sgov.gov"
    defaultCacheDuration: "12h"
    apiVersion: credentialprovider.kubelet.k8s.io/v1
    args:
      - get-credentials
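
As a quick sanity check that the containerProxy registry host is covered by the matchImages patterns above, the globs can be approximated in shell. Note this is only an approximation: shell globs let `*` span dots, whereas kubelet matches `*` against a single domain segment, so this check is looser than the real matcher. The function name is hypothetical.

```shell
#!/bin/sh
# Approximate check: does the registry host of an image match any of the
# matchImages globs from credential-provider.conf?
matches_provider() {
  image_host=${1%%/*}   # drop the repository path, keep the registry host
  for pattern in \
    '*.dkr.ecr.*.amazonaws.com' \
    '*.dkr.ecr.*.amazonaws.com.cn' \
    '*.dkr.ecr-fips.*.amazonaws.com' \
    '*.dkr.ecr.us-iso-east-1.c2s.ic.gov' \
    '*.dkr.ecr.us-isob-east-1.sc2s.sgov.gov'
  do
    case "$image_host" in
      $pattern) return 0 ;;   # unquoted on purpose so it is treated as a glob
    esac
  done
  return 1
}
```

For example, `matches_provider 123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9` succeeds, which is consistent with the config looking valid for the failing image.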

This seems similar to #13494; however, there was no clear resolution in that issue (and we are not using the AWS China partition).

@k8s-ci-robot added the kind/bug label Aug 20, 2024
@elliotdobson
Contributor Author

Another similar issue: #13377. There is also a Slack thread in #kops-users.

@rifelpet
Member

@elliotdobson When SSH'ed into a problematic node, can you run this command, substituting the image for one that you expect to work? I'm curious whether the response contains valid credentials or not.

echo '{"apiVersion":"credentialprovider.kubelet.k8s.io/v1","kind":"CredentialProviderRequest","image":"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9"}' | ecr-credential-provider get-credentials
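
To make it easier to test several images, the request JSON from the command above can be generated by a small helper. This is just a convenience sketch (the function name is made up); the JSON fields are the same ones used in the command above.

```shell
#!/bin/sh
# Build a CredentialProviderRequest for the given image reference.
# Assumes the image string contains no characters that need JSON escaping.
credential_request() {
  printf '{"apiVersion":"credentialprovider.kubelet.k8s.io/v1","kind":"CredentialProviderRequest","image":"%s"}\n' "$1"
}
```

On a node it would be piped into the provider the same way, for example `credential_request "$IMAGE" | ecr-credential-provider get-credentials`, which requires the binary and the node's IAM permissions.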

@elliotdobson
Contributor Author

elliotdobson commented Aug 25, 2024

@rifelpet it works fine. I can pass those credentials into crictl pull and strangely it says the image is up to date...

$ echo '{"apiVersion":"credentialprovider.kubelet.k8s.io/v1","kind":"CredentialProviderRequest","image":"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9"}' | ecr-credential-provider get-credentials | jq '.auth | to_entries[].value | "\(.username):\(.password)"' | sudo xargs -I {} crictl pull --creds {} 123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9
Image is up to date for sha256:e6f1816883972d4be47bd48879a08919b96afcd344132622e4d444987919323c

and the node has all the required images...

$ sudo crictl images
IMAGE                                                                                                           TAG                                            IMAGE ID            SIZE
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/calico/cni                                        <none>                                         6527a35581401       88.4MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/calico/node                                       <none>                                         5c6ffd2b2a1d0       116MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/dns/k8s-dns-node-cache                            <none>                                         c65d25696473d       34.8MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/ebs-csi-driver/aws-ebs-csi-driver                 <none>                                         d0b811ee8b120       29.2MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/eks-distro/kubernetes-csi/livenessprobe           <none>                                         0f33636f3e138       8.06MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/eks-distro/kubernetes-csi/node-driver-registrar   <none>                                         1e017ee0e9e78       6.78MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/etcd                                              <none>                                         0369cf4303ffd       86.7MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/etcd                                              <none>                                         1b2ba9f3d2043       57.1MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/etcdadm/etcd-manager-slim                         <none>                                         98a2527bc1dbc       49.3MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/kops/kops-controller                              <none>                                         5da88c7961ee1       50MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/kops/kops-utils-cp                                <none>                                         3a09361fb8252       2.29MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/kops/kube-apiserver-healthcheck                   <none>                                         9cd5ecc0d5313       5.5MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/kube-apiserver                                    <none>                                         a2e0d7fa8464a       35.2MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/kube-controller-manager                           <none>                                         32fe966e5c2b2       33.8MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/kube-proxy                                        <none>                                         cc8c46cf9d741       28.6MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/kube-scheduler                                    <none>                                         9cffb486021b3       18.9MB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause                                             3.9                                            e6f1816883972       322kB
123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/provider-aws/cloud-controller-manager             <none>                                         2a3fef77df4ae       21.2MB

The node has also successfully joined the cluster and kOps is validating OK... 🤔 The same image pull error still appears in the kubelet log (for the first 10 minutes of the node's life), but kubelet eventually gets past it, pulls the images, and starts the containers successfully. So I guess that is a red herring.

I'll roll the rest of the cluster and report back if it's working.

@elliotdobson
Contributor Author

I rolled the second control-plane node in the cluster and it failed to join the cluster within the default kOps validation timeout (15 mins).

When I SSH into the second control-plane node, it has no container images present and the kubelet logs are filled with image pull error messages (same as the original post). However, if I pull the pause image manually, everything continues as expected and the node joins the cluster. Interestingly, when I pull the pause image manually it says the image is already up to date (even though it was not present when I first logged into the node):

$ echo '{"apiVersion":"credentialprovider.kubelet.k8s.io/v1","kind":"CredentialProviderRequest","image":"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9"}' | ecr-credential-provider get-credentials | jq '.auth | to_entries[].value | "\(.username):\(.password)"' | sudo xargs -I {} crictl pull --creds {} 123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9
Image is up to date for sha256:e6f1816883972d4be47bd48879a08919b96afcd344132622e4d444987919323c

So there is definitely some issue around how the AWS credential provider, containerProxy, and the PodInfraContainerImage (pause image) interact during node bootstrap.

@elliotdobson
Contributor Author

Out of curiosity I tried rolling a worker node and it successfully joined the cluster but it cannot pull the pause image. So all containers on the node are stuck in pending state and eventually kOps validation times out.

@rifelpet
Member

> Out of curiosity I tried rolling a worker node and it successfully joined the cluster but it cannot pull the pause image. So all containers on the node are stuck in pending state and eventually kOps validation times out.

What is the error message when pulling the pause image?

@elliotdobson
Contributor Author

> What is the error message when pulling the pause image?

The same as reported in the original post:

Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: I0820 05:41:26.005263    3267 util.go:30] "No sandbox for pod can be found. Need to start a new one" pod="kube-system/etcd-manager-events-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: I0820 05:41:26.025018    3267 util.go:30] "No sandbox for pod can be found. Need to start a new one" pod="kube-system/etcd-manager-main-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: I0820 05:41:26.053492    3267 util.go:30] "No sandbox for pod can be found. Need to start a new one" pod="kube-system/kube-apiserver-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.077935    3267 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.078072    3267 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/etcd-manager-events-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.078174    3267 kuberuntime_manager.go:1182] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/etcd-manager-events-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.078784    3267 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"etcd-manager-events-i-0b9bd5da0497cb428_kube-system(404a8513399b8f50110f57caceb4dff4)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"etcd-manager-events-i-0b9bd5da0497cb428_kube-system(404a8513399b8f50110f57caceb4dff4)\\\": rpc error: code = Unknown desc = failed to get sandbox image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull and unpack image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to resolve reference \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials\"" pod="kube-system/etcd-manager-events-i-0b9bd5da0497cb428" podUID="404a8513399b8f50110f57caceb4dff4"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.084358    3267 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.084580    3267 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/etcd-manager-main-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.084647    3267 kuberuntime_manager.go:1182] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/etcd-manager-main-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.084885    3267 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"etcd-manager-main-i-0b9bd5da0497cb428_kube-system(e6690dd45219e272fbb9992ff07f37b9)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"etcd-manager-main-i-0b9bd5da0497cb428_kube-system(e6690dd45219e272fbb9992ff07f37b9)\\\": rpc error: code = Unknown desc = failed to get sandbox image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull and unpack image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to resolve reference \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials\"" pod="kube-system/etcd-manager-main-i-0b9bd5da0497cb428" podUID="e6690dd45219e272fbb9992ff07f37b9"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.086957    3267 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.087058    3267 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/kube-apiserver-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.087396    3267 kuberuntime_manager.go:1182] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to pull and unpack image \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": failed to resolve reference \"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials" pod="kube-system/kube-apiserver-i-0b9bd5da0497cb428"
Aug 20 05:41:26 i-0b9bd5da0497cb428 kubelet[3267]: E0820 05:41:26.087529    3267 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-apiserver-i-0b9bd5da0497cb428_kube-system(bce2dccacc4b8ea6a460dcd7760b01ed)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-apiserver-i-0b9bd5da0497cb428_kube-system(bce2dccacc4b8ea6a460dcd7760b01ed)\\\": rpc error: code = Unknown desc = failed to get sandbox image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to pull and unpack image \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": failed to resolve reference \\\"123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\\\": pull access denied, repository does not exist or may require authorization: authorization failed: no basic auth credentials\"" pod="kube-system/kube-apiserver-i-0b9bd5da0497cb428" podUID="bce2dccacc4b8ea6a460dcd7760b01ed"

Similar access-denied errors appear in the containerd logs too (presumably the kubelet log is just a superset of the containerd logs).

@rifelpet
Member

What about echo ... | ecr-credential-provider on that node? Are there any additional logs that could be relevant?

@elliotdobson
Contributor Author

What about echo ... | ecr-credential-provider on that node?

Works fine, I get valid credentials back that I can then use to pull images from private ECR.

any additional logs that could be relevant?

Not that I know of. Do you have any that you suggest?

In the Slack thread, I found that @olemarkus had a few comments that seem to point to the issue:

(image: screenshot of the Slack comments)

Specifically, when containerd tries to pull the pause image, does it not receive ECR credentials from kubelet because the pull was not initiated for a Pod? Could that be the underlying issue?

@olemarkus
Member

The issue is indeed the same.
When kubelet initiates a container image pull, it uses the ECR plugin and passes the resulting credentials on to containerd.
containerd has no knowledge of either the ECR plugin or even kubelet. When it pulls an image on its own (as it does for the sandbox image), it uses whatever is configured in the containerd config, as is.

@rifelpet
Member

The easiest workaround is to call the credential helper directly and pass the credentials to crictl and pull the image, as suggested here: containerd/containerd#6637 (comment)

Bottlerocket does the same: bottlerocket-os/bottlerocket#382

You might be able to do this with additionalUserData, but you need to ensure it runs after containerd is running, so it may require adding a systemd service that depends on containerd.
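
As a rough sketch of the workaround described above (call the credential helper directly, then hand the credentials to crictl), something like the following might work. It is untested against a real node and makes several assumptions: the plugin path in PROVIDER is a guess and should be adjusted, the plugin is assumed to accept a credentialprovider.kubelet.k8s.io/v1 request on stdin with no extra arguments, crictl is assumed to already be configured for the containerd socket, and the function names are made up for illustration. It parses the response with sed to avoid depending on jq.

```shell
#!/usr/bin/env bash
# Sketch: pre-pull the sandbox (pause) image using credentials from the
# ECR credential provider. Paths and function names are assumptions.
set -euo pipefail

# Assumed location of the plugin binary on the node; override via the environment.
PROVIDER="${PROVIDER:-/opt/kops/bin/ecr-credential-provider}"

# Build the CredentialProviderRequest that kubelet would normally send on stdin.
build_request() {
  printf '{"apiVersion":"credentialprovider.kubelet.k8s.io/v1","kind":"CredentialProviderRequest","image":"%s"}' "$1"
}

# Read a CredentialProviderResponse on stdin and print "username:password".
# Assumes neither value contains a double quote.
extract_creds() {
  local response user pass
  response="$(cat)"
  user="$(printf '%s' "$response" | sed -n 's/.*"username"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"
  pass="$(printf '%s' "$response" | sed -n 's/.*"password"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"
  [ -n "$user" ] && [ -n "$pass" ] || return 1
  printf '%s:%s' "$user" "$pass"
}

# Fetch credentials for the image and pull it through the CRI.
main() {
  local image="$1" creds
  creds="$(build_request "$image" | "$PROVIDER" | extract_creds)"
  crictl pull --creds "$creds" "$image"
}

# Only act when an image reference is passed as the first argument.
if [ "$#" -ge 1 ]; then
  main "$@"
fi
```

Saved under a hypothetical name such as pull-sandbox-image.sh, it would be invoked with the exact sandbox image reference from the kubelet log, for example 123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097. Note that passing the credentials via --creds exposes them briefly in the process list.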

A long term solution would be for kops' nodeup to pull the podInfraContainerImage explicitly whenever containerProxy is set.

@elliotdobson
Contributor Author

OK, so the root issue is that containerd pulls the sandbox image anonymously, so the ultimate fix would need to come from containerd, enabling authenticated sandbox image pulls. (I have commented on the containerd issue that you linked.)


In the meantime though...

I like your idea about using additionalUserData; however, I think a more scalable option is to pull the sandbox image manually via kOps hooks, which can depend on systemd services etc. This option does require us to remember to update the sandbox image version in the hook whenever it changes in kOps.
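
For illustration, such a hook might look roughly like the fragment below in the Cluster spec. This is a sketch and not something I have run: the hook name and the script path /usr/local/bin/pull-sandbox-image.sh are hypothetical (the script would fetch ECR credentials and run crictl pull, and would have to be placed on the node separately), and the field names reflect my understanding of the kOps hooks spec.

```yaml
spec:
  hooks:
  # Hypothetical one-shot unit that pre-pulls the sandbox image.
  - name: pull-sandbox-image.service
    # Start only alongside containerd, and order before kubelet.
    requires:
    - containerd.service
    before:
    - kubelet.service
    manifest: |
      Type=oneshot
      # Hypothetical script path; the image reference is the one from the kubelet log.
      ExecStart=/usr/local/bin/pull-sandbox-image.sh 123456789101.dkr.ecr.ap-southeast-2.amazonaws.com/k8s-example/pause:3.9@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097
```

Since requires on its own may not guarantee that containerd is already accepting requests when the unit starts, the script may need to retry the pull a few times.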

A long-term workaround (as you say) would be for kOps nodeup to pull the sandbox image (podInfraContainerImage) explicitly whenever containerProxy or containerRegistry is set.


Perhaps another alternative to kOps Container Image Asset Repository would be containerd Registry Mirror as per #16593. What do you think?
