
usr/local/bin/nerdctl: not found when running kubespray with vagrant #10268

Open
romch007 opened this issue Jun 30, 2023 · 24 comments
Labels
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/bug: Categorizes issue or PR as related to a bug.
triage/not-reproducible: Indicates an issue can not be reproduced as described.

Comments

@romch007

I am trying to install kubespray using the provided Vagrantfile. The only changes I made were:

 $num_instances ||= 3
 $instance_name_prefix ||= "k8s"
 $vm_gui ||= false
-$vm_memory ||= 2048
-$vm_cpus ||= 2
+$vm_memory ||= 4096
+$vm_cpus ||= 3
 $shared_folders ||= {}
 $forwarded_ports ||= {}
-$subnet ||= "172.18.8"
+$subnet ||= "192.168.56"
 $subnet_ipv6 ||= "fd3c:b398:0698:0756"
 $os ||= "ubuntu2004"
 $network_plugin ||= "flannel"
@@ -254,6 +254,7 @@ Vagrant.configure("2") do |config|
       # And limit the action to gathering facts, the full playbook is going to be ran by testcases_run.sh
       if i == $num_instances
         node.vm.provision "ansible" do |ansible|
+          ansible.compatibility_mode = "2.0"
           ansible.playbook = $playbook
           ansible.verbose = $ansible_verbosity
           $ansible_inventory_path = File.join( $inventory, "hosts.ini")

All the other files of the repo are unchanged.

Environment:

  • Cloud provider or hardware configuration: VirtualBox 7.0.8

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):

Linux 6.3.9-arch1-1 x86_64
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"
PRIVACY_POLICY_URL="https://terms.archlinux.org/docs/privacy-policy/"
LOGO=archlinux-logo
  • Version of Ansible (ansible --version):
ansible [core 2.15.1]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/romain/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.11/site-packages/ansible
  ansible collection location = /home/romain/.ansible/collections:/usr/share/ansible/collections
  executable location = /bin/ansible
  python version = 3.11.3 (main, Jun  5 2023, 09:32:32) [GCC 13.1.1 20230429] (/usr/bin/python)
  jinja version = 3.1.2
  libyaml = True
  • Version of Python (python --version): Python 3.11.3

Kubespray version (commit) (git rev-parse --short HEAD): b42757d33

Network plugin used: flannel

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

Command used to invoke ansible: vagrant up

Output of ansible run:

On every node:

TASK [download : download_container | Load image into the local container registry]
fatal: [k8s-1]: FAILED! => {"changed": true, "cmd": "/usr/local/bin/nerdctl -n k8s.io image load < /tmp/releases/images/docker.io_flannel_flannel_v0.22.0.tar", "delta": "0:00:00.004128", "end": "2023-06-30 16:34:15.315823", "failed_when_result": true, "msg": "non-zero return code", "rc": 127, "start": "2023-06-30 16:34:15.311695", "stderr": "/bin/sh: 1: /usr/local/bin/nerdctl: not found", "stderr_lines": ["/bin/sh: 1: /usr/local/bin/nerdctl: not found"], "stdout": "", "stdout_lines": []}
fatal: [k8s-2]: same
fatal: [k8s-3]: same
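
A quick way to confirm what later comments describe (nerdctl downloaded to /tmp/releases but never copied) is to compare both paths on a node; a sketch, assuming the Vagrant node names above:

# Compare the downloaded binary with the expected install location
vagrant ssh k8s-1 -c "ls -l /tmp/releases/nerdctl /usr/local/bin/nerdctl"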

Anything else do we need to know:

@romch007 romch007 added the kind/bug Categorizes issue or PR as related to a bug. label Jun 30, 2023
@romch007 romch007 changed the title `usr/local/bin/nerdctl: not found when starting kubespray with vagrant usr/local/bin/nerdctl: not found when starting kubespray with vagrant Jun 30, 2023
@romch007 romch007 changed the title usr/local/bin/nerdctl: not found when starting kubespray with vagrant usr/local/bin/nerdctl: not found when running kubespray with vagrant Jun 30, 2023
@wolskies

wolskies commented Jul 2, 2023

I'm seeing the same behavior with Debian 12 VMs hosted by Proxmox. Kubespray downloads nerdctl (among others) correctly to /tmp/releases but then doesn't copy it to /usr/local/bin; it skips right to trying to pull the flannel image with nerdctl and gets a [Errno 2] No such file or directory: b'/usr/local/bin/nerdctl'

The full traceback is:
  File "/tmp/ansible_ansible.legacy.command_payload_uw5dw5yf/ansible_ansible.legacy.command_payload.zip/ansible/module_utils/basic.py", line 2030, in run_command
    cmd = subprocess.Popen(args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.11/subprocess.py", line 1901, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
fatal: [node1]: FAILED! => {
    "attempts": 4,
    "changed": false,
    "cmd": "/usr/local/bin/nerdctl -n k8s.io pull --quiet docker.io/flannel/flannel:v0.22.0",
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/local/bin/nerdctl -n k8s.io pull --quiet  docker.io/flannel/flannel:v0.22.0",
            "_uses_shell": false,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true
        }
    },
    "msg": "[Errno 2] No such file or directory: b'/usr/local/bin/nerdctl'",
    "rc": 2,
    "stderr": "",
    "stderr_lines": [],
    "stdout": "",
    "stdout_lines": []

@mickaelmonsieur

mickaelmonsieur commented Jul 3, 2023

Same errors with Debian 11.7, Vagrant 2.3.7, and VirtualBox 7.0.8.

@mickaelmonsieur

Small fix:

cp /tmp/releases/nerdctl /usr/local/bin/nerdctl && cp /tmp/releases/crictl /usr/local/bin/crictl

and relaunch ansible.

@yankay
Member

yankay commented Jul 4, 2023

Thanks @romch007 @mickaelmonsieur

If you'd like to, feel free to provide a PR. :-)

Thank you very much.

@wolskies

wolskies commented Jul 4, 2023

I did that, and it gets past the immediate problem of nerdctl not being in /usr/local/bin, but it fails later trying to create the kubeadm token (on all nodes). I think it's related (it seems like nerdctl, crictl, and runc get downloaded but not configured):

TASK [kubernetes/control-plane : Create kubeadm token for joining nodes with 24h expiration (default)] ******************************************
task path: /Users/ed/Kube/kubespray/roles/kubernetes/control-plane/tasks/kubeadm-setup.yml:207
fatal: [node2 -> node1(192.168.1.73)]: FAILED! => {
    "attempts": 5,
    "changed": false,
    "cmd": ["/usr/local/bin/kubeadm", "--kubeconfig", "/etc/kubernetes/admin.conf", "token", "create"],
    "delta": "0:01:15.109430",
    "end": "2023-07-04 02:54:04.922118",
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/local/bin/kubeadm --kubeconfig /etc/kubernetes/admin.conf token create",
            "_uses_shell": false,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true
        }
    },
    "msg": "non-zero return code",
    "rc": 1,
    "start": "2023-07-04 02:52:49.812688",
    "stderr": "timed out waiting for the condition\nTo see the stack trace of this error execute with --v=5 or higher",
    "stderr_lines": ["timed out waiting for the condition", "To see the stack trace of this error execute with --v=5 or higher"],
    "stdout": "",
    "stdout_lines": []

journalctl shows something wrong with the configuration of runc:

sudo journalctl -xeu kubelet | grep failed
Jul 04 16:24:50 node1 kubelet[127545]: E0704 16:24:50.524302 127545 remote_runtime.go:176] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/363edeefd37098196f7b4bd3baa2253e932f3501bdd97b083d0c8fceba6138e7/log.json: no such file or directory): exec: "runc": executable file not found in $PATH: unknown"
Jul 04 16:24:50 node1 kubelet[127545]: E0704 16:24:50.524363 127545 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/363edeefd37098196f7b4bd3baa2253e932f3501bdd97b083d0c8fceba6138e7/log.json: no such file or directory): exec: "runc": executable file not found in $PATH: unknown" pod="kube-system/kube-apiserver-node1"
Jul 04 16:24:50 node1 kubelet[127545]: E0704 16:24:50.524386 127545 kuberuntime_manager.go:782] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/363edeefd37098196f7b4bd3baa2253e932f3501bdd97b083d0c8fceba6138e7/log.json: no such file or directory): exec: "runc": executable file not found in $PATH: unknown" pod="kube-system/kube-apiserver-node1"
Jul 04 16:24:50 node1 kubelet[127545]: E0704 16:24:50.524432 127545 pod_workers.go:965] "Error syncing pod, skipping" err="failed to "CreatePodSandbox" for "kube-apiserver-node1_kube-system(c4b89dde2a5c1b5d448fe0f03d05baa8)" with CreatePodSandboxError: "Failed to create sandbox for pod \"kube-apiserver-node1_kube-system(c4b89dde2a5c1b5d448fe0f03d05baa8)\": rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/363edeefd37098196f7b4bd3baa2253e932f3501bdd97b083d0c8fceba6138e7/log.json: no such file or directory): exec: \"runc\": executable file not found in $PATH: unknown"" pod="kube-system/kube-apiserver-node1" podUID=c4b89dde2a5c1b5d448fe0f03d05baa8
Jul 04 16:24:50 node1 kubelet[127545]: E0704 16:24:50.619838 127545 controller.go:146] failed to ensure lease exists, will retry in 7s, error: Get "https://192.168.1.73:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/node1?timeout=10s": dial tcp 192.168.1.73:6443: connect: connection refused
Jul 04 16:24:52 node1 kubelet[127545]: W0704 16:24:52.667772 127545 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.RuntimeClass: Get "https://192.168.1.73:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0": dial tcp 192.168.1.73:6443: connect: connection refused
Jul 04 16:24:52 node1 kubelet[127545]: E0704 16:24:52.667836 127545 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get "https://192.168.1.73:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0": dial tcp 192.168.1.73:6443: connect: connection refused
Jul 04 16:24:52 node1 kubelet[127545]: W0704 16:24:52.675521 127545 reflector.go:424] vendor/k8s.io/client-go/informers/factory.go:150: failed to list *v1.CSIDriver: Get "https://192.168.1.73:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0": dial tcp 192.168.1.73:6443: connect: connection refused
Jul 04 16:24:52 node1 kubelet[127545]: E0704 16:24:52.675585 127545 reflector.go:140] vendor/k8s.io/client-go/informers/factory.go:150: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: Get "https://192.168.1.73:6443/apis/storage.k8s.io/v1/csidrivers?limit=500&resourceVersion=0": dial tcp 192.168.1.73:6443: connect: connection refused
Jul 04 16:24:53 node1 kubelet[127545]: E0704 16:24:53.520427 127545 remote_runtime.go:176] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/d4fb1e974177c6372785c1b4a8e242e55516580b9309a1407fc470f106387820/log.json: no such file or directory): exec: "runc": executable file not found in $PATH: unknown"
Jul 04 16:24:53 node1 kubelet[127545]: E0704 16:24:53.520464 127545 kuberuntime_sandbox.go:72] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/d4fb1e974177c6372785c1b4a8e242e55516580b9309a1407fc470f106387820/log.json: no such file or directory): exec: "runc": executable file not found in $PATH: unknown" pod="kube-system/kube-controller-manager-node1"
Jul 04 16:24:53 node1 kubelet[127545]: E0704 16:24:53.520486 127545 kuberuntime_manager.go:782] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/d4fb1e974177c6372785c1b4a8e242e55516580b9309a1407fc470f106387820/log.json: no such file or directory): exec: "runc": executable file not found in $PATH: unknown" pod="kube-system/kube-controller-manager-node1"
Jul 04 16:24:53 node1 kubelet[127545]: E0704 16:24:53.520525 127545 pod_workers.go:965] "Error syncing pod, skipping" err="failed to "CreatePodSandbox" for "kube-controller-manager-node1_kube-system(84983840101f64a28c6328ab55dc5c58)" with CreatePodSandboxError: "Failed to create sandbox for pod \"kube-controller-manager-node1_kube-system(84983840101f64a28c6328ab55dc5c58)\": rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v2.task/k8s.io/d4fb1e974177c6372785c1b4a8e242e55516580b9309a1407fc470f106387820/log.json: no such file or directory): exec: \"runc\": executable file not found in $PATH: unknown"" pod="kube-system/kube-controller-manager-node1" podUID=84983840101f64a28c6328ab55dc5c58
Jul 04 16:24:57 node1 kubelet[127545]: E0704 16:24:57.620471 127545 controller.go:146] failed to ensure lease exists, will retry in 7s, error: Get "https://192.168.1.73:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/node1?timeout=10s": dial tcp 192.168.1.73:6443: connect: connection refused

My guess is that it's related to the missing nerdctl: the playbook seems to skip the configuration steps for nerdctl, crictl, and possibly runc.
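
A possible workaround for the runc part of this failure, following the same idea as the nerdctl copy above (a sketch only; the versioned filename under /tmp/releases is taken from a later comment and may differ on your nodes):

# Run on each affected node; assumes the playbook already downloaded runc to /tmp/releases
sudo cp /tmp/releases/runc-v1.1.7.amd64 /usr/local/bin/runc
sudo chmod 0755 /usr/local/bin/runc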

@blackmesa-peterdohm

blackmesa-peterdohm commented Jul 11, 2023

Just to be clear, this is a total show-stopper for the use case many (I'd argue most) people have for on-premise Kubespray at present. I'm trying to use this to build a cluster with Calico on Ubuntu; quite vanilla, really. How are no regression tests covering this? I've spent hours trying to figure out how those steps are being "skipped", and from what I can tell they're not skipped; the configuration just happens much later...

@blackmesa-peterdohm

(Quoting my previous comment above.)

FALSE ALARM. I'd run Ansible outside the virtual environment. So this is a very curious failure mode that occurs if you do what I just did, in case anyone else runs into it...

@slappyslap

Got the same error with the master branch and Debian 12.

@wolskies

From my perspective, it isn't a false alarm. I ran Ansible per the installation instructions, from inside the venv, and it continues to fail to configure nerdctl etc. I've tried with Debian 12 and Oracle/Rocky and get the same behavior, both on "bare metal" and in VMs.

@slappyslap

slappyslap commented Jul 24, 2023 via email

@Khodesaeed

Getting the same error on Ubuntu 20.04.

@Mishavint
Contributor

Mishavint commented Jul 28, 2023

Faced a similar problem with VirtualBox.
A quick fix that helped in my case:

- name: Configure hosts
  gather_facts: False
  hosts: k8s_cluster
  tasks:
    - name: Create a symbolic link
      ansible.builtin.file:
        src: /tmp/releases/crictl
        dest: /usr/local/bin/crictl
        state: link
        force: true

    - name: Create a symbolic link
      ansible.builtin.file:
        src: /tmp/releases/nerdctl
        dest: /usr/local/bin/nerdctl
        state: link
        force: true

    - name: Create a symbolic link
      ansible.builtin.file:
        src: /tmp/releases/runc-v1.1.7.amd64
        dest: /usr/local/bin/runc
        state: link
        force: true

Just add this to playbooks/cluster.yml

Somehow, Kubespray doesn't copy nerdctl, crictl, and runc to /usr/local/bin, so I just create soft links.

@Khodesaeed

Khodesaeed commented Jul 29, 2023

After some investigation, I suspect that the dependency roles of container-engine (in this case containerd) somehow don't run after containerd is selected as the CRI.
According to the Ansible documentation about role dependencies (link): role dependencies let you automatically pull in other roles when using a role.
The doc also says: Ansible always executes roles listed in dependencies before the role that lists them.
Moreover, you can find the containerd (or any other CRI-related) role dependencies at roles/container-engine/meta/main.yml;
the code snippet below is the part related to containerd:

---
dependencies:
...
  - role: container-engine/containerd
    when:
      - container_manager == 'containerd'
    tags:
      - container-engine
      - containerd

Following the same pattern, this role has its own dependencies, and at this point the runc, crictl, and nerdctl related tasks should run, but they didn't.
The role-dependency meta file at roles/container-engine/containerd/meta/main.yml:

---
dependencies:
  - role: container-engine/containerd-common
  - role: container-engine/runc
  - role: container-engine/crictl
  - role: container-engine/nerdctl

So, here is my quick fix:
I added the required roles before the containerd section in the role-dependency meta file at roles/container-engine/meta/main.yml:

---
dependencies:
...
  - role: container-engine/runc
    when:
      - container_manager == 'containerd'

  - role: container-engine/nerdctl
    when:
      - container_manager == 'containerd'
  
  - role: container-engine/crictl
    when:
      - container_manager == 'containerd'

  - role: container-engine/containerd
    when:
      - container_manager == 'containerd'
    tags:
      - container-engine
      - containerd

P.S. After some more investigation I found another bug that I think was my main issue: after using the reset.yml playbook to reset the cluster, some container processes still remain, and after killing those containers I finally managed to deploy my cluster with Kubespray.
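
For anyone hitting the leftover-container situation from the P.S. above, a minimal check before re-running the playbook might look like this (a sketch, assuming containerd and crictl are still installed on the node):

# List any containers containerd still knows about after reset.yml
sudo crictl ps -a
# Look for orphaned shim processes keeping old containers alive
ps aux | grep containerd-shim | grep -v grep
# If needed, stop the leftover shims before re-running the playbook
sudo pkill -f containerd-shim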

@yankay
Member

yankay commented Aug 16, 2023

(Quoting @Khodesaeed's analysis and quick fix above.)

Thanks @Khodesaeed @roboticsbrian

I cannot find the root cause of the issue.
Would you help us reproduce it?

Which config file, Kubespray commit, and OS are you using, and are there any important steps to reproduce the issue?

@yankay yankay added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Aug 16, 2023
@RomainMou
Contributor

Hi,

After some investigation, it could be linked to how role dependencies work: the behavior is not uniform across Ansible versions when using when conditions. These Ansible issues could be relevant:

this is normal and expected behavior for meta dependencies, de duplication is done on the 'call signature' of the role itself. If you want finer grained control I would recommend using include_role instead.

I've started replacing all the dependencies with include_role and import_role to avoid this. I can open a PR if you think this is the right approach @yankay.
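
As a rough illustration of that approach (a sketch only, not the actual Kubespray change; the role names come from the meta files quoted above, and the surrounding task layout is hypothetical):

# Hypothetical replacement for the meta dependencies, run from the
# container-engine role's tasks instead of meta/main.yml.
- name: Pull in containerd and its helper roles explicitly
  include_role:
    name: "container-engine/{{ item }}"
  loop:
    - runc
    - crictl
    - nerdctl
    - containerd
  when: container_manager == 'containerd'

Because include_role is evaluated dynamically at runtime, it side-steps the parse-time role deduplication described in the quoted Ansible maintainer comment.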

@yankay
Member

yankay commented Aug 16, 2023

(Quoting @RomainMou's comment above.)

Thanks @RomainMou

I do not know how to reproduce it, so I have no idea yet whether that is the right approach. :-)
Can the issue be reproduced with ansible >= [core 2.15.x]?

@RomainMou
Contributor

Yes @yankay, I've reproduced it on a new cluster installation with:

ansible==8.2.0
ansible-core==2.15.3

@yankay
Member

yankay commented Aug 16, 2023

Thank you @RomainMou

I upgraded Ansible to

ansible==8.3.0

and the issue is reproduced.

fatal: [kay171]: FAILED! => {"attempts": 4, "changed": false, "cmd": "/usr/local/bin/nerdctl -n k8s.io pull --quiet quay.m.daocloud.io/calico/node:v3.25.1", "msg": "[Errno 2] No such file or directory: b'/usr/local/bin/nerdctl'", "rc": 2, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
fatal: [kay172]: FAILED! => {"attempts": 4, "changed": false, "cmd": "/usr/local/bin/nerdctl -n k8s.io pull --quiet quay.m.daocloud.io/calico/node:v3.25.1", "msg": "[Errno 2] No such file or directory: b'/usr/local/bin/nerdctl'", "rc": 2, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

I think supporting a newer Ansible version would be very good for Kubespray.
A PR to fix it would be very welcome.

@MrFreezeex @floryut, would you please give some suggestions? :-)
Thanks.

@bugaian

bugaian commented Aug 18, 2023

ansible -i inventory/mycluster/inventory.ini -u ubuntu --private-key=~/.ssh/id_rsa --become --become-user=root -b -m copy -a "src=/tmp/releases/nerdctl dest=/usr/local/bin/nerdctl mode=0755 remote_src=yes" all


ansible -i inventory/mycluster/inventory.ini -u ubuntu --private-key=~/.ssh/id_rsa --become --become-user=root -b -m copy -a "src=/tmp/releases/crictl dest=/usr/local/bin/crictl mode=0755 remote_src=yes" all

These lines fixed it on all nodes.

@vyom-soft

vyom-soft commented Nov 8, 2023

(Quoting the exchange between @RomainMou and @yankay above.)

It is reproducible with Ansible 2.15.4. Today I hit this error.

The full traceback is:
  File "/tmp/ansible_ansible.legacy.command_payload_nox4f_k6/ansible_ansible.legacy.command_payload.zip/ansible/module_utils/basic.py", line 2038, in run_command
    cmd = subprocess.Popen(args, **kwargs)
  File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
fatal: [kvt0labvrpa0049]: FAILED! => {
    "attempts": 4,
    "changed": false,
    "cmd": "/usr/local/bin/nerdctl -n k8s.io pull --quiet quay.io/calico/node:v3.26.3",
    "invocation": {
        "module_args": {
            "_raw_params": "/usr/local/bin/nerdctl -n k8s.io pull --quiet quay.io/calico/node:v3.26.3",
            "_uses_shell": false,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true
        }
    },
    "msg": "[Errno 2] No such file or directory: b'/usr/local/bin/nerdctl': b'/usr/local/bin/nerdctl'",
    "rc": 2,
    "stderr": "",
    "stderr_lines": [],
    "stdout": "",
    "stdout_lines": []
}

@MrFreezeex
Member

MrFreezeex commented Nov 8, 2023

It is reproducible with Ansible 2.15.4. Today I hit this error.

Hi! I'm not sure how you launched Kubespray with Ansible 2.15.4, but Kubespray definitely does not support this version! Please use requirements.txt to install your Ansible version.
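
For reference, one common way to do that is from a virtual environment inside the Kubespray checkout (a sketch; the paths and Python invocation are assumptions, adjust for your setup):

# Create an isolated environment and install the supported Ansible version
python3 -m venv venv
source venv/bin/activate
pip install -U pip
pip install -r requirements.txt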

@VannTen
Contributor

VannTen commented Jan 22, 2024

Still reproducible with latest master?

@VannTen
Contributor

VannTen commented Feb 8, 2024

/triage not-reproducible
I could not reproduce this on master
(please provide a reproducer if that's incorrect)

@k8s-ci-robot k8s-ci-robot added the triage/not-reproducible Indicates an issue can not be reproduced as described. label Feb 8, 2024
@user81230
Contributor

This may be connected to the issue: after a clean installation on Oracle Linux 9, /usr/local/bin was simply not present in $PATH, so I couldn't use the binaries (nerdctl included) from my user without specifying the full path. This did not affect the installation process, though; everything worked as expected.
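
A quick check for that PATH situation, as a sketch (assumes a login shell on the affected node; the profile.d filename is just an example):

# Show whether /usr/local/bin is on PATH for the current user
echo "$PATH" | tr ':' '\n' | grep -x /usr/local/bin
# If it is missing, add it system-wide (example filename)
echo 'export PATH=$PATH:/usr/local/bin' | sudo tee /etc/profile.d/usr-local-bin.sh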
