This repository has been archived by the owner on Dec 7, 2023. It is now read-only.

How to re-run stopped VMs? #504

Open
shinebayar-g opened this issue Dec 14, 2019 · 4 comments
Labels
area/networking: Issues related to networking
area/runtime: Issues related to container runtimes
kind/bug: Categorizes issue or PR as related to a bug.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@shinebayar-g

  1. Why doesn't rebooting a VM work?
  2. How can I restart stopped VMs?
# sudo ignite vm start my-vm
FATA[0000] failed to start container for VM "989923904ae4eb43": task must be stopped before deletion: running: failed precondition
@stealthybox added the kind/bug label on Feb 7, 2020
@stealthybox
Contributor

Thanks for reporting.
This is probably a bug.
We should write an e2e test and see when it was introduced.

@kobayashi
Contributor

Which ignite version are you using? In my environment, stop and start work fine.

$ sudo ignite run weaveworks/ignite-ubuntu --ssh --name my-vm
INFO[0001] Created VM with ID "ba6509c56933aa45" and name "my-vm" 
INFO[0001] Networking is handled by "cni"               
INFO[0001] Started Firecracker VM "ba6509c56933aa45" in a container with ID "ignite-ba6509c56933aa45" 
$ sudo ignite stop my-vm
INFO[0000] Removing the container with ID "ignite-ba6509c56933aa45" from the "cni" network 
INFO[0000] Stopped VM with name "my-vm" and ID "ba6509c56933aa45" 
$ sudo ignite start my-vm
INFO[0000] Networking is handled by "cni"               
INFO[0000] Started Firecracker VM "ba6509c56933aa45" in a container with ID "ignite-ba6509c56933aa45" 
$ sudo ignite vm ps
VM ID			IMAGE				KERNEL					SIZE	CPUS	MEMORY		CREATED	STATUS	IPS		PORTS	NAME
ba6509c56933aa45	weaveworks/ignite-ubuntu:latest	weaveworks/ignite-kernel:4.19.47	4.0 GB	1	512.0 MB	23s ago	Up 8s	10.61.0.32		my-vm
$ sudo ignite version
Ignite version: version.Info{Major:"0", Minor:"6+", GitVersion:"v0.6.0-235+4b20e4f5cd8c58-dirty-dirty", GitCommit:"4b20e4f5cd8c582daf7fb8cbc0359d3ccab6b5bb", GitTreeState:"dirty", BuildDate:"2020-04-20T20:29:28Z", GoVersion:"go1.14.2", Compiler:"gc", Platform:"linux/amd64"}
Firecracker version: v0.21.1
Runtime: containerd

@stealthybox
Contributor

Thanks for confirming on recent versions of ignite @kobayashi.

I think this issue may be related to firecracker's special handling of OS reboots.
If I recall correctly, a reboot simply shuts down the guest.

It would be nice if you could ignite vm start these guests again.
That doesn't appear to work right now with a build off master, although the error I get is about IP allocation, which differs from @shinebayar-g's:

sudo ignite run weaveworks/ignite-ubuntu \
          --name test-reboot --ssh
INFO[0000] Created VM with ID "d81cff4cad6db24c" and name "test-reboot"
INFO[0001] Networking is handled by "cni"
INFO[0001] Started Firecracker VM "d81cff4cad6db24c" in a container with ID "ignite-d81cff4cad6db24c"

sudo ignite exec test-reboot echo hi
hi

sudo ignite exec test-reboot reboot
ERRO[0000] failed to run shell command: wait: remote command exited without exit status or exit signal

sudo ignite exec test-reboot echo down
FATA[0000] VM "d81cff4cad6db24c" is not running

sudo ignite vm start test-reboot
ERRO[0000] failed to setup network for namespace "ignite-d81cff4cad6db24c": failed to allocate for range 0: 10.61.0.3 has been allocated to ignite-d81cff4cad6db24c, duplicate allocation is not allowed
FATA[0000] failed to allocate for range 0: 10.61.0.3 has been allocated to ignite-d81cff4cad6db24c, duplicate allocation is not allowed

sudo ignite vm rm test-reboot
INFO[0000] Removing the container with ID "ignite-d81cff4cad6db24c" from the "cni" network
INFO[0000] Removed VM with name "test-reboot" and ID "d81cff4cad6db24c"

ignite version
Ignite version: version.Info{Major:"0", Minor:"6+", GitVersion:"v0.6.0-264+ae1cd8a48d9372", GitCommit:"ae1cd8a48d937235f0e36923a5bbb0028d02d5d4", GitTreeState:"clean", BuildDate:"2020-05-18T23:34:12Z", GoVersion:"go1.14.2", Compiler:"gc", Platform:"linux/amd64", SandboxImage:version.Image{Name:"weaveworks/ignite", Tag:"v0.6.0-264-ae1cd8a48d9372", Delimeter:":"}, KernelImage:version.Image{Name:"weaveworks/ignite-kernel", Tag:"4.19.47", Delimeter:":"}}
Firecracker version: v0.21.1
Runtime: containerd
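
In the failed start above, the address is reported as already allocated to ignite-d81cff4cad6db24c, which is the same container the command is trying to start. That suggests the VM is colliding with its own leftover reservation rather than with another VM. A minimal sketch of how a test could detect that case from the message text follows. The function name and the regular expression are illustrative assumptions based only on the wording in this transcript; they are not part of ignite.

```python
import re

# Assumed pattern, derived from the error text in the transcript above.
DUPLICATE_ALLOCATION = re.compile(
    r"failed to allocate for range \d+: "
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) has been allocated to (?P<owner>[^,\s]+), "
    r"duplicate allocation is not allowed"
)


def stale_self_allocation(message, container_id):
    """Return the reserved IP if the lease is held by container_id itself, else None."""
    match = DUPLICATE_ALLOCATION.search(message)
    if match and match.group("owner") == container_id:
        return match.group("ip")
    return None


line = (
    "FATA[0000] failed to allocate for range 0: 10.61.0.3 has been allocated "
    "to ignite-d81cff4cad6db24c, duplicate allocation is not allowed"
)
print(stale_self_allocation(line, "ignite-d81cff4cad6db24c"))  # prints 10.61.0.3
```

A result other than None means the owner named in the error matches the container being started, which is the pattern seen after the in-guest reboot.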

@stealthybox
Contributor

The containerd lifecycle is still failing with the most recent network lifecycle change (c17a99c).
On the ignite dev call, we discussed checking in some e2e tests with the goal of fixing these issues.

docker+docker-bridge and docker+CNI behave better across most state changes, so we can already test against them and identify behavioral differences.

Ignite run > stop > start > stop > start

sudo bin/ignite run weaveworks/ignite-ubuntu --ssh --name my-vm
INFO[0001] Created VM with ID "98da4c9faf220af1" and name "my-vm"
INFO[0001] Networking is handled by "cni"
INFO[0001] Started Firecracker VM "98da4c9faf220af1" in a container with ID "ignite-98da4c9faf220af1"
INFO[0002] Waiting for the ssh daemon within the VM to start...

sudo bin/ignite stop my-vm
INFO[0000] Removing the container with ID "ignite-98da4c9faf220af1" from the "cni" network
INFO[0001] Stopped VM with name "my-vm" and ID "98da4c9faf220af1"

sudo bin/ignite start my-vm
INFO[0000] Networking is handled by "cni"
INFO[0000] Started Firecracker VM "98da4c9faf220af1" in a container with ID "ignite-98da4c9faf220af1"
FATA[0010] timeout waiting for ignite-spawn startup

sudo bin/ignite stop my-vm
WARN[0000] VM "98da4c9faf220af1" is not running but trying to cleanup networking for stopped container
INFO[0000] Removing the container with ID "ignite-98da4c9faf220af1" from the "cni" network
WARN[0000] Failed to cleanup networking for stopped container VM "98da4c9faf220af1": failed to Statfs "/proc/5765/ns/net": no such file or directory
FATA[0000] failed to Statfs "/proc/5765/ns/net": no such file or directory

sudo bin/ignite start my-vm
ERRO[0000] failed to setup network for namespace "ignite-98da4c9faf220af1": failed to allocate for range 0: 10.61.0.21 has been allocated to ignite-98da4c9faf220af1, duplicate allocation is not allowed
FATA[0000] failed to allocate for range 0: 10.61.0.21 has been allocated to ignite-98da4c9faf220af1, duplicate allocation is not allowed
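
The run > stop > start > stop > start sequence above could be driven by a small script that reports the first step that fails. This is only a sketch, not the project's e2e test code. The helper names (run_cli, first_failing_step) are hypothetical, and it assumes a failing step can be recognized either by a non-zero exit code or by a FATA[ line in the output, as in the transcript. The VM is assumed to have been created already with ignite run.

```python
import subprocess

# Steps applied to an already-running VM, mirroring the sequence above.
LIFECYCLE = ["stop", "start", "stop", "start"]


def run_cli(args):
    """Run a command and return (exit_code, combined stdout and stderr)."""
    proc = subprocess.run(
        args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode, proc.stdout


def first_failing_step(vm_name, steps=LIFECYCLE, run=run_cli, ignite=("sudo", "ignite")):
    """Apply each step in order; return (index, step, output) for the first failure, or None."""
    for index, step in enumerate(steps):
        code, output = run([*ignite, step, vm_name])
        # Assumption: failure shows up as a non-zero exit code or a FATA[ log line.
        if code != 0 or "FATA[" in output:
            return index, step, output
    return None

# Example (requires ignite and a VM named my-vm):
#   print(first_failing_step("my-vm"))
```

The run parameter is injectable so the same loop can be pointed at different binaries (for example bin/ignite) or exercised with a fake runner, which makes it easier to compare behavior across runtime and network plugin combinations.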

Out-of-band/VM-internal reboot

sudo bin/ignite run weaveworks/ignite-ubuntu --ssh --name test-reboot2

sudo bin/ignite exec test-reboot2 echo hi
hi

sudo bin/ignite exec test-reboot2 reboot

sudo bin/ignite exec test-reboot2 echo down
FATA[0000] VM "dfed6c8f745a1833" is not running

sudo bin/ignite vm start test-reboot2
ERRO[0000] failed to setup network for namespace "ignite-dfed6c8f745a1833": failed to allocate for range 0: 10.61.0.19 has been allocated to ignite-dfed6c8f745a1833, duplicate allocation is not allowed
FATA[0000] failed to allocate for range 0: 10.61.0.19 has been allocated to ignite-dfed6c8f745a1833, duplicate allocation is not allowed

sudo bin/ignite vm stop test-reboot2
WARN[0000] VM "dfed6c8f745a1833" is not running but trying to cleanup networking for stopped container
INFO[0000] Removing the container with ID "ignite-dfed6c8f745a1833" from the "cni" network

sudo bin/ignite vm start test-reboot2
FATA[0000] failed to start container for VM "dfed6c8f745a1833": task must be stopped before deletion: running: failed precondition

sudo bin/ignite stop test-reboot2
WARN[0000] VM "dfed6c8f745a1833" is not running but trying to cleanup networking for stopped container
INFO[0000] Removing the container with ID "ignite-dfed6c8f745a1833" from the "cni" network
WARN[0000] Failed to cleanup networking for stopped container VM "dfed6c8f745a1833": failed to Statfs "/proc/5021/ns/net": no such file or directory
FATA[0000] failed to Statfs "/proc/5021/ns/net": no such file or directory

sudo bin/ignite start test-reboot2
INFO[0000] Networking is handled by "cni"
INFO[0000] Started Firecracker VM "dfed6c8f745a1833" in a container with ID "ignite-dfed6c8f745a1833"
FATA[0010] timeout waiting for ignite-spawn startup
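
Each log line in these transcripts has the shape LEVEL[seconds] message, with WARN, ERRO, and FATA marking the problems. When comparing sequences like the two above, it can help to reduce a captured transcript to just its error and fatal lines. The following is a rough sketch under that assumption; the names parse_line and failures are hypothetical, and the pattern is inferred from the output shown here rather than from any documented format.

```python
import re

# Assumed line shape, inferred from the transcripts: LEVEL[NNNN] message
LOG_LINE = re.compile(r"^(?P<level>[A-Z]{4})\[(?P<seconds>\d{4})\] (?P<message>.*)$")


def parse_line(line):
    """Return (level, seconds, message) for a log line, or None for anything else."""
    match = LOG_LINE.match(line.rstrip())
    if match is None:
        return None
    return match.group("level"), int(match.group("seconds")), match.group("message")


def failures(transcript):
    """Return the parsed ERRO and FATA lines of a transcript, in order."""
    found = []
    for line in transcript.splitlines():
        parsed = parse_line(line)
        if parsed is not None and parsed[0] in ("ERRO", "FATA"):
            found.append(parsed)
    return found


sample = """sudo bin/ignite start test-reboot2
INFO[0000] Networking is handled by "cni"
FATA[0010] timeout waiting for ignite-spawn startup
"""
print(failures(sample))
# prints [('FATA', 10, 'timeout waiting for ignite-spawn startup')]
```

Command lines and plain output such as "hi" do not match the pattern and are skipped, so the function can be fed an entire pasted session.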

@stealthybox added the area/runtime, priority/important-soon, and area/networking labels on Aug 17, 2020
3 participants