Skip to content

Commit

Permalink
fix: Ensure to exit with error if any vcpu exits with error
Browse files Browse the repository at this point in the history
Previously, when a VM exited, we looked for the first vcpu that reported
an exit status, and then indiscriminately reported that back. However,
it is possible to one vcpu to exit successfully while another exits with
an error, and this could lead us to report "Firecracker Exited
Successfully" even though it did not.

Now, we explicitly look for the status code of all vcpus. If any of them
report an error, this takes precedence over non-error status codes.

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
  • Loading branch information
roypat committed Nov 24, 2023
1 parent 885bbc7 commit 7660a59
Showing 1 changed file with 20 additions and 16 deletions.
36 changes: 20 additions & 16 deletions src/vmm/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ pub mod vstate;
use std::collections::HashMap;
use std::io;
use std::os::unix::io::AsRawFd;
use std::sync::mpsc::{RecvTimeoutError, TryRecvError};
use std::sync::mpsc::RecvTimeoutError;
use std::sync::{Arc, Barrier, Mutex};
use std::time::Duration;

Expand Down Expand Up @@ -915,23 +915,27 @@ impl MutEventSubscriber for Vmm {
// Exit event handling should never do anything more than call 'self.stop()'.
let _ = self.vcpus_exit_evt.read();

let mut exit_code = None;
// Query each vcpu for their exit_code.
for handle in &self.vcpus_handles {
match handle.response_receiver().try_recv() {
Ok(VcpuResponse::Exited(status)) => {
exit_code = Some(status);
// Just use the first encountered exit-code.
break;
}
Ok(_response) => {} // Don't care about these, we are exiting.
Err(TryRecvError::Empty) => {} // Nothing pending in channel
Err(err) => {
panic!("Error while looking for VCPU exit status: {}", err);
let exit_code = 'exit_code: {
// Query each vcpu for their exit_code.
for handle in &self.vcpus_handles {
// Drain all vcpu responses that are pending from this vcpu until we find an
// exit status.
for response in handle.response_receiver().try_iter() {
if let VcpuResponse::Exited(status) = response {
// It could be that some vcpus exited successfully while others
// errored out. Thus make sure that error exits from one vcpu always
// takes precedence over "ok" exits
if status != FcExitCode::Ok {
break 'exit_code status;
}
}
}
}
}
self.stop(exit_code.unwrap_or(FcExitCode::Ok));

// No CPUs exited with error status code, report "Ok"
FcExitCode::Ok
};
self.stop(exit_code);
} else {
error!("Spurious EventManager event for handler: Vmm");
}
Expand Down

0 comments on commit 7660a59

Please sign in to comment.