Opened on Aug 19, 2024
Specifically this bit:
omicron/nexus/src/app/instance.rs
Lines 781 to 813 in 6dd9802
This match is a little hard to parse. At this point, Nexus already believes the instance of interest has a running Propolis and is just deciding whether to send a state change request there for disposition. I think the general idea should be to say something like:
- If this is a request to start, reboot, or stop, and the Propolis is in a non-terminal state (i.e. it's not Failed or Destroyed), and it doesn't look like we're migrating, forward the request to Propolis. (We don't currently transmit the state change request queue from source to target during a migration, so allowing a reboot/stop request on a migrating instance might result in a request being queued to the source that won't be picked up by the target.)
- If this is a request to start Propolis via migration in, and the target Propolis hasn't reported that it's started migrating yet, permit the request; otherwise, the VMM has already started and there's nothing left to do.
For start/stop/reboot, this is pretty close to what we have today, but the handling for requests to migrate could stand to be tightened up a bit.
It's also worth noting that this match is in a path where we already know that we've got an active Propolis. We should consider whether it'd be clearer to look at the VMM's state directly instead of looking at it as interpreted through InstanceAndActiveVmm::determine_effective_state.
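As a rough illustration of the two rules above, here is a minimal sketch that decides on the VMM's own state rather than on an interpreted instance state. Every name in it (VmmState, Request, Disposition, disposition, and the two boolean flags) is hypothetical and simplified; none of it is taken from the actual Nexus code. The Reject outcome for the cases the first rule doesn't cover is also an assumption, since the rules only say when a request should be forwarded.

```rust
// Hypothetical, simplified types for illustration only; these are NOT
// the definitions used in omicron/nexus/src/app/instance.rs.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum VmmState {
    Starting,
    Running,
    Stopping,
    Rebooting,
    Migrating,
    Failed,
    Destroyed,
}

impl VmmState {
    /// Terminal states as described above: Failed or Destroyed.
    pub fn is_terminal(self) -> bool {
        matches!(self, VmmState::Failed | VmmState::Destroyed)
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Request {
    Start,
    Reboot,
    Stop,
    MigrateIn,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Disposition {
    /// Send the state change request to Propolis.
    Forward,
    /// The VMM has already started; there is nothing left to do.
    NothingToDo,
    /// Assumption: requests not covered by the first rule are refused.
    Reject,
}

/// `migrating`: it looks like the instance is migrating.
/// `target_started`: the target Propolis has reported that it started
/// migrating in. Both flags are assumed inputs for this sketch.
pub fn disposition(
    req: Request,
    state: VmmState,
    migrating: bool,
    target_started: bool,
) -> Disposition {
    match req {
        // Rule 1: forward only if the VMM is non-terminal and no
        // migration appears to be in progress.
        Request::Start | Request::Reboot | Request::Stop => {
            if state.is_terminal() || migrating {
                Disposition::Reject
            } else {
                Disposition::Forward
            }
        }
        // Rule 2: permit migration in until the target reports that it
        // has started migrating; after that there is nothing to do.
        Request::MigrateIn => {
            if target_started {
                Disposition::NothingToDo
            } else {
                Disposition::Forward
            }
        }
    }
}
```

Splitting the match by request kind first keeps each rule in one arm, which may be easier to read than a single match over (state, request) tuples.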