Add WSLAContainer flag to auto delete container after exit#14196
Add WSLAContainer flag to auto delete container after exit#14196kvega005 wants to merge 6 commits intomicrosoft:feature/wsl-for-appsfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds Docker-style auto-remove behavior for WSLA containers by persisting an Rm-style flag in container metadata and triggering container deletion automatically after the container stops.
Changes:
- Persist/restore
WSLAContainerFlagsinWSLAContainerMetadataV1so flags survive service/session restarts. - Implement auto-delete on container stop events when
WSLAContainerFlagsRmis set. - Add a new Windows test covering auto-remove behavior and flag persistence across sessions.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/windows/WSLATests.cpp | Adds ContainerAutoRemove test validating async deletion and persistence across sessions. |
| src/windows/wslaservice/exe/WSLAContainerMetadata.h | Extends persisted container metadata to include Flags. |
| src/windows/wslaservice/exe/WSLAContainer.cpp | Persists flags into metadata on create/open and triggers auto-delete on stop events. |
| // A container with the Rm flag is deleted asynchronously after it's stopped. | ||
| // This retry loop verifies the container is deleted within one second. | ||
| wsl::shared::retry::RetryWithTimeout<void>( | ||
| [&]() { | ||
| wil::com_ptr<IWSLAContainer> container; | ||
| auto hr = m_defaultSession->OpenContainer(containerName, &container); | ||
| THROW_HR_IF(E_FAIL, hr != HRESULT_FROM_WIN32(ERROR_NOT_FOUND)); | ||
| }, | ||
| std::chrono::milliseconds{100}, | ||
| std::chrono::seconds{1}); |
There was a problem hiding this comment.
verifyContainerDeleted only waits up to 1 second for async removal. Container deletion can legitimately take longer on busy/slow test machines (Docker latency, service scheduling), making this test flaky. Consider using a longer timeout consistent with other container/process waits in this file (e.g., 30s) while keeping the poll interval small.
There was a problem hiding this comment.
Agree I'd recommend increasing the timeout to a minute so it doesn't risk failing in CI (which can pretty slow sometimes)
| } | ||
|
|
||
| // Unmount volumes. | ||
| UnmountVolumes(m_mountedVolumes, *m_parentVM); |
There was a problem hiding this comment.
WSLAContainerImpl::Delete() already calls UnmountVolumes(m_mountedVolumes, *m_parentVM). Adding an unconditional UnmountVolumes(...) in the destructor will attempt to unmount the same paths again after a successful delete, which can generate noisy LOG_IF_FAILED telemetry (and may fail if the mount is already gone). Consider making unmount idempotent by clearing m_mountedVolumes after a successful unmount, or only unmounting in one place (Delete vs destructor) guarded by state.
| UnmountVolumes(m_mountedVolumes, *m_parentVM); | |
| UnmountVolumes(m_mountedVolumes, *m_parentVM); | |
| m_mountedVolumes.clear(); |
There was a problem hiding this comment.
Agreed with this. Do we need to explicitly unmount in the destructor ? We should only need to do it when it's deleted, which Delete() already does
There was a problem hiding this comment.
We do not. this is a remnant of when I was trying to use the docker auto-remove flag. In that case the container could be deleted without going through Delete(), but I ultimately decided against that approach for a few different reasons. I just forgot to remove this.
| // If the Rm flag is set, delete the container. | ||
| // This needs to be done in a detached thread because it will trigger deletion of this object. | ||
| if (WI_IsFlagSet(m_containerFlags, WSLAContainerFlagsRm)) | ||
| { | ||
| auto comWrapper = m_comWrapper; | ||
| std::thread([comWrapper]() { LOG_IF_FAILED(comWrapper->Delete()); }).detach(); | ||
| } |
There was a problem hiding this comment.
Auto-remove spawns a new detached std::thread per container stop. This can become an unbounded thread creation path under load and there's no shutdown coordination/join, which is risky for a service. Prefer queueing the delete onto an existing worker mechanism (e.g., Windows threadpool work item / a shared service work queue) instead of creating detached threads.
| if (WI_IsFlagSet(m_containerFlags, WSLAContainerFlagsRm)) | ||
| { | ||
| auto comWrapper = m_comWrapper; | ||
| std::thread([comWrapper]() { LOG_IF_FAILED(comWrapper->Delete()); }).detach(); |
There was a problem hiding this comment.
This is definitely tricky because we're indeed inside the object that we want to delete.
Here's an idea that might help simplify this:
Instead of registering the event callback in WSLAContainerImpl, we register the callback at the WSLAContainer level, and then we have something like this:
void WSLAContainerImpl::OnEvent(ContainerEvent event, std::optional<int> exitCode)
{
if (event == ContainerEvent::Stop)
{
auto [lock, impl] = LockImpl();
impl->OnStopped();
if (WI_IsFlagSet(impl->Flags(), WSLAContainerFlagsRm))
{
impl->Delete();
m_onDeleted(impl);
}
}
}
We'd have to add a OnStopped() & Flags() methods, but that should allow us to completely avoid the "I need to delete myself" problem
There was a problem hiding this comment.
Had to do a bit of refactoring to avoid lock ordering issues and potential deadlocks. The OnDeleted callback now returns ownership of the WSLAContainerImpl. This needs to be destroyed outside of the WSLAContainer lock.
| TraceLoggingValue(impl->ID().c_str(), "Id"), | ||
| TraceLoggingValue((int)event, "Event")); | ||
|
|
||
| if (event == ContainerEvent::Stop) |
There was a problem hiding this comment.
nit: I'd recommend doing an moving the logging and this check in the beginning of this method so we can skip acquiring the lock if this isn't a stop event
There was a problem hiding this comment.
I was thinking about this last night. Right now we need to acquire the lock for logging as well. But we can store the name and ID in the com wrapper, so we don't need to call out to the implementation for logging.
| } | ||
|
|
||
| // Unmount volumes. | ||
| UnmountVolumes(m_mountedVolumes, *m_parentVM); |
There was a problem hiding this comment.
Agreed with this. Do we need to explicitly unmount in the destructor ? We should only need to do it when it's deleted, which Delete() already does
| // A container with the Rm flag is deleted asynchronously after it's stopped. | ||
| // This retry loop verifies the container is deleted within one second. | ||
| wsl::shared::retry::RetryWithTimeout<void>( | ||
| [&]() { | ||
| wil::com_ptr<IWSLAContainer> container; | ||
| auto hr = m_defaultSession->OpenContainer(containerName, &container); | ||
| THROW_HR_IF(E_FAIL, hr != HRESULT_FROM_WIN32(ERROR_NOT_FOUND)); | ||
| }, | ||
| std::chrono::milliseconds{100}, | ||
| std::chrono::seconds{1}); |
There was a problem hiding this comment.
Agree I'd recommend increasing the timeout to a minute so it doesn't risk failing in CI (which can pretty slow sometimes)
| auto process = container.GetInitProcess(); | ||
|
|
||
| VERIFY_ARE_EQUAL(container.State(), WslaContainerStateRunning); | ||
| VERIFY_SUCCEEDED(container.Get().Stop(WSLASignalSIGKILL, 0)); |
There was a problem hiding this comment.
nit: It would be nice to have special logic in WSLAContainer() to synchronously perform the container deleting if the rm flag is set, that way the caller is guaranteed that the container is deleted when Stop() returns successfully (feel free to do this in a future change though)
There was a problem hiding this comment.
I did think about that earlier and ran into the problem of impl deleting itself, but now that we are implementing some of this in the com wrapper, I can revisit.
| ResetTestSession(); | ||
| } | ||
|
|
||
| auto container = OpenContainer(m_defaultSession.get(), "test-auto-remove"); |
There was a problem hiding this comment.
We might need to add the container flags to ContainerInfo for the flags to be correctly re-populated when we recover containers from storage
There was a problem hiding this comment.
We store the process and container flags in the container metadata. I figured in the future we might have more info in these flags that would be important to recover but are not necessarily important to docker.
Summary of the Pull Request
Add support for Docker-style auto-remove (Rm) flag for containers
PR Checklist
Detailed Description of the Pull Request / Additional comments
Validation Steps Performed