Shutdown hcsshim properly #1289

Merged
merged 3 commits into from
Feb 4, 2022

Contributor

@helsaawy helsaawy commented Feb 2, 2022

Currently, when a Shutdown request is received, the service calls os.Exit, which forcefully exits the binary without cleaning up resources and IO channels, ending spans, or flushing logs. Primarily, this prevents logging of shim-wide or long-running spans, but it can also leak unclosed system resources.

For reference, the runc shim within containerd does not respect the ShutdownRequest.Now parameter; it calls several cleanup callbacks instead of exiting immediately via os.Exit.

Added `.Done()` and `.IsShutdown()` methods to service, which signal that a shutdown request from containerd for the init task has been received. Updated the serve action to wait for that signal and then close the ttrpc servers and pipes.
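A minimal sketch of the signalling pattern described above, assuming the shutdown signal is a channel that is closed exactly once. The `service` fields, `newService`, and `requestShutdown` are hypothetical names for illustration and are not taken from the PR's diff.

```go
package main

import (
	"fmt"
	"sync"
)

// service is a hypothetical stand-in for the shim's service struct.
type service struct {
	shutdown     chan struct{} // closed once a shutdown request is accepted
	shutdownOnce sync.Once     // ensures the channel is closed only once
}

// newService creates the channel up front so Done never returns nil.
func newService() *service {
	return &service{shutdown: make(chan struct{})}
}

// Done returns a channel that is closed when shutdown has been requested.
func (s *service) Done() <-chan struct{} {
	return s.shutdown
}

// IsShutdown reports, without blocking, whether shutdown was requested.
func (s *service) IsShutdown() bool {
	select {
	case <-s.shutdown:
		return true
	default:
		return false
	}
}

// requestShutdown (hypothetical) is what a Shutdown handler would call
// instead of os.Exit. It is safe to call more than once.
func (s *service) requestShutdown() {
	s.shutdownOnce.Do(func() { close(s.shutdown) })
}

func main() {
	svc := newService()
	go svc.requestShutdown() // simulates a Shutdown request arriving

	<-svc.Done() // the serve loop would block here, then clean up
	fmt.Println("shutdown requested:", svc.IsShutdown())
}
```

Closing the channel, rather than sending on it, lets any number of goroutines observe the shutdown, and `sync.Once` keeps a repeated request from panicking on a double close.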

Added a `NewService` function and creation options to properly initialize the `service` struct, namely to create the internal channel used to signal shutdown.
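As a rough illustration of why a constructor is needed, here is a sketch using Go's functional options pattern. The option names (`WithID`, `WithSandbox`) and the struct fields are assumptions made for this example and may not match the PR's code. The point is that a zero-value `service{}` would carry a nil channel, and a receive on a nil channel blocks forever.

```go
package main

import "fmt"

// service is a hypothetical stand-in for the shim's service struct.
type service struct {
	id        string
	isSandbox bool
	shutdown  chan struct{} // must be non-nil for Done() to be usable
}

// Option configures a service at construction time (hypothetical type).
type Option func(*service)

// WithID sets the task ID (hypothetical option).
func WithID(id string) Option {
	return func(s *service) { s.id = id }
}

// WithSandbox marks the service as hosting a sandbox (hypothetical option).
func WithSandbox(b bool) Option {
	return func(s *service) { s.isSandbox = b }
}

// NewService applies the options and always creates the shutdown channel,
// so callers cannot end up with a nil channel by accident.
func NewService(opts ...Option) *service {
	s := &service{shutdown: make(chan struct{})}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	s := NewService(WithID("task-1"), WithSandbox(true))
	fmt.Println(s.id, s.isSandbox, s.shutdown != nil)
}
```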

Added tests for shutdownInternal.

Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>

Currently, Shutdown requests forcefully exit the binary without
cleaning up resources and IO channels, or flushing logs.

Added `.Done()` and `.IsShutdown()` methods to service to watch for
service shutdown requests from containerd, and appropriately close
background servers and goroutines.

Added `NewService` method and creation options to properly initialize
the `service` struct.

Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
@helsaawy helsaawy requested a review from a team as a code owner February 2, 2022 22:38
Contributor

@jterry75 jterry75 left a comment

This is a great change. Thanks

cmd/containerd-shim-runhcs-v1/serve.go Outdated
ttrpc's `Shutdown()` has a 200ms ticker, not a timeout.
Adding a proper timeout in case shutdown takes too long.

Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
Contributor

jterry75 commented Feb 3, 2022

Looks good!

@dcantah dcantah self-assigned this Feb 3, 2022
Checking return value of `shutdownInternal` for cleanup in service
tests.

Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
Contributor

@ambarve ambarve left a comment

LGTM

@helsaawy helsaawy merged commit e382e6d into microsoft:master Feb 4, 2022
@helsaawy helsaawy deleted the he/shutdown branch February 4, 2022 16:39
@helsaawy helsaawy mentioned this pull request Mar 7, 2022