Handling SIGINT and sequential deployments #2520

andrewg-xyz · 2024-05-17T21:58:36Z

Environment

Device and OS: M2 Max, macOS Sonoma 14.4.1
App version: v0.33.0
Kubernetes distro being used: k3d (uds deploy k3d-core-slim-dev:0.21.1)
Other:

Steps to reproduce

1. Run zarf package deploy ... My app has an error that fails the deployment: "0 out of 1 expected pods are ready".
2. Send SIGINT (ctrl+c) to stop the zarf package deploy.
3. Make updates, rebuild the zarf package, and run zarf package deploy again.

Expected result

The interruption is handled and future zarf package deploy runs work.

Actual Result

My cluster has a helm CR with a status of pending-install, which prevents subsequent zarf package deploy runs of my package (once I've fixed the error).

Subsequent deploys fail/retry with: WARNING Retrying (1/3) in 5s: unable to complete the helm chart install/upgrade: another operation (install/upgrade/rollback) is in progress

Visual Proof (screenshots, videos, text, etc)

Severity/Priority

Additional Context

The workaround is to manually remove the helm CR and clean up the failed deployment.


phillebaba · 2024-05-20T07:59:46Z

This should be covered by #2505. It is good, however, to have a specific example of why this is needed.

## Description

Right now different functions handle interrupts in different ways. Some register their own signal listeners while others use the global signal handler that exits the program. This means that contexts are never cancelled, so the process never gets time to shut down and clean up.

This change adds a signal handler to the Cobra parent context and removes the other signal handlers. Now all commands will use the same signal handler.

## Related Issue

Fixes #2505
Relates to #2520

## Checklist before merging

- [x] Test, docs, adr added or updated as needed
- [x] [Contributor Guide Steps](https://github.com/defenseunicorns/zarf/blob/main/.github/CONTRIBUTING.md#developer-workflow) followed

Co-authored-by: Austin Abro <37223396+AustinAbro321@users.noreply.github.com>

lucasrod16 · 2024-06-17T19:29:00Z

@andrewg-xyz I am not sure there is a way Zarf can reliably guarantee cleanup of resources when a deployment is canceled with a SIGINT. Zarf uses the Helm SDK to clean up resources on a best-effort basis for failed upgrades and rollbacks, but in my opinion it is risky for Zarf to start removing resources from the cluster outside of the Helm SDK on a failed deployment.

I would recommend running zarf package remove to clean up the failed deployment before attempting re-deployment.

andrewg-xyz · 2024-06-20T18:23:33Z

Thanks - I'm good to close with the context. I'll keep this in mind as I run into similar situations.


andrewg-xyz added the possible-bug 🐛 label May 17, 2024

phillebaba mentioned this issue May 31, 2024

fix: cancel Cobra parent context on interrupt #2567

Merged

2 tasks

andrewg-xyz closed this as completed Jun 20, 2024
