-
Notifications
You must be signed in to change notification settings - Fork 200
fix(deploy): track success/failed chart install status #4172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Brandt Keller <brandt.keller@defenseunicorns.com>
✅ Deploy Preview for zarf-docs canceled.
|
Codecov Report❌ Patch coverage is
🚀 New features to boost your workflow:
|
Signed-off-by: Brandt Keller <brandt.keller@defenseunicorns.com>
Signed-off-by: Brandt Keller <brandt.keller@defenseunicorns.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will solve an important pain point for users. A few initial comments and suggestions
Signed-off-by: Brandt Keller <brandt.keller@defenseunicorns.com>
Signed-off-by: Brandt Keller <brandt.keller@defenseunicorns.com>
Signed-off-by: Brandt Keller <brandt.keller@defenseunicorns.com>
existingCharts []InstalledChart | ||
installedCharts []InstalledChart | ||
expectedCharts []InstalledChart | ||
}{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit but I think partial should be a field on this table test as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair - partial
has no functional purpose yet - but wanted to set us up for prune without having to make a breaking change.
lookup starts with the existing state and then adds each chart that was installed by the component. For prune - partial represents that we won't want to mark a release as untracked because deploy may have failed before it was deployed.
Open to removing it - but low overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think these changes look good, but I'd like to see an e2e test in 41_remove_test.go for this behavior
Signed-off-by: Brandt Keller <brandt.keller@defenseunicorns.com>
Added a test - great call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should make a core user flow much improved, nice work. Reminder to remove the tracking info from the PR description
Description
This PR adds state tracking for component Installed Charts.
There is currently a gap in the deploy lifecycle where a user attempts to deploy a package - where a component chart subsequently fails to become healthy - and the deploy fails.
Chart State
This leaves users in a state where If this is an initial deploy - no installed charts are recorded for the component - and thus they cannot be removed with
zarf package remove
. You then have to cleanup resources with helm (not the embedded version) or kubectl (which is messy). This is a very common process for package development.Installed Chart state is not strictly required to solve this - but knowing which of the charts of a component failed on subsequent updates - outside of the CLI logs - could be a future value add.
Context Cancellation
When the scenario highlighted above occurs - developers may not want to wait for the timeout (up to multiple minutes) to occur. Given a common scenario -
errimagepull
occurs - the package developer will very quickly realize that they need to add a missing or incorrect image definition to a component. They (at least myself) will reach forctrl+c
to cancel deploy.The above changes for tracking failure will not handle this case natively. The context cancellation will propagate to the client-go operations and will not perform the recording in the package secret.
This PR additionally includes context handling to ensure the recording can finish before exiting gracefully.
Related Issue
Fixes #4171
Checklist before merging