Feature specification for in-place upgrade of Radius #85
2. **Fetch available chart versions**: Provide a list of known chart versions so that the version the user selects is a valid one.
3. **Dry-run** (when requested): Simulate the upgrade, logging each step without making changes, to verify that the upgrade will work. Helm provides this through the `helm upgrade` command: <https://helm.sh/docs/helm/helm_upgrade/>.
4. **Snapshot**: Automatically back up current data (e.g., etcd, resources in the API server, or Postgres) before making changes.
5. **Upgrade**: Apply the necessary Helm changes (including timeouts, set args, etc.) and perform database migrations if needed.
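
As a rough illustration of steps 2, 3, and 5, the sketch below validates the requested version against the list of known chart versions and only then assembles a `helm upgrade` command line, optionally with `--dry-run`. The function name, the release and chart names, the namespace, and the version numbers are all hypothetical and are not taken from the spec; only the Helm flags themselves (`--version`, `--namespace`, `--timeout`, `--set`, `--dry-run`) are real.

```python
def parse_version(text):
    """Parse 'X.Y.Z' (optionally prefixed with 'v') into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def build_helm_upgrade_args(desired, available, release, chart, namespace,
                            dry_run=False, timeout="5m", set_args=None):
    """Return the argv for `helm upgrade`, refusing unknown chart versions.

    All parameter names are illustrative assumptions, not the Radius CLI's API.
    """
    if desired not in available:
        known = ", ".join(sorted(available, key=parse_version))
        raise ValueError(
            f"unknown chart version {desired}; known versions: {known}"
        )

    args = [
        "helm", "upgrade", release, chart,
        "--namespace", namespace,
        "--version", desired,
        "--timeout", timeout,
    ]
    # Sort the overrides so the generated command line is deterministic.
    for key, value in sorted((set_args or {}).items()):
        args += ["--set", f"{key}={value}"]
    if dry_run:
        # Helm simulates the upgrade without changing the cluster.
        args.append("--dry-run")
    return args
```

The command is only built here, not executed, so the same argument list can be logged for a dry run and then passed to a process runner for the real upgrade.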
How do we know when to perform database migrations? If we introduce a breaking change to one of our schemas, is there an automatic way to detect and upgrade? Probably not.
How do we handle in-progress deployments when an upgrade is initiated?
- **Downgrade Support:**
  - Should we support downgrading to previous versions? If yes, what are the limitations?
  - How should we handle cases where users attempt to downgrade to versions that don't support the upgrade feature itself?
I think it's reasonable to only allow downgrades up to the first version that supported upgrades
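
The downgrade floor suggested in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: `FIRST_VERSION_WITH_UPGRADE` is a placeholder value, versions are assumed to be plain `X.Y.Z` strings, and the function name is hypothetical.

```python
# Placeholder: the first release that ships the upgrade feature (assumed value).
FIRST_VERSION_WITH_UPGRADE = "0.44.0"


def parse_version(text):
    """Parse 'X.Y.Z' (optionally prefixed with 'v') into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def classify_version_change(current, target, floor=FIRST_VERSION_WITH_UPGRADE):
    """Classify a requested version change and reject downgrades below the floor."""
    cur, tgt, low = (parse_version(v) for v in (current, target, floor))
    if tgt == cur:
        return "noop"
    if tgt > cur:
        return "upgrade"
    if tgt < low:
        raise ValueError(
            f"cannot downgrade to {target}: versions before {floor} "
            "do not support the upgrade feature"
        )
    return "downgrade"
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating `0.9.0` as newer than `0.10.0`.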
4. **Snapshot**: Automatically back up current data (e.g., etcd, resources in the API server, or Postgres) before making changes.
5. **Upgrade**: Apply the necessary Helm changes (including timeouts, set args, etc.) and perform database migrations if needed.
6. **Rollback** (on failure): If something goes wrong, use the snapshot to restore the prior state.
7. **Post-upgrade checks**: Validate that new control plane components are healthy and confirm the upgrade was successful.
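
Steps 4 through 7 could be wired together along the lines of the sketch below. The four callables are hypothetical hooks standing in for the real snapshot, Helm, and health-check logic. Treating a failed post-upgrade check as a rollback trigger is an assumption of this sketch, not something the spec states.

```python
def run_upgrade(take_snapshot, apply_upgrade, restore_snapshot, check_health):
    """Snapshot, upgrade, verify; restore the snapshot if any step fails.

    Returns the snapshot handle on success so the caller can decide what to
    do with it. All four arguments are illustrative callables.
    """
    snapshot = take_snapshot()  # step 4: back up before changing anything
    try:
        apply_upgrade()  # step 5: Helm changes and any migrations
        if not check_health():  # step 7: post-upgrade validation
            raise RuntimeError("post-upgrade health check failed")
    except Exception:
        restore_snapshot(snapshot)  # step 6: roll back to the prior state
        raise  # surface the original failure to the caller
    return snapshot
```

Because the snapshot is taken outside the `try` block, a failure while creating the snapshot aborts the run before any change is made and does not attempt a restore.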
We've gotten some feedback from community users that they might have to roll back the upgrade even after the post-upgrade checks have passed (perhaps their own tests fail in the upgraded version) -- is it possible to do a rollback on the snapshot even after the upgrade successfully completes?
Signed-off-by: ytimocin <ytimocin@microsoft.com>
- Upgrading the Radius control plane using Helm directly. We can run `helm upgrade` on the Radius Helm installation, but that alone will not put together all the pieces the control plane needs to work. Making this path work is out of scope for this effort.
- Zero-downtime control plane upgrades. While we aim to minimize disruption, guaranteeing absolutely no downtime for control plane components is not a goal for this initial release.
- Automatic CLI upgrades. Users must manually update their local CLI version after upgrading the control plane.
- Direct GitOps workflow integration for version 1. While users who manage Radius through HelmReleases in their GitOps pipeline will be able to update Helm charts, the complete upgrade process (including preflight checks, locking, and health verification) requires the `rad upgrade kubernetes` command in this initial version. Future versions will provide better GitOps integration options.
per @willdavsmith this is in scope now
### GitOps Workflow Integration

In future versions, we plan to enhance GitOps integration to support users who manage Radius through HelmReleases as part of their GitOps workflow. This will include developing a Kubernetes operator that watches for HelmRelease changes and automatically performs the necessary upgrade procedures including preflight checks, locking, and health verification. This integration will allow teams to manage Radius upgrades through their existing GitOps pipelines without requiring manual CLI commands.
this is in scope now, per @willdavsmith