OCPBUGS-55638: OCPBUGS-55637: Add admin_acks handling in the MCO#5027
Conversation
|
@djoshy: This pull request references Jira Issue OCPBUGS-55638, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
0bbd233 to
eaebaed
Compare
eaebaed to
910ea66
Compare
910ea66 to
157c091
Compare
|
/retest-required |
1 similar comment
|
/retest-required |
yuqi-zhang
left a comment
There was a problem hiding this comment.
Generally lgtm, will of course need to wait for the docs to exist before we can merge
Some minor questions inline, but I think they're mostly clear
| BootImageAWSGCPMsg = "GCP or AWS platform with no boot image configuration detected. OCP will automatically opt-in all GCP and AWS clusters that currently do not a boot image configuration in 4.19. Please add a configuration to disable boot image updates if this is not desired. See [insert-doc-link] " | ||
|
|
||
| AArch64BootImageKey = "ack-4.18-aarch64-bootloader-4.19" | ||
| AArch64BootImageMsg = "aarch64 nodes detected. Please ensure boot image are updated by following the KCS:(insertlink) prior to upgrading to 4.19." |
There was a problem hiding this comment.
nit: updated sounds like it's an action that always needs to be taken, maybe we should say something like "ensure boot image is not too old" instead? (I couldn't think of a better way to phrase it)
There was a problem hiding this comment.
Yeah, that makes sense to me! Perhaps we should run the messages through a doc/Trevor review, they might have better insight/worked on this before
| } | ||
|
|
||
| // If no machine managers exist, no configuration has been defined | ||
| return mcop.Spec.ManagedBootImages.MachineManagers == nil, nil |
There was a problem hiding this comment.
So this is saying that if the user already has opted in in a previous version, we don't raise the admin ack?
(makes sense, just thinking out loud)
There was a problem hiding this comment.
Not necessarily opt-ed in, but if they have any sort of opinion(all/none/partial), we'll either remove the admin-ack if one exists, or otherwise a no-op
| // updateGuardKeyIfNeeded adds a key with a message depending on: | ||
| // If guard is needed and key doesn't exist => create it | ||
| // If guard is needed and key exists => nothing to do | ||
| // If guard is not needed and key exists => delete it |
There was a problem hiding this comment.
This is interesting, in the sense that we do clear stale admin acks, but only before you do the upgrade, since these syncs will be gone after 4.18.
(again, this is fine, just thinking out loud)
There was a problem hiding this comment.
Right, so I wanted the user to able to clear the gate either by addressing it via the admin-ack configmap or via defining an boot image opinion. In 4.19, this should not have any effect per Trevor since these keys only target 4.18(but we can double check that again)
| adminCM: buildAdminAckConfigMapWithData(nil), | ||
| }, | ||
| { | ||
| name: "AWS platform, with a boot image configuration", |
There was a problem hiding this comment.
These tests all have a "disabled" boot image configuration (empty machinemanagers), and thereby do not need the admin ack, right?
There was a problem hiding this comment.
Right, disabled meaning the user wanted to opt out so no need to raise the alarm. I can add comments/make this more verbose if its not very clear in the current state 😄
|
/retest-required |
yuqi-zhang
left a comment
There was a problem hiding this comment.
Leaving an approval for the code, but should wait until docs are in place to merge
157c091 to
9f5367d
Compare
|
Updated to include doc links.
|
|
Pre-merge verified : Edit the oc edit cm admin-acks -n openshift-config -o yaml
apiVersion: v1
data:
ack-4.18-kube-1.32-api-removals-in-4.19: "true"
kind: ConfigMap
metadata:
annotations:
include.release.openshift.io/hypershift: "true"
include.release.openshift.io/ibm-cloud-managed: "true"
include.release.openshift.io/self-managed-high-availability: "true"
kubernetes.io/description: Record administrator acknowledgments of update gates
defined in the admin-gates ConfigMap in the openshift-config-managed namespace.
release.openshift.io/create-only: "true"
creationTimestamp: "2025-05-16T09:20:54Z"
name: admin-acks
namespace: openshift-config
resourceVersion: "36986"
uid: 167b0343-a43b-4e6d-b948-fa8c5a22cccc
Able to see After edit as far now able to see the |
|
/label backport-risk-assessed Not an actual backport |
9f5367d to
a68e58b
Compare
|
/unhold |
|
/label cherry-pick-approved |
|
/retest-required |
|
/override ci/prow/e2e-gcp-op This PR has passed this test in the past, and it seems to timing out in the deprovision stage |
|
@djoshy: Overrode contexts on behalf of djoshy: ci/prow/e2e-gcp-op DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@djoshy: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: djoshy, yuqi-zhang The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
7c89511
into
openshift:release-4.18
|
@djoshy: Jira Issue OCPBUGS-55638: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-55638 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[ART PR BUILD NOTIFIER] Distgit: ose-machine-config-operator |
…to-4.21 OCI mirror requirement Similar to [1]'s boot-image handling, but in the machine-config controller that handles mirror management this time, instead of 4.18's guard which lived in the machine-config operator. [1]: openshift#5027
…to-4.21 OCI mirror requirement Similar to [1]'s boot-image handling, but in the machine-config controller that handles mirror management this time, instead of 4.18's guard which lived in the machine-config operator. [1]: openshift#5027
…to-4.21 OCI mirror requirement Similar to [1]'s boot-image handling, but in the machine-config controller that handles mirror management this time, instead of 4.18's guard which lived in the machine-config operator. [1]: openshift#5027
…to-4.21 OCI mirror requirement Similar to [1]'s boot-image handling, but in the machine-config controller that handles mirror management this time, instead of 4.18's guard which lived in the machine-config operator. [1]: openshift#5027
…to-4.21 OCI mirror requirement Similar to [1]'s boot-image handling, but in the machine-config controller that handles mirror management this time, instead of 4.18's guard which lived in the machine-config operator. [1]: openshift#5027
…to-4.21 OCI mirror requirement Similar to [1]'s boot-image handling, but in the machine-config controller that handles mirror management this time, instead of 4.18's guard which lived in the machine-config operator. [1]: openshift#5027
…to-4.21 OCI mirror requirement Similar to [1]'s boot-image handling, but in the machine-config controller that handles mirror management this time, instead of 4.18's guard which lived in the machine-config operator. [1]: openshift#5027
…to-4.21 OCI mirror requirement Similar to [1]'s boot-image handling, but in the machine-config controller that handles mirror management this time, instead of 4.18's guard which lived in the machine-config operator. [1]: openshift#5027
- What I did
I added a new sync loop in the operator for the
admin-gatesconfigmap in theopenshift-config-managednamespace. This sync loop will now begin to add the following keys:ack-4.18-boot-image-opt-out-in-4.19when a AWS/GCP cluster does not have a boot image configuration.ack-4.18-aarch64-bootloader-4.19when an aarch64/arm64 node is detected in the cluster.These gates are only relevant for upgrades to 4.19, so they will only need to exist in the MCO's
release-4.18branch. I've also added a few units for this functionality.- How to verify it
Before testing either case, ensure that any non MCO keys in
admin-gateshave been acknowledged. For example, if there is a key calledack-4.18-kube-1.32-api-removals-in-4.19inadmin-gates, add this key to theadmin-acksconfigmap in theopenshift-confignamespace with its value set to true like so:Upgradeable=False(check viaoc get clusterversion -o yaml). Now, add a boot image configuration - the CVO should no longer be settingUpgradeableto false.Upgradeable=False(check viaoc get clusterversion -o yaml).