-
Notifications
You must be signed in to change notification settings - Fork 425
OCPBUGS-55638: OCPBUGS-55637: Add admin_acks handling in the MCO #5027
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release-4.18
Are you sure you want to change the base?
Conversation
@djoshy: This pull request references Jira Issue OCPBUGS-55638, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
0bbd233
to
eaebaed
Compare
eaebaed
to
910ea66
Compare
910ea66
to
157c091
Compare
/retest-required |
1 similar comment
/retest-required |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally lgtm, will of course need to wait for the docs to exist before we can merge
Some minor questions inline, but I think they're mostly clear
BootImageAWSGCPMsg = "GCP or AWS platform with no boot image configuration detected. OCP will automatically opt-in all GCP and AWS clusters that currently do not a boot image configuration in 4.19. Please add a configuration to disable boot image updates if this is not desired. See [insert-doc-link] " | ||
|
||
AArch64BootImageKey = "ack-4.18-aarch64-bootloader-4.19" | ||
AArch64BootImageMsg = "aarch64 nodes detected. Please ensure boot image are updated by following the KCS:(insertlink) prior to upgrading to 4.19." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: updated
sounds like it's an action that always needs to be taken, maybe we should say something like "ensure boot image is not too old" instead? (I couldn't think of a better way to phrase it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense to me! Perhaps we should run the messages through a doc/Trevor review, they might have better insight/worked on this before
} | ||
|
||
// If no machine managers exist, no configuration has been defined | ||
return mcop.Spec.ManagedBootImages.MachineManagers == nil, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is saying that if the user already has opted in in a previous version, we don't raise the admin ack?
(makes sense, just thinking out loud)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily opt-ed in, but if they have any sort of opinion(all/none/partial), we'll either remove the admin-ack if one exists, or otherwise a no-op
// updateGuardKeyIfNeeded adds a key with a message depending on: | ||
// If guard is needed and key doesn't exist => create it | ||
// If guard is needed and key exists => nothing to do | ||
// If guard is not needed and key exists => delete it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting, in the sense that we do clear stale admin acks, but only before you do the upgrade, since these syncs will be gone after 4.18.
(again, this is fine, just thinking out loud)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so I wanted the user to able to clear the gate either by addressing it via the admin-ack
configmap or via defining an boot image opinion. In 4.19, this should not have any effect per Trevor since these keys only target 4.18(but we can double check that again)
adminCM: buildAdminAckConfigMapWithData(nil), | ||
}, | ||
{ | ||
name: "AWS platform, with a boot image configuration", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests all have a "disabled" boot image configuration (empty machinemanagers), and thereby do not need the admin ack, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, disabled meaning the user wanted to opt out so no need to raise the alarm. I can add comments/make this more verbose if its not very clear in the current state 😄
/retest-required |
@djoshy: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving an approval for the code, but should wait until docs are in place to merge
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: djoshy, yuqi-zhang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
- What I did
I added a new sync loop in the operator for the
admin-gates
configmap. This sync loop will now begin to add the following keys:ack-4.18-boot-image-opt-out-in-4.19
when a AWS/GCP cluster does not have a boot image configuration.ack-4.18-aarch64-bootloader-4.19
when an aarch64/arm64 node is detected in the cluster.These gates are only relevant for upgrades to 4.19, so they will only need to exist in the MCO's
release-4.18
branch. I've also added a few units for this functionality.- How to verify it
Upgradeable=False
(check viaoc get clusterversion -o yaml
). Now, add a boot image configuration - this should setUpgradeable
back to true.Upgradeable=False
(check viaoc get clusterversion -o yaml
).