Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add MachineConfiguration conditions for boot image updates #1882

Conversation

djoshy
Copy link
Contributor

@djoshy djoshy commented May 7, 2024

This PR adds a few new MachineConfiguration conditions, which will be used by the MSBIC to inform the user of the results of the boot image reconciliation loop. These conditions will be behind the ManagedBootImages feature gate.

More info:

Copy link
Contributor

openshift-ci bot commented May 7, 2024

Hello @djoshy! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 7, 2024
@openshift-ci openshift-ci bot requested review from JoelSpeed and knobunc May 7, 2024 16:37
@djoshy
Copy link
Contributor Author

djoshy commented May 7, 2024

I do have an open question. If boot image updates are disabled(which is possible via the opt-in mechanism), none of the conditions are exactly applicable. Should I make a specific condition for the "Off" state, and set it to true?

@djoshy djoshy force-pushed the machine-configuration-conditions branch from 35f03f4 to 258c414 Compare May 8, 2024 14:10
@@ -494,3 +494,19 @@ const (
// Special represents an action that is internal to the MCO, and is not allowed in user defined NodeDisruption policies.
SpecialStatusAction NodeDisruptionPolicyStatusActionType = "Special"
)

// These strings will be used to indicate success and failures for MachineConfiguration knobs via conditions.
// Please note: These are subject to change. Both additions and deletions may be made.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cost of having these in openshift/api is going to be, "will not be removed, semantically changed, or abandoned". I won't say it's completely impossible to remove them, but the impact to user scripts is pretty extreme.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took this bit off, I don't envision us removing this in the future...I shouldn't have stolen this boilerplate from elsewhere :P

// +openshift:enable:FeatureGate=ManagedBootImages
MachineConfigurationBootImageUpdateFailed string = "BootImageUpdateFailed"

// MachineConfigurationBootImageUpdateSuccess means that the MCO was able to successfully update all boot images.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a generation involved here? If a new bootimage is created, this immediately becomes false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we've removed the "Success" condition as it is seemed redundant, but for the 2 remainder states the generation of the boot images configmap will be included in every condition update as we discussed

@djoshy
Copy link
Contributor Author

djoshy commented May 8, 2024

Chatted with @deads2k on slack, and we boiled this down to two conditions:

  • BootImageUpdateDegraded. This can be a failure mode in the controller loop or a user mis-configuration. This will flip co/machine-config to degrade.
  • BootImageUpdateProgressing - A boot image update is ongoing. This will flip co/machine-config to progressing.

All of them should include progress counts of the machinesets, and the generation info of the source configmap. If neither of these conditions are true, this means that the controller is in a happy/done state.

@djoshy djoshy force-pushed the machine-configuration-conditions branch from 258c414 to 051406b Compare May 8, 2024 20:37
@djoshy
Copy link
Contributor Author

djoshy commented May 8, 2024

I do have an open question. If boot image updates are disabled(which is possible via the opt-in mechanism), none of the conditions are exactly applicable. Should I make a specific condition for the "Off" state, and set it to true?

Boiling it down to two conditions has helped with this too; as there we are no longer tracking a success state, the other two conditions being false feels valid when the feature is "off" as well.

Also to make a note: The MCO team will consider implementing something like PoolSynchronizerStatus for this in the future

Copy link
Contributor

openshift-ci bot commented May 8, 2024

@djoshy: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@deads2k
Copy link
Contributor

deads2k commented May 9, 2024

/lgtm
/approve

@djoshy
Copy link
Contributor Author

djoshy commented May 9, 2024

/tide refresh

@deads2k deads2k added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 9, 2024
Copy link
Contributor

openshift-ci bot commented May 9, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, djoshy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit a8a5b79 into openshift:master May 9, 2024
9 checks passed
@djoshy djoshy deleted the machine-configuration-conditions branch October 17, 2024 04:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants