Revert "Use Update() instead of UpdateStatus()" #11127

stevekuznetsov
Contributor

This reverts commit 6ec5070.

Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>

fixes #11093

/assign @cjwagner @fejta @munnerz
/cc @BenTheElder @krzyzacy

@k8s-ci-robot added the cncf-cla: yes and size/S labels Feb 5, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: stevekuznetsov

@k8s-ci-robot added the approved, area/prow, area/prow/bump, area/prow/jenkins-operator, area/prow/plank, and sig/testing labels Feb 5, 2019
@fejta
Contributor

fejta commented Feb 5, 2019

Doesn't this require 1.13?

@stevekuznetsov
Contributor Author

No, I require a basic amount of reading comprehension 🤦‍♂️

The page just said 1.13 had it in beta, not that it transitioned to beta in 1.13

@fejta
Contributor

fejta commented Feb 5, 2019

Can you find a doc that lists when it went to beta? I tried but cannot find it. We can't use alpha APIs in GKE (or if we do, our cluster gets periodically deleted).

@stevekuznetsov
Contributor Author

@munnerz had the links for that

So, some fun facts:

  • When you update the CRD definition to include a status subresource, Update() calls will no longer update status at all.
  • When you switch to UpdateStatus() without declaring a status subresource, those calls will fail.

@nikhita is there a reasonable way to roll out a change like this without downtime?

@fejta
Contributor

fejta commented Feb 6, 2019

do both an Update and UpdateStatus call?

@nikhita
Member

nikhita commented Feb 6, 2019

Can you find a doc that lists when it went to beta?

I can confirm that this has been in beta since 1.11. :)

(Though I agree it'd be neat if the official docs showed which release it went to beta in.)

@stevekuznetsov Re Update and UpdateStatus, the (breaking) change in behaviour is expected: the controller should be able to write the status but only read the spec, while users should be able to write the spec but only read the status.

I would argue that if the controller is also updating the spec or vice-versa, then that behaviour should be fixed. :)

@stevekuznetsov
Contributor Author

@nikhita the breaking issue is this:

  • today, controller calls Update() and intends to update /status
  • today, CRD has no /status subresource

I see these timelines:


  1. update controllers to call UpdateStatus()
  2. update CRD to have /status subresource

This fails as the controllers will error on their UpdateStatus() calls while the CRD has no /status subresource


  1. update CRD to have /status subresource
  2. update controllers to call UpdateStatus()

This fails as the controllers will hot-loop on their Update() calls while they cannot affect the CRD's /status field. Our controllers for instance will attempt to abort every single ProwJob in the cluster in a hot-loop.

@munnerz
Member

munnerz commented Feb 6, 2019

Our controllers for instance will attempt to abort every single ProwJob in the cluster in a hot-loop.

Will the controller's rate limiters not kick in and start backing-off processing these items?

It's not really an ideal solution, but at least if you update the controllers first, you will have the controllers returning errors (and thus rate limiting), whereas if you update the CRDs first, the call to Update() will just silently throw away the status information.

Very much interested in the resolution here, as I want to move cert-manager to use /status too and need to somehow manage this as part of our upgrade process 😅

@stevekuznetsov
Contributor Author

Our "controllers" are just for loops so there are no rate limiters 🤦‍♂️

@stevekuznetsov
Contributor Author

If you version the CRD can you add the /status only to a newer version?

@munnerz
Member

munnerz commented Feb 6, 2019

Looks like you might be able to. From kubectl explain crd.spec.versions.subresources:

KIND:     CustomResourceDefinition
VERSION:  apiextensions.k8s.io/v1beta1

RESOURCE: subresources <Object>

DESCRIPTION:
     Subresources describes the subresources for CustomResource Top-level and
     per-version subresources are mutually exclusive. Per-version subresources
     must not all be set to identical values (top-level subresources should be
     used instead) This field is alpha-level and is only honored by servers that
     enable the CustomResourceWebhookConversion feature.

     CustomResourceSubresources defines the status and scale subresources for
     CustomResources.

FIELDS:
   scale	<Object>
     Scale denotes the scale subresource for CustomResources

   status	<>
     Status denotes the status subresource for CustomResources

@munnerz
Member

munnerz commented Feb 6, 2019

This field is alpha-level and is only honored by servers that enable the CustomResourceWebhookConversion feature.

🤔 not sure if versioning for subresources is available as far back as 1.11 however...

@fejta
Contributor

fejta commented Feb 6, 2019

Seems pretty clear here:

  1. Deploy a version which calls both UpdateStatus and Update, with the Update call behind a flag that is enabled by default
  2. Update the CRD to have a status subresource
  3. Tune the flag to disable the Update call
  4. Remove the logic to call Update in 6 months

@stevekuznetsov
Contributor Author

/hold

Can tackle later

@k8s-ci-robot added the do-not-merge/hold label Feb 6, 2019
@k8s-ci-robot added the needs-rebase label Mar 9, 2019
@k8s-ci-robot
Contributor

@stevekuznetsov: PR needs rebase.

@k8s-ci-robot
Contributor

@stevekuznetsov: The following test failed, say /retest to rerun them all:

Test name             Commit   Rerun command
pull-test-infra-lint  a1bc15a  /test pull-test-infra-lint


@stevekuznetsov
Contributor Author

Not super urgent and high risk for disruption

/close

@k8s-ci-robot
Contributor

@stevekuznetsov: Closed this PR.

Successfully merging this pull request may close these issues.

controllers: use UpdateStatus()