
Paused behaviour is inconsistent #6966

Open
Tracked by #10852
enxebre opened this issue Jul 21, 2022 · 15 comments
Labels
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/api-change: Categorizes issue or PR as related to adding, removing, or otherwise changing an API.
  • kind/bug: Categorizes issue or PR as related to a bug.
  • kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
  • priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@enxebre
Member

enxebre commented Jul 21, 2022

What steps did you take and what happened:

In a MachineDeployment:
In the predicates we check whether cluster.Spec.Paused is set or the MachineDeployment has the paused annotation; we ignore MachineDeployment.Spec.Paused. https://github.com/kubernetes-sigs/cluster-api/blob/main/internal/controllers/machinedeployment/machinedeployment_controller.go#L75-L97

Then in the reconciling logic we first check whether cluster.Spec.Paused is set or the MachineDeployment has the annotation; again we ignore MachineDeployment.Spec.Paused. https://github.com/kubernetes-sigs/cluster-api/blob/main/internal/controllers/machinedeployment/machinedeployment_controller.go#L126-L130

Then a few lines below we check only MachineDeployment.Spec.Paused, ignoring both cluster.Spec.Paused and the annotation. https://github.com/kubernetes-sigs/cluster-api/blob/main/internal/controllers/machinedeployment/machinedeployment_controller.go#L225-L227
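The divergence can be sketched as two independent checks that disagree about what "paused" means. This is an illustrative simplification with made-up signatures, not the actual controller code:

```go
package main

import "fmt"

// pausedAnnotation is the annotation key Cluster API uses to pause objects.
const pausedAnnotation = "cluster.x-k8s.io/paused"

// predicatesPaused mirrors the predicate/early-reconcile check:
// Cluster.Spec.Paused or the paused annotation, ignoring
// MachineDeployment.Spec.Paused.
func predicatesPaused(clusterPaused bool, annotations map[string]string) bool {
	if clusterPaused {
		return true
	}
	_, ok := annotations[pausedAnnotation]
	return ok
}

// mdSpecPaused mirrors the later check: only MachineDeployment.Spec.Paused,
// ignoring the Cluster field and the annotation.
func mdSpecPaused(specPaused bool) bool {
	return specPaused
}

func main() {
	// An MD with only .spec.paused set sails through the predicates
	// but is treated as paused further down: inconsistent behaviour.
	fmt.Println(predicatesPaused(false, nil)) // false
	fmt.Println(mdSpecPaused(true))           // true
}
```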

What did you expect to happen:
Always honour .spec.paused and fall back to the annotation for backward compatibility.
Introduce .spec.paused in MachineSets.
Review all CRDs to make the above consistent.
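The expectation above could be expressed as a single unified check. The helper below is a sketch under that assumption (names are illustrative, not an actual Cluster API helper):

```go
package main

import "fmt"

// pausedAnnotation is the annotation key Cluster API uses to pause objects.
const pausedAnnotation = "cluster.x-k8s.io/paused"

// isPaused sketches the proposed unified rule: honour the owning Cluster's
// .spec.paused, the object's own .spec.paused, and fall back to the paused
// annotation for backward compatibility.
func isPaused(clusterPaused, specPaused bool, annotations map[string]string) bool {
	if clusterPaused || specPaused {
		return true
	}
	_, ok := annotations[pausedAnnotation]
	return ok
}

func main() {
	fmt.Println(isPaused(false, true, nil))                                      // true: spec field honoured
	fmt.Println(isPaused(false, false, map[string]string{pausedAnnotation: ""})) // true: annotation fallback
	fmt.Println(isPaused(false, false, nil))                                     // false
}
```

Using one such helper everywhere (predicates, early reconcile, and the later sync path) would make the three code paths linked above agree.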

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Cluster-api version:
  • minikube/kind version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

/kind bug
[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jul 21, 2022
@killianmuldoon
Contributor

killianmuldoon commented Jul 21, 2022

It looks like the current behaviour of paused differs depending on whether you set the spec field or the annotation? With the annotation you get no reconciliation, but with the spec field we end up calling sync(), which even allows some scaling behaviour? 🤔

Is that specific behaviour (sync when paused) used in some of our workflows - maybe a clusterctl move?

@enxebre
Member Author

enxebre commented Jul 21, 2022

/kind cleanup

@k8s-ci-robot k8s-ci-robot added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Jul 21, 2022
@fabriziopandini fabriziopandini added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jul 29, 2022
@sbueringer
Member

Q: If we add a paused field to the MachineSet, would we then propagate the value of paused from the MD to the MS? If yes, I assume this shouldn't trigger a rollout? (and we would have to take specific care that it doesn't)

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 14, 2022
@sbueringer
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 14, 2022
@fabriziopandini
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Nov 30, 2022
@k8s-triage-robot

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jan 19, 2024
@fabriziopandini
Member

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Apr 12, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 11, 2024
@sbueringer
Member

Maybe we can fix this with v1beta2

@fabriziopandini
Member

@sbueringer is this issue already tracked in the umbrella issue for v1beta2?

@sbueringer
Member

Now, yes

@fabriziopandini fabriziopandini added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jul 17, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 17, 2024
@fabriziopandini fabriziopandini added help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 17, 2024
@fabriziopandini
Member

fabriziopandini commented Oct 15, 2024

My personal take:

  • MachineDeployment.Spec.Paused should be dropped. It seems like something we "copied" during the initial implementation of MD but never designed properly.
  • We should not add .spec.Paused to all the objects, because IMO pausing single objects (instead of the entire cluster) should be an exception, to be considered only by users who know very well what they are doing, so I prefer not to expose this as a spec field everywhere.

@ParkHyeongKyu

I'm going through the same situation. I set MachineDeployment.Spec.Paused to true to prevent scaling up or down when I increase or decrease the replica count, but the scale operation still happens.

This seems to be because the scale operation is still happening in the sync function as you can see in the link below.
https://github.com/kubernetes-sigs/cluster-api/blob/main/internal/controllers/machinedeployment/machinedeployment_sync.go#L48

@chrischdi
Member

I'm going through the same situation. I set MachineDeployment.Spec.Paused to true to prevent scaling up or down when I increase or decrease the replica count, but the scale operation still happens.

This seems to be because the scale operation is still happening in the sync function as you can see in the link below. main/internal/controllers/machinedeployment/machinedeployment_sync.go#L48

You should use the paused annotation if you want to prevent reconciliation entirely.
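For reference, the annotation-based approach can be applied with kubectl. The annotation key `cluster.x-k8s.io/paused` is the one Cluster API checks; the MachineDeployment name and namespace here are placeholders:

```shell
# Pause reconciliation of a MachineDeployment via the paused annotation
# ("my-md" and "my-ns" are illustrative names).
kubectl annotate machinedeployment my-md -n my-ns cluster.x-k8s.io/paused=""

# Resume reconciliation by removing the annotation (trailing "-" deletes it).
kubectl annotate machinedeployment my-md -n my-ns cluster.x-k8s.io/paused-
```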
