Make sure the v1 CRDs exist before starting the controller #504
Conversation
Welcome @mauriciopoppe!
Hi @mauriciopoppe. Thanks for your PR. I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
cmd/snapshot-controller/main.go (Outdated)
condition := func() (bool, error) {
	var err error
	_, err = client.SnapshotV1().VolumeSnapshots("kube-system").List(context.TODO(), metav1.ListOptions{})
VolumeSnapshots needs a namespace; I wasn't sure whether this should be kube-system (although I'm not sure it exists in every cluster) or whether I should just check the VolumeSnapshotClasses / VolumeSnapshotContents.
We can't assume VolumeSnapshots are in "kube-system" namespace.
Also I see that you are checking if there are any VolumeSnapshot/VolumeSnapshotContent/VolumeSnapshotClass resources. Can you just check if the CRDs are deployed?
The purpose of the check is to verify that the endpoint exists; it doesn't matter if there's nothing in the namespace.
I was also thinking of checking the CRDs directly, but that would require new RBAC rules.
If the list call returns an error, the cause isn't necessarily that the CRD doesn't exist; a network problem could also produce it.
Can you check if the specific error indicates CRDs are not deployed?
I believe both calls hit the same endpoint; I ran the following experiment with volumesnapshotclasses.
The dynamic client returns the results of GET volumesnapshots as unstructured data (not the CRD itself). I made the same requests with kube-proxy and curl:
- with the dynamic client we're doing a
GET /apis/GROUP/VERSION/RESOURCE/RESOURCENAME
# v1beta1
curl localhost:8080/apis/snapshot.storage.k8s.io/v1beta1/volumesnapshotclasses/
{"apiVersion":"snapshot.storage.k8s.io/v1beta1","items":[],"kind":"VolumeSnapshotClassList","metadata":{"continue":"","resourceVersion":"58636","selfLink":"/apis/snapshot.storage.k8s.io/v1beta1/volumesnapshotclasses/"}}
# v1
curl localhost:8080/apis/snapshot.storage.k8s.io/v1/volumesnapshotclasses/
404 page not found
- with the snapshot v1 client and kubectl I saw the following
# v1beta1
kubectl get volumesnapshotclasses.v1beta1.snapshot.storage.k8s.io --v=7
I0422 08:06:43.695353 77906 loader.go:375] Config loaded from file: bmctl-workspace/ci-6ecffb9bde2ccf8/ci-6ecffb9bde2ccf8-kubeconfig
I0422 08:06:43.705084 77906 round_trippers.go:421] GET https://10.200.0.5:443/apis/snapshot.storage.k8s.io/v1beta1/volumesnapshotclasses?limit=500
I0422 08:06:43.705120 77906 round_trippers.go:428] Request Headers:
I0422 08:06:43.705131 77906 round_trippers.go:432] Accept: application/json;as=Table;v=v1;g=meta.k8s.io,application/json;as=Table;v=v1beta1;g=meta.k8s.io,application/json
I0422 08:06:43.705151 77906 round_trippers.go:432] User-Agent: kubectl/v1.19.7 (linux/amd64) kubernetes/1dd5338
I0422 08:06:43.714630 77906 round_trippers.go:447] Response Status: 200 OK in 9 milliseconds
No resources found
# v1
kubectl get volumesnapshotclasses.v1.snapshot.storage.k8s.io --v=6
I0422 08:18:48.320835 83848 loader.go:375] Config loaded from file: bmctl-workspace/ci-6ecffb9bde2ccf8/ci-6ecffb9bde2ccf8-kubeconfig
I0422 08:18:48.329195 83848 discovery.go:214] Invalidating discovery information
I0422 08:18:48.338454 83848 round_trippers.go:444] GET https://10.200.0.5:443/api?timeout=32s 200 OK in 9 milliseconds
I0422 08:18:48.348421 83848 round_trippers.go:444] GET https://10.200.0.5:443/apis?timeout=32s 200 OK in 1 milliseconds
... lots of OKs ...
F0422 08:18:48.792106 83848 helpers.go:115] error: the server doesn't have a resource type "volumesnapshotclasses"
The endpoints for the dynamic client and the snapshot v1 client are the same. If our objective is to check the CRDs, I think I could use another client such as the apiextensions client to get the CRDs, check that they declare the v1 version, and check that they have the Established condition as the docs say, e.g.
kubectl get crd volumesnapshots.snapshot.storage.k8s.io --v=7 -o custom-columns=NAME:.metadata.name,VERSIONS:.spec.versions.*.name,CONDITIONS:.status.conditions.*.type
I0422 08:17:25.251236 83018 loader.go:375] Config loaded from file: bmctl-workspace/ci-6ecffb9bde2ccf8/ci-6ecffb9bde2ccf8-kubeconfig
I0422 08:17:25.260059 83018 round_trippers.go:421] GET https://10.200.0.5:443/apis/apiextensions.k8s.io/v1/customresourcedefinitions/volumesnapshots.snapshot.storage.k8s.io
I0422 08:17:25.260089 83018 round_trippers.go:428] Request Headers:
I0422 08:17:25.260096 83018 round_trippers.go:432] Accept: application/json
I0422 08:17:25.260101 83018 round_trippers.go:432] User-Agent: kubectl/v1.19.7 (linux/amd64) kubernetes/1dd5338
I0422 08:17:25.273409 83018 round_trippers.go:447] Response Status: 200 OK in 13 milliseconds
I0422 08:17:25.274596 83018 table_printer.go:45] Unable to decode server response into a Table. Falling back to hardcoded types: attempt to decode non-Table object
NAME VERSIONS CONDITIONS
volumesnapshots.snapshot.storage.k8s.io v1beta1 KubernetesAPIApprovalPolicyConformant,NamesAccepted,Established
Thanks for the comments! This was a really useful exercise to learn more about how the API server works.
Can you check if the specific error indicates CRDs are not deployed?
I'm not sure it's important to check that specific error. If we're not able to access the API server due to network issues, that's also a problem (although it may be temporary).
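For what it's worth, a minimal sketch of what such an error check could look like, assuming the snapshot clientset; the function name and the use of apierrors are my own illustration, not code from this PR. client-go should surface the 404 from an unserved endpoint as a NotFound error:

// Hypothetical helper, not part of the PR: report whether the v1 endpoint
// is served, separating "CRD missing" (404/NotFound) from other failures.
// Assumed imports: context, apierrors "k8s.io/apimachinery/pkg/api/errors",
// metav1 "k8s.io/apimachinery/pkg/apis/meta/v1", and the snapshot clientset.
func v1EndpointServed(ctx context.Context, client clientset.Interface) (bool, error) {
	_, err := client.SnapshotV1().VolumeSnapshotClasses().List(ctx, metav1.ListOptions{})
	if err == nil {
		return true, nil
	}
	if apierrors.IsNotFound(err) {
		// 404: the v1 API endpoint (and therefore the CRD) isn't served yet.
		return false, nil
	}
	// Anything else (network, RBAC, timeout) is a different failure; surface it.
	return false, err
}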
As @msau42 mentioned, this change is doable; however, we'd need to add new rules to the RBAC, e.g.
- apiGroups:
- apiextensions.k8s.io
resources:
- customresourcedefinitions
verbs:
- get
I didn't try this in a cluster yet but this would be an approach:
// Assumed imports for this sketch (paths are my best guess, not from the PR):
// import (
// 	"context"
// 	"time"
//
// 	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
// 	apiextensionsclientset "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
// 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
// 	"k8s.io/apimachinery/pkg/util/wait"
// 	"k8s.io/klog/v2"
//
// 	snapshotv1 "github.com/kubernetes-csi/external-snapshotter/client/v4/apis/volumesnapshot/v1"
// )

// Checks that the VolumeSnapshot v1 CRDs exist and that they have the Established condition.
func ensureCRDsExist(client apiextensionsclientset.Interface) error {
	crdHasV1Version := func(crd *apiextensionsv1.CustomResourceDefinition) bool {
		for _, version := range crd.Spec.Versions {
			if version.Name == snapshotv1.SchemeGroupVersion.Version {
				return true
			}
		}
		return false
	}
	crdHasEstablishedCondition := func(crd *apiextensionsv1.CustomResourceDefinition) bool {
		for _, condition := range crd.Status.Conditions {
			if condition.Type == apiextensionsv1.Established {
				return true
			}
		}
		return false
	}
	condition := func() (bool, error) {
		groupResources := []string{
			snapshotv1.Resource("volumesnapshots").String(),
			snapshotv1.Resource("volumesnapshotclasses").String(),
			snapshotv1.Resource("volumesnapshotcontents").String(),
		}
		for _, groupResource := range groupResources {
			crd, err := client.ApiextensionsV1().CustomResourceDefinitions().Get(context.TODO(), groupResource, metav1.GetOptions{})
			if err != nil {
				klog.Errorf("Failed to get CRD %s with error: %+v", groupResource, err)
				return false, nil
			}
			// Check that the CRD has the v1 version and that it has the Established condition.
			// The Established condition is mentioned in https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#create-a-customresourcedefinition
			if !crdHasV1Version(crd) {
				klog.Errorf("CRD %s doesn't declare a v1 version", groupResource)
				return false, nil
			}
			if !crdHasEstablishedCondition(crd) {
				klog.Errorf("CRD %s doesn't have an Established condition", groupResource)
				return false, nil
			}
		}
		return true, nil
	}
	// with a Factor of 1.5 we wait up to ~7.5 seconds of backoff before the 10th attempt
	backoff := wait.Backoff{
		Duration: 100 * time.Millisecond,
		Factor:   1.5,
		Steps:    10,
	}
	if err := wait.ExponentialBackoff(backoff, condition); err != nil {
		return err
	}
	return nil
}
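For completeness, a hypothetical call site for the sketch above; config stands for the *rest.Config the controller already builds for its other clients, and the error handling is illustrative:

// Hypothetical wiring in main(), not part of the PR diff.
apiExtClient, err := apiextensionsclientset.NewForConfig(config)
if err != nil {
	klog.Fatalf("Error building apiextensions clientset: %v", err)
}
if err := ensureCRDsExist(apiExtClient); err != nil {
	klog.Fatalf("Failed to ensure that the v1 CRDs exist: %v", err)
}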
I believe that just testing that the endpoint exists with the snapshot client might give us a good indication that the CRD has the Established condition.
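A minimal sketch of that endpoint-based check, assuming the snapshot clientset used elsewhere in this thread; listing only the cluster-scoped resources sidesteps the namespace question. This is my illustration, not the PR's final code:

condition := func() (bool, error) {
	// Both resources are cluster-scoped, so no namespace is needed.
	if _, err := client.SnapshotV1().VolumeSnapshotClasses().List(context.TODO(), metav1.ListOptions{}); err != nil {
		klog.Errorf("Failed to list v1 volumesnapshotclasses: %v", err)
		return false, nil // not served yet (or transient error); retry
	}
	if _, err := client.SnapshotV1().VolumeSnapshotContents().List(context.TODO(), metav1.ListOptions{}); err != nil {
		klog.Errorf("Failed to list v1 volumesnapshotcontents: %v", err)
		return false, nil
	}
	return true, nil
}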
VolumeSnapshots needs a namespace; I wasn't sure whether this should be kube-system (although I'm not sure it exists in every cluster) or whether I should just check the VolumeSnapshotClasses / VolumeSnapshotContents.
If we deploy the sidecar in another namespace, can it list the resources under kube-system?
Thanks for pointing this out, I created #514 to fix this.
/ok-to-test
/assign @xing-yang
Force-pushed from a0a6186 to cbabf04
cmd/snapshot-controller/main.go (Outdated)
}
pollPeriod := 100 * time.Millisecond
if err := wait.PollImmediateUntil(pollPeriod, condition, timeoutCh); err != nil {
In general, we prefer retrying with exponential backoff so that we don't potentially overload the API server with too many requests in a short time.
This is good to know. I was thinking about timing out in less than 10s, e.g.
// with a Factor of 1.5 we wait up to 7.5 seconds (the 10th attempt)
backoff := wait.Backoff{
	Duration: 100 * time.Millisecond,
	Factor:   1.5,
	Steps:    10,
}
I've deployed the new version, and without the CRDs it crashes after around 7.5 seconds. Logs:
Status:
After I deployed the CRDs, I had to delete the v4 pod to make it work (after it went into CrashLoopBackOff).
Force-pushed from 1e5b3a9 to 810f2aa
Just found: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy. The restart's exponential backoff is capped at 5m, so hopefully 5m after the last crash it'd start again.
cmd/snapshot-controller/main.go (Outdated)
func ensureCustomResourceDefinitionsExist(client *clientset.Clientset) error {
	condition := func() (bool, error) {
		var err error
		_, err = client.SnapshotV1().VolumeSnapshots("kube-system").List(context.TODO(), metav1.ListOptions{})
Are we sure "kube-system" namespace exist in all Kubernetes distributions?
Kubernetes starts with four initial namespaces:
- default The default namespace for objects with no other namespace
- kube-system The namespace for objects created by the Kubernetes system
- kube-public This namespace is created automatically and is readable by all users (including those not authenticated). This namespace is mostly reserved for cluster usage, in case that some resources should be visible and readable publicly throughout the whole cluster. The public aspect of this namespace is only a convention, not a requirement.
- kube-node-lease This namespace is for the lease objects associated with each node, which improve the performance of the node heartbeats as the cluster scales.
I'm not sure if it can be deleted later though
Can you add a check to see if "kube-system" exists before checking if there are snapshot resources there?
Done; if kube-system doesn't exist we skip the VolumeSnapshots list request.
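A rough sketch of that guard inside the wait condition, assuming a core Kubernetes clientset (kubeClient) alongside the snapshot clientset; the names and exact placement are mine, and the actual PR code may differ:

// kubeClient: k8s.io/client-go/kubernetes; apierrors: k8s.io/apimachinery/pkg/api/errors.
_, err := kubeClient.CoreV1().Namespaces().Get(context.TODO(), "kube-system", metav1.GetOptions{})
switch {
case apierrors.IsNotFound(err):
	// kube-system doesn't exist in this cluster: skip the namespaced
	// VolumeSnapshots check and rely on the cluster-scoped resources only.
case err != nil:
	klog.Errorf("Failed to get the kube-system namespace: %v", err)
	return false, nil // possibly transient; retry
default:
	if _, err := client.SnapshotV1().VolumeSnapshots("kube-system").List(context.TODO(), metav1.ListOptions{}); err != nil {
		klog.Errorf("Failed to list v1 volumesnapshots in kube-system: %v", err)
		return false, nil
	}
}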
Force-pushed from 810f2aa to a628278
Can you verify that after 5 minutes it came back?
Sure, same scenario but I waited until the deployment got to this state:
I applied the CRDs and then I saw:
Something I noticed this time was the downtime between terminating the last v3 controller and one of the v4 controllers making a request to become the leader; it was around 5 seconds.
@msau42 Is this question about the v3 controller coming back? AFAIK we'd need to do
You mentioned earlier that you had to restart the v4 pod after installing the CRDs because it wasn't being restarted after failing for 5 minutes. I wanted to check whether this was because of the issue Cheng mentioned about the backoff period becoming 5 minutes after that point. I would expect that after an additional 5 minutes, the v4 pod would get restarted.
Yeah, in the first experiment I waited around 5 minutes before I manually restarted the pod; I think I should've waited just a little bit longer to see the pod restart. In the last experiment I just waited and it was restarted.
That makes sense: Line 42 in 679e71e
IMO this is pretty good; if the retry period is 5s then the amount of downtime should often be less than 5s.
LGTM from my perspective!
Can you also update the release note to say something like "Add check for v1 CRDs to allow for rolling update of the snapshot-controller"?
/lgtm
Can you squash your commits?
Force-pushed from 6d44eb3 to 660420b
@xing-yang I've squashed the commits
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: mauriciopoppe, xing-yang. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
	Factor: 1.5,
	Steps:  10,
}
if err := wait.ExponentialBackoff(backoff, condition); err != nil {
I don't understand, why do we need to wait?
Wait until the CRDs are installed.
If it crashes immediately without retrying, will it check again when the container restarts?
Yes, however we'd like to have a faster startup time. If the container exits quickly and goes into CrashLoopBackOff, it'll be restarted every 10s, 20s, 40s and so on; during the first run, instead of failing quickly, it's possible that the CRDs become ready after a few seconds, and in one of these 10 attempts the wait condition might succeed.
@mauriciopoppe copy that, thanks.
What type of PR is this?
/kind bug
What this PR does / why we need it:
In the upgrade from v1beta1 to the v1 version of VolumeSnapshots with a deployment rollout of the controller, it's possible to get into a state where the v1 snapshot-controller is up but the CRDs for v1 aren't applied yet (see https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/).
To test it I had 2 replicas of the v3 controller in a cluster and I did kubectl apply -f with the following. Note maxSurge: 0 and maxUnavailable: 1; also note that I didn't deploy the v1 CRDs. The order of events was:
Because the v4 controllers can't read the v1 CRDs, we believe that the container should fail to start. I also noted that a controller will use the informers and fetch info about the CRDs only after it becomes the leader.
With this change we ensure that there's a startup period where the container cannot continue until it sees the CRDs. The order of events for the same scenario is: the v4 controllers go into CrashLoopBackOff while the v3 controller is still the leader. Logs:
The controller gets stuck in this state until I deploy the v1 CRDs; the order of events after I deploy the v1 CRDs is:
Does this PR introduce a user-facing change?:
/assign @msau42 @verult