leader election occasionally fails to reconnect to api server #66

Closed
@msau42

Description

The exact root cause is still uncertain, but when the apiserver is having problems, the CSI sidecars fail to acquire the leader election lease with this error:

"error retrieving resource lock kube-system/external-attacher-leader-my-driver: Get https://localhost:443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/external-attacher-leader-my-driver: write tcp [::1]:53540->[::1]:443: write: broken pipe"

Even after the apiserver comes back up, the error persists and the sidecar never recovers. This is apparently intended behavior; the fix is to enable the leader election watchdog, a health check that fails once the lease has gone unrenewed for too long, so that kubelet can restart the container: https://github.com/kubernetes/client-go/blob/master/tools/leaderelection/healthzadaptor.go#L25

In-tree controllers like kube-controller-manager already set this.
