'stable/etcd-operator' is not really stable for [coordination] #94

Closed
@arm4b

Description

I saw the following in the logs when the cluster couldn't recover on its own, or even start cleanly, after all etcd pods were killed:

level=warning msg="all etcd pods are dead." cluster-name=etcd-cluster cluster-namespace=default pkg=cluster

etcd-operator does not recover from this situation:
https://github.com/coreos/etcd-operator/blob/8347d27afa18b6c76d4a8bb85ad56a2e60927018/pkg/cluster/cluster.go#L248-L252

Researching further, it looks like there are quite a few cases where etcd-operator can't recover on its own.


Because this backend is needed only for short-lived coordination locks, could we consider switching to Redis, or even back to a single-instance etcd as it was before (#52)?
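
To illustrate how little a backend has to provide for short-lived coordination locks, here is a minimal in-memory sketch. `TTLLockStore` and its methods are hypothetical names made up for this example; this is not the API of etcd, Redis, or any coordination library, only the lock-with-expiry semantics that such a backend would need to offer.

```python
import time
from typing import Callable, Dict, Tuple


class TTLLockStore:
    """Hypothetical in-memory stand-in for a lock backend (not a real API)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # lock name -> (owner, expiry timestamp)
        self._locks: Dict[str, Tuple[str, float]] = {}

    def acquire(self, name: str, owner: str, ttl: float) -> bool:
        """Take the lock if it is free, expired, or already held by owner."""
        now = self._clock()
        held = self._locks.get(name)
        if held is not None and held[0] != owner and held[1] > now:
            return False  # another owner holds a lock that has not expired
        self._locks[name] = (owner, now + ttl)
        return True

    def release(self, name: str, owner: str) -> bool:
        """Drop the lock only if this owner is the recorded holder."""
        held = self._locks.get(name)
        if held is None or held[0] != owner:
            return False
        del self._locks[name]
        return True
```

Every lock in this sketch expires on its own after `ttl` seconds, so the store holds no long-lived state. That is the kind of workload for which a single-instance backend may be enough.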

Labels: bug, good first issue, help wanted
