We saw the following in the logs when the cluster could not recover on its own, or even start cleanly, after all etcd pods were killed:
level=warning msg="all etcd pods are dead." cluster-name=etcd-cluster cluster-namespace=default pkg=cluster
This situation is not recovered by etcd-operator:
https://github.com/coreos/etcd-operator/blob/8347d27afa18b6c76d4a8bb85ad56a2e60927018/pkg/cluster/cluster.go#L248-L252
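For context, the behaviour at the linked lines amounts to roughly the following (a paraphrased, self-contained sketch with hypothetical names, not the actual operator source; see the permalink above for the real code):

```go
package main

import (
	"errors"
	"fmt"
)

// Rough illustration (hypothetical types, not etcd-operator code) of why the
// cluster never comes back: once zero member pods are running, reconciliation
// gives up instead of re-seeding a member, so the cluster stays dead until
// someone intervenes manually.
type cluster struct {
	runningPods int
	phase       string
}

var errLostQuorum = errors.New("all etcd pods are dead")

func (c *cluster) reconcile() error {
	if c.runningPods == 0 {
		c.phase = "Failed" // the operator stops managing the cluster here
		return errLostQuorum
	}
	// ... normal reconciliation would continue here ...
	return nil
}

func main() {
	c := &cluster{runningPods: 0}
	if err := c.reconcile(); err != nil {
		fmt.Printf("phase=%s err=%v\n", c.phase, err)
	}
}
```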
Researching further, it looks like there are quite a few cases where etcd-operator cannot recover on its own:
- Fail the cluster when all etcd pods are dead and there is no way to recover. coreos/etcd-operator#1973
- How can the operator recover from self-hosted cluster disasters? coreos/etcd-operator#1559
- etcd-operator does not recover an etcd cluster if it loses quorum coreos/etcd-operator#1972
- EtcdCluster condition is incorrect coreos/etcd-operator#2044
Since this backend is only needed for short-lived coordination locks, should we consider switching to Redis, or even a single-instance etcd as it was before (#52)? A sketch of a Redis-based lock follows below.
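If Redis were used for these locks, a minimal sketch (Go with go-redis, hypothetical key and owner names, not existing code in this repo) could look like this; `SET NX` with a TTL gives a short-lived lock that self-expires if the holder dies:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// acquireLock takes a short-lived lock by setting the key only if it does not
// already exist (SET NX) with a TTL, so a crashed holder cannot block others forever.
func acquireLock(ctx context.Context, rdb *redis.Client, key, owner string, ttl time.Duration) (bool, error) {
	return rdb.SetNX(ctx, key, owner, ttl).Result()
}

// releaseLock deletes the key only if we still own it. For brevity this is a
// read-then-delete; a production version would do it atomically in a Lua script.
func releaseLock(ctx context.Context, rdb *redis.Client, key, owner string) error {
	val, err := rdb.Get(ctx, key).Result()
	if err == redis.Nil || (err == nil && val != owner) {
		return nil // lock already expired or was taken over by someone else
	}
	if err != nil {
		return err
	}
	return rdb.Del(ctx, key).Err()
}

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	ok, err := acquireLock(ctx, rdb, "coordination-lock", "worker-1", 30*time.Second)
	if err != nil || !ok {
		fmt.Println("lock is held elsewhere or Redis is unavailable:", err)
		return
	}
	defer releaseLock(ctx, rdb, "coordination-lock", "worker-1")

	fmt.Println("lock acquired; doing short-lived coordinated work")
}
```

Because the locks are short-lived and expendable, losing them on a Redis restart would be acceptable, which avoids the quorum-recovery problems described above.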