Find out why h's workers didn't reconnect automatically #7005

Open
@indigobravo

Description

Find out why h's workers didn't reconnect automatically, and fix it if we can. The system is supposed to be self-recovering, so that all is well even with a helpless engineer on call. In this case it did not self-recover: engineers had to notice that the workers were not reconnecting and redeploy h. (To be fair, multiple alarms were still telling us that the workers were not working, so we didn't have to spot it in the logs ourselves.)
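
For reference while investigating, here is a rough sketch of the self-recovering behaviour we would expect from a worker: on losing its connection it keeps retrying with exponential backoff instead of giving up. This is not h's actual worker code. `ConnectionLost`, `connect`, `consume` and `run_with_reconnect` are hypothetical names made up for illustration, and the real cause of the outage may well be something else entirely.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical error raised when the connection to the broker drops."""


def run_with_reconnect(connect, consume, max_delay=60.0,
                       sleep=time.sleep, max_attempts=None):
    """Keep a worker consuming, reconnecting with exponential backoff.

    `connect` and `consume` are hypothetical callables standing in for
    whatever the real worker uses; they are assumptions, not h's API.
    Returns the number of connection attempts made before `consume`
    finished without losing the connection.
    """
    delay = 1.0
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            conn = connect()
            delay = 1.0  # connected: reset the backoff
            consume(conn)  # returns normally only on a clean shutdown
            return attempts
        except ConnectionLost:
            # Wait, then retry with a doubled delay, capped at max_delay.
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise RuntimeError("gave up reconnecting")
```

With `max_attempts=None` (the default) the loop never gives up, which is the property the issue is asking about: no engineer should need to redeploy to get workers reconnected.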
