Provide Runbook PrometheusRules #1556

@eumel8

Is your feature request related to a problem? Please describe.

The Kube Logging Operator already includes many well-documented PrometheusRules, each with a readable alert name, a short summary, and a description. In daily work, this makes it easy to spot an error and start working on the root cause.
For less experienced staff, such as the on-call team or first-level support, it can be overwhelming: without deeper knowledge of the architecture, it is unclear what an alert relates to or what to do to investigate or resolve the issue.

Describe the solution you'd like

The typical approach for operations teams is to use the runbook feature of PrometheusRule annotations. The best reference is the runbook of the Prometheus project itself.
Basically, the PrometheusRule contains a link to a web service with additional instructions for the related alert. This is easy to manage, and everybody can contribute to the documentation and improve the working steps.
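
To illustrate the idea, here is a minimal sketch of a PrometheusRule carrying such a link. The rule name, namespace, alert name, job label, and runbook URL are all placeholders invented for this example and are not taken from the operator's existing rules. The `runbook_url` annotation key is the convention commonly used for this purpose, not something Prometheus itself enforces.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-logging-rules        # hypothetical rule name
  namespace: logging                 # hypothetical namespace
spec:
  groups:
    - name: example-logging
      rules:
        - alert: ExampleLoggingTargetDown          # hypothetical alert name
          expr: up{job="example-logging"} == 0     # hypothetical job label
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Example logging target is down.
            description: 'The scrape target {{ $labels.instance }} has been unreachable for 10 minutes.'
            # Link to the step-by-step instructions for this alert (placeholder URL).
            runbook_url: https://runbooks.example.com/logging/ExampleLoggingTargetDown
```

With this layout, the alert stays short, and the on-call team follows the link to find the investigation steps, which can be updated without touching the rule.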

I started a proposal here

Describe alternatives you've considered

Alternatively, you could put all this information into the PrometheusRule itself. But that is more static and needs more cluster resources.

Additional context

https://en.wikipedia.org/wiki/Runbook
