# runbooks

Here are 20 public repositories matching this topic...

A production-style SRE learning project demonstrating Kubernetes reliability patterns, failure handling, and observability using FastAPI, PostgreSQL, Prometheus, and Grafana. Built to understand monitoring, alerting, and recovery in cloud-native systems through intentional chaos experiments.

  • Updated Feb 7, 2026
  • Python
