SRE
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
Here are 32 public repositories matching this topic...
Open source AI terminal and SSH Client for EC2, Database and Kubernetes.
Updated Sep 29, 2025 - TypeScript
New Relic One quickstarts help accelerate your New Relic journey by providing immediate value for your specific use cases.
Updated Oct 4, 2025 - TypeScript
Create custom DevOps AI agents that understand and manage your infrastructure.
Updated Feb 27, 2025 - TypeScript
Collection of AWS Fault Injection Simulator (FIS) experiment templates deployable via the AWS CDK.
Updated Nov 9, 2022 - TypeScript
Configuration as code for the masses
Updated Nov 20, 2021 - TypeScript
Everything you need to build, deploy, and collaborate with agents. Ride the llama, avoid the drama.
Updated Oct 3, 2025 - TypeScript
A Prometheus exporter exposing metrics for the official MongoDB Node.js driver.
Updated Sep 26, 2025 - TypeScript
InfraGPT is an AI SRE Copilot for the Cloud that provides infrastructure management agents through Slack integration. The system consists of multiple services that work together to deliver intelligent DevOps workflows.
Updated Sep 12, 2025 - TypeScript
A Prometheus exporter for node-postgres.
Updated Sep 26, 2025 - TypeScript
SRE Agent for VS Code
Updated Jan 29, 2025 - TypeScript
A Prometheus exporter exposing metrics for KafkaJS.
Updated Sep 26, 2025 - TypeScript
Tool for coordinating on-call, incident, and maintenance management.
Updated Dec 16, 2021 - TypeScript
Rubixkube AI - Site Reliability Intelligence platform with AI agents that detect, diagnose, and heal infrastructure issues automatically. Built with Next.js 15, featuring autonomous incident response, real-time monitoring, and human-in-the-loop guardrails for Kubernetes and cloud environments.
Updated Sep 27, 2025 - TypeScript
- Followers: 136
- Website: github.com/topics/sre
- Wikipedia