Keeping systems reliable, resilient, and observable at scale.
Cloud & Infra
AWS
GCP
Terraform
OpenShift
Kubernetes
Docker
CI/CD & Automation
GitHub Actions
Jenkins
Ansible
Observability
Prometheus
Grafana
Dynatrace
OpenTelemetry
Platforms
Linux
Debian/Ubuntu
RHEL
Bash
Python
Go
- Design and maintain highly available, fault-tolerant systems
- Create scalable CI/CD pipelines that ship code faster and more safely
- Build infrastructure-as-code to ensure consistency and repeatability
- Implement monitoring, logging, and alerting to sleep better at night
- Guide engineering teams in reliability best practices and incident response
"Hope is not a strategy. Automate everything, observe everything, break before it breaks."
- Observability > Monitoring
- Immutable > Mutable
- Chaos > Complacency
- Postmortems > Blame
Project | Description | Stack |
---|---|---|
OpenShift Upgrade CUJ | | |
Alert Architect | Dynamic alert tuning system to reduce noise and burnout | Prometheus, Dynatrace, OpenTelemetry |
- LinkedIn