epochcloud-test

Test repository for EpochCloud Kubernetes cluster CI/CD pipeline testing.

Quick Links

🌐 Live Sites	📦 Repos
🧪 Test (Prod)	☁️ EpochCloud Infra
🔬 Staging
🧑‍💻 Dev

Purpose

This is a proof-of-concept app demonstrating the complete EpochCloud deployment flow and observability stack integration.

App repos should stay minimal: just source code and a Dockerfile. Everything else (deployment manifests, CI pipelines, monitoring) lives in the infra repo.

What's in this repo (app concerns)

epochcloud-test/
├── Dockerfile              # How to build the app
├── main.go, go.mod         # Source code with OTEL + slog
├── VERSION                 # App version
└── README.md               # This file

What's in the infra repo (platform concerns)

epochcloud/
├── kubernetes/apps/epochcloud-test/    # Deployment manifests + PrometheusRule
└── kubernetes/infrastructure/          # CI pipelines (Argo Workflows)

Complete Deployment Flow

1. DEVELOPER PUSHES CODE
   └── Push to EpochBoy/epochcloud-test main branch

2. ARGO WORKFLOWS CI (webhook triggered)
   └── GitHub App EventSource triggers app-baseline pipeline:
       ├── Pre-build: Semgrep SAST, TruffleHog secrets, OSV-Scanner SCA
       ├── Build: Buildah container build + push to Harbor
       └── Post-build: Trivy scan, Grype CVE, Syft SBOM, Cosign signing

3. IMAGE PUSHED TO HARBOR
   └── registry.<your-domain>/epochcloud/epochcloud-test:<sha>

4. KARGO PROMOTES THROUGH ENVIRONMENTS
   Each promotion triggers an Argo Rollout with canary analysis:

   DEV (auto-promote)
   ├── Rollout: 10% → analysis → 25% → 50% → analysis → 75% → 100%
   ├── Prometheus checks error rate, latency, success rate
   └── Auto-rollback if analysis fails
       ↓
   STAGING (auto-promote)
   ├── Same canary rollout with Prometheus analysis
   └── OWASP ZAP DAST scan as Kargo verification gate
       ↓
   PRODUCTION (manual promote via Kargo UI)
   ├── Same canary rollout with Prometheus analysis
   └── Traffic split via Traefik weighted TraefikService
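
The canary schedule above can be read as a small state machine: set a traffic weight, pause at each analysis gate, and roll back if a gate fails. The following is a minimal Go sketch of that control flow only. It is not Argo Rollouts configuration or API; the `step` type, the `schedule` variable, and `runCanary` are hypothetical names introduced for illustration, and the `analysis` callback stands in for the Prometheus checks.

```go
package main

import "fmt"

// step is one stage of the canary schedule: either a traffic weight to set
// or an analysis gate to pass. Hypothetical type, for illustration only.
type step struct {
	weight  int  // percentage of traffic sent to the canary
	analyze bool // true when this step is an analysis gate
}

// schedule encodes: 10% → analysis → 25% → 50% → analysis → 75% → 100%.
var schedule = []step{
	{weight: 10}, {analyze: true}, {weight: 25}, {weight: 50},
	{analyze: true}, {weight: 75}, {weight: 100},
}

// runCanary walks the schedule. analysis stands in for the Prometheus checks
// and receives the current canary weight. The result is the final canary
// weight: 100 after a full promotion, 0 after a rollback.
func runCanary(analysis func(weight int) bool) int {
	current := 0
	for _, s := range schedule {
		if s.analyze {
			if !analysis(current) {
				return 0 // rollback: all traffic returns to the stable version
			}
			continue
		}
		current = s.weight
	}
	return current
}

func main() {
	healthy := func(int) bool { return true }
	failsAtHalf := func(w int) bool { return w < 50 }
	fmt.Println("healthy release ends at weight:", runCanary(healthy))
	fmt.Println("failing release ends at weight:", runCanary(failsAtHalf))
}
```

In this sketch a failed gate at 50% stops the walk before 75% is ever applied, which is the behaviour the "Auto-rollback if analysis fails" line describes.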

Local Development

# Run locally
go run main.go

# Build container
docker build -t epochcloud-test .

# Test endpoints
curl http://localhost:8080/health
curl http://localhost:8080/version
curl http://localhost:8080/metrics

Endpoints

Endpoint	Description
`GET /`	Homepage with observability info
`GET /health`	Health check (for Kubernetes probes)
`GET /version`	Version info (commit, build time, environment)
`GET /metrics`	Prometheus metrics (scraped automatically)
`GET /chaos?action=X`	Chaos testing for AlertManager → ntfy
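
As a rough illustration of the two read-only endpoints, the sketch below serves `/health` and `/version` with the standard library. It is not the code in main.go: the `versionInfo` struct, its JSON field names, the `"ok"` health body, and the `newMux` and `probe` helpers are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// versionInfo mirrors the fields the Endpoints table lists for /version.
// The JSON field names are assumptions for illustration.
type versionInfo struct {
	Version     string `json:"version"`
	Commit      string `json:"commit"`
	BuildTime   string `json:"build_time"`
	Environment string `json:"environment"`
}

// newMux registers the two read-only endpoints on a fresh ServeMux.
func newMux(info versionInfo) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		io.WriteString(w, "ok")
	})
	mux.HandleFunc("/version", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(info)
	})
	return mux
}

// probe sends an in-memory GET request and returns the status code and body.
func probe(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newMux(versionInfo{Version: "1.2.3", Commit: "abc123", BuildTime: "unknown", Environment: "dev"})
	code, body := probe(mux, "/version")
	fmt.Println(code, body)
}
```

Using `httptest` keeps the example runnable without opening a port; a real server would pass the mux to `http.ListenAndServe` instead.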

Observability Stack Integration

This app demonstrates full observability integration with the EpochCloud platform:

📈 Prometheus Metrics

The /metrics endpoint exposes:

Metric	Type	Description
`epochcloud_http_requests_total`	Counter	Total HTTP requests by method, path, status
`epochcloud_http_request_duration_seconds`	Histogram	Request latency (buckets used to derive p50, p95, p99)
`epochcloud_app_info`	Gauge	App metadata (version, commit, environment)
`epochcloud_active_requests`	Gauge	Currently active requests
`epochcloud_errors_total`	Counter	Errors by type
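
To make the shape of a labelled counter concrete, here is a standard-library-only sketch that counts requests by method, path, and status and renders them in the Prometheus text exposition format. This README does not show how main.go registers its metrics, so this is a stand-in rather than the app's implementation; `requestCounter` and its methods are hypothetical names, and a real service would normally use a Prometheus client library instead.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// requestCounter is a hypothetical stand-in for a labelled counter such as
// epochcloud_http_requests_total. It is not a Prometheus client library type.
type requestCounter struct {
	mu     sync.Mutex
	counts map[[3]string]uint64 // key: method, path, status
}

func newRequestCounter() *requestCounter {
	return &requestCounter{counts: make(map[[3]string]uint64)}
}

// Inc adds one observation for the given label values.
func (c *requestCounter) Inc(method, path, status string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[[3]string{method, path, status}]++
}

// Render returns the counter in Prometheus text exposition format, with the
// series sorted so that the output is deterministic.
func (c *requestCounter) Render() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	lines := make([]string, 0, len(c.counts))
	for k, v := range c.counts {
		lines = append(lines, fmt.Sprintf(
			"epochcloud_http_requests_total{method=%q,path=%q,status=%q} %d",
			k[0], k[1], k[2], v))
	}
	sort.Strings(lines)
	header := "# TYPE epochcloud_http_requests_total counter\n"
	return header + strings.Join(lines, "\n") + "\n"
}

func main() {
	c := newRequestCounter()
	c.Inc("GET", "/health", "200")
	c.Inc("GET", "/health", "200")
	c.Inc("GET", "/chaos", "500")
	fmt.Print(c.Render())
}
```

Each distinct combination of label values becomes its own time series, which is why high-cardinality labels (for example raw URLs with IDs) are usually avoided.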

📋 Loki Structured Logging

Using Go's slog package for JSON structured logs:

{
  "time": "2025-01-05T12:00:00Z",
  "level": "INFO",
  "msg": "request completed",
  "service": "epochcloud-test",
  "version": "1.2.3",
  "environment": "prod",
  "hostname": "epochcloud-test-abc123",
  "method": "GET",
  "path": "/health",
  "status": 200,
  "duration_seconds": 0.001,
  "trace_id": "abc123def456"
}

Logs are collected by Grafana Alloy (DaemonSet) and shipped to Loki.
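
A log line with the fields shown above can be produced with `log/slog` (Go 1.21+) roughly as follows. This is a sketch, not the code in main.go: the helper names `newLogger`, `logRequest`, and `capture` are hypothetical, and the field values are taken from the sample record.

```go
package main

import (
	"bytes"
	"encoding/json"
	"io"
	"log/slog"
	"os"
)

// newLogger returns a JSON slog logger with the static fields from the
// sample log line attached to every record.
func newLogger(w io.Writer, version, environment, hostname string) *slog.Logger {
	return slog.New(slog.NewJSONHandler(w, nil)).With(
		"service", "epochcloud-test",
		"version", version,
		"environment", environment,
		"hostname", hostname,
	)
}

// logRequest emits one "request completed" record with per-request fields.
func logRequest(l *slog.Logger, method, path string, status int, seconds float64, traceID string) {
	l.Info("request completed",
		"method", method, "path", path, "status", status,
		"duration_seconds", seconds, "trace_id", traceID)
}

// capture logs one sample request into a buffer and decodes the JSON line,
// which makes the resulting field set easy to inspect.
func capture() map[string]any {
	var buf bytes.Buffer
	logRequest(newLogger(&buf, "1.2.3", "prod", "epochcloud-test-abc123"),
		"GET", "/health", 200, 0.001, "abc123def456")
	var rec map[string]any
	if err := json.Unmarshal(buf.Bytes(), &rec); err != nil {
		panic(err)
	}
	return rec
}

func main() {
	host, _ := os.Hostname()
	// Writing JSON to stdout is what lets a node-level collector pick it up.
	logRequest(newLogger(os.Stdout, "1.2.3", "dev", host), "GET", "/health", 200, 0.001, "abc123def456")
	if _, ok := capture()["trace_id"]; !ok {
		panic("trace_id missing from log record")
	}
}
```

Attaching the static fields once with `With` keeps each call site down to the per-request attributes.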

🔍 Tempo Distributed Tracing

OpenTelemetry instrumentation sends traces to Tempo via Grafana Alloy (OTLP receiver):

- All HTTP handlers create spans
- Trace IDs are logged for correlation (Loki → Tempo)
- Uses otelhttp middleware for automatic HTTP tracing
- Exemplars attach trace_id to histogram observations for metric→trace drilldown
- Flow: App (OTLP) → Alloy → Tempo → Grafana
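
Log-to-trace correlation depends on the same trace ID appearing in both the log record and the span. In the app this is handled by the OpenTelemetry SDK and otelhttp; the sketch below instead uses only the standard library to show where such an ID comes from on an incoming request, by parsing a W3C `traceparent` header. The function names are hypothetical and this is not how main.go obtains its trace IDs.

```go
package main

import (
	"fmt"
	"strings"
)

// traceIDFromTraceparent extracts the trace ID from a W3C traceparent header
// value of the form "version-traceid-parentid-flags". It returns false when
// the value does not look like a valid version 00 header.
func traceIDFromTraceparent(h string) (string, bool) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return "", false
	}
	id := parts[1]
	if len(id) != 32 || len(parts[2]) != 16 || len(parts[3]) != 2 {
		return "", false
	}
	// An all-zero trace ID is defined as invalid by the specification.
	if !isLowerHex(id) || id == strings.Repeat("0", 32) {
		return "", false
	}
	return id, true
}

// isLowerHex reports whether s consists only of lowercase hex digits.
func isLowerHex(s string) bool {
	for _, c := range s {
		if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')) {
			return false
		}
	}
	return true
}

func main() {
	header := "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
	if id, ok := traceIDFromTraceparent(header); ok {
		// This value is what would be logged as the trace_id field.
		fmt.Println("trace_id:", id)
	}
}
```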

🔔 AlertManager → ntfy Alerts

The PrometheusRule defines alerts that AlertManager delivers to ntfy via webhook:

Alert	Condition	Severity
`EpochCloudTestHighErrorRate`	>5% errors over 5m	warning
`EpochCloudTestHighLatency`	P99 > 500ms	warning
`EpochCloudTestDown`	No instances running	critical
`EpochCloudTestHighLoad`	>50 concurrent requests	info

🔥 Chaos Testing

Test the full alert pipeline with chaos endpoints:

# Trigger 500 errors - tests error rate alert
curl "https://test.<your-domain>/chaos?action=error"

# Add 2s latency - tests latency alert
curl "https://test.<your-domain>/chaos?action=slow"

# Simulate 50 concurrent requests - tests load alert
# (the URL is quoted so the shell does not treat & as a background operator)
curl "https://test.<your-domain>/chaos?action=load&count=50"

Alert Flow:

/chaos?action=error → epochcloud_errors_total ↑ → Prometheus scrapes →
EpochCloudTestHighErrorRate fires → AlertManager routes to ntfy webhook →
ntfy.epochcloud-warning topic → mobile notification
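
For orientation, a chaos endpoint of this kind might dispatch on the `action` query parameter roughly as sketched below. This is an assumption-laden illustration, not the handler in main.go: `chaosHandler` and `status` are hypothetical names, the 400 response for unknown actions is an assumption, and `action=load` is left out because the README does not describe how it generates concurrent requests.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// chaosHandler returns a handler for /chaos. The delay is a parameter (in
// milliseconds) so the sketch can be exercised quickly; the README states a
// 2s delay for action=slow.
func chaosHandler(slowMillis int) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Query().Get("action") {
		case "error":
			// A 500 response is what drives the error-rate alert.
			http.Error(w, "chaos: injected error", http.StatusInternalServerError)
		case "slow":
			// Added latency is what drives the latency alert.
			time.Sleep(time.Duration(slowMillis) * time.Millisecond)
			fmt.Fprintln(w, "chaos: injected latency")
		default:
			http.Error(w, "unknown action", http.StatusBadRequest)
		}
	}
}

// status performs an in-memory GET and returns the response status code.
func status(h http.Handler, target string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	h := chaosHandler(2000)
	fmt.Println("action=error status:", status(h, "/chaos?action=error"))
}
```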

Platform Integration

Component	How it integrates
PodMonitor	Auto-discovers pods with `app: epochcloud-test` label
Grafana Alloy	Collects JSON logs → Loki, receives OTLP traces → Tempo
PrometheusRule	Defines alerts → AlertManager → ntfy
Kargo + Argo Rollouts	Promotes images with canary analysis
ArgoCD	GitOps deployment from infra repo
Exemplars	Histogram metrics include trace_id for Grafana drilldown

Environment Variables

Variable	Description	Default
`PORT`	HTTP server port	`8080`
`ENVIRONMENT`	Environment name (dev/staging/prod)	`dev`
`OTEL_EXPORTER_OTLP_ENDPOINT`	Alloy OTLP receiver endpoint	`alloy.alloy.svc.cluster.local:4317`
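
The table above amounts to "read the variable, otherwise use the default". A minimal sketch of that pattern follows; the `config` struct and the `loadConfig` function are hypothetical names, and treating an empty value the same as an unset one is an assumption rather than documented behaviour.

```go
package main

import (
	"fmt"
	"os"
)

// config holds the three settings from the Environment Variables table.
type config struct {
	Port         string
	Environment  string
	OTLPEndpoint string
}

// loadConfig resolves each setting through lookup and falls back to the
// documented default when the variable is unset or empty. Passing the lookup
// function in (instead of calling os.LookupEnv directly) keeps it testable.
func loadConfig(lookup func(string) (string, bool)) config {
	get := func(key, fallback string) string {
		if v, ok := lookup(key); ok && v != "" {
			return v
		}
		return fallback
	}
	return config{
		Port:         get("PORT", "8080"),
		Environment:  get("ENVIRONMENT", "dev"),
		OTLPEndpoint: get("OTEL_EXPORTER_OTLP_ENDPOINT", "alloy.alloy.svc.cluster.local:4317"),
	}
}

func main() {
	cfg := loadConfig(os.LookupEnv)
	fmt.Printf("port=%s environment=%s otlp=%s\n", cfg.Port, cfg.Environment, cfg.OTLPEndpoint)
}
```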
