27 changes: 27 additions & 0 deletions cmd/epp/runner/runner.go
@@ -20,6 +20,7 @@ import (
"context"
"flag"
"fmt"
"net/http/pprof"

"github.com/go-logr/logr"
"github.com/prometheus/client_golang/prometheus"
@@ -215,6 +216,11 @@ func (r *Runner) Run(ctx context.Context) error {
setupLog.Error(err, "Failed to create controller manager")
return err
}
err = setupPprofHandlers(mgr)
if err != nil {
setupLog.Error(err, "Failed to setup pprof handlers")
return err
}

if len(*configText) != 0 || len(*configFile) != 0 {
theConfig, err := loader.LoadConfig([]byte(*configText), *configFile)
@@ -403,3 +409,24 @@ func verifyMetricMapping(mapping backendmetrics.MetricMapping, logger logr.Logge
logger.Info("Not scraping metric: LoraRequestInfo")
}
}

// setupPprofHandlers only implements the pre-defined profiles:
// https://cs.opensource.google/go/go/+/refs/tags/go1.24.4:src/runtime/pprof/pprof.go;l=108
func setupPprofHandlers(mgr ctrl.Manager) error {
var err error
profiles := []string{
"heap",
"goroutine",
"allocs",
"threadcreate",
"block",
"mutex",
}
for _, p := range profiles {
err = mgr.AddMetricsServerExtraHandler("/debug/pprof/"+p, pprof.Handler(p))
if err != nil {
return err
}
}
return nil
}
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -68,7 +68,7 @@ nav:
- Rollout:
- Adapter Rollout: guides/adapter-rollout.md
- InferencePool Rollout: guides/inferencepool-rollout.md
- Metrics: guides/metrics.md
- Metrics and Observability: guides/metrics-and-observability.md
- Configuration Guide:
- Prefix Cache Aware Plugin: guides/epp-configuration/prefix-aware.md
- Implementer's Guide: guides/implementers.md
@@ -1,6 +1,6 @@
# Metrics
# Metrics & Observability

This guide describes the current state of exposed metrics and how to scrape them.
This guide describes the current state of exposed metrics, how to scrape them, and how to access pprof profiles.

## Requirements

@@ -53,7 +53,7 @@ This guide describes the current state of exposed metrics and how to scrape them
|:---------------------------|:-----------------|:-------------------------------------------------|:------------------------------------------|:------------|
| lora_syncer_adapter_status | Gauge | Status of LoRA adapters (1=loaded, 0=not_loaded) | `adapter_name`=<adapter-id> | ALPHA |

## Scrape Metrics
## Scrape Metrics & Pprof profiles

The metrics endpoints are exposed on different ports by default:

@@ -73,6 +73,7 @@ metadata:
rules:
- nonResourceURLs:
- /metrics
- /debug/pprof/*
verbs:
- get
---
@@ -116,6 +117,16 @@ kubectl -n default port-forward inference-gateway-ext-proc-pod-name 9090
curl -H "Authorization: Bearer $TOKEN" localhost:9090/metrics
```

### Pprof profiles

Currently only the [predefined profiles](https://pkg.go.dev/runtime/pprof#Profile) are supported; CPU profiling will require code changes. Assuming the EPP has been port-forwarded as in the example above, run the following to render the `heap` profile as a PNG:

```
PROFILE_NAME=heap
curl -H "Authorization: Bearer $TOKEN" localhost:9090/debug/pprof/$PROFILE_NAME -o profile.out
go tool pprof -png profile.out
```

## Prometheus Alerts

This section describes how to configure Prometheus alerts using the collected metrics.