Commit 478069d

docs: add prometheus + grafana deployment guide (#1019)
* docs: add prometheus + grafana deployment guide
* fix: rbac resources
* refactor: change prometheus config file structure
* fix: CR Suggestions
1 parent 3521803 commit 478069d

3 files changed: +107 -0 lines changed

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: inference-gateway-sa-metrics-reader
  namespace: monitoring
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
serviceAccounts:
  server:
    create: false
    name: inference-gateway-sa-metrics-reader

extraScrapeConfigs: |
  - job_name: 'inference-extension-epp'
    authorization:
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    scrape_interval: 5s
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_name]
        action: keep
        regex: .*-epp$
      - source_labels: [__meta_kubernetes_pod_container_port_number]
        action: keep
        regex: "9090"
  - job_name: vllm
    scrape_interval: 5s
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: vllm-llama3-8b-instruct
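
The `keep` rules above decide which discovered targets survive relabeling. As a rough illustration only (not part of the commit), the first rule of the `inference-extension-epp` job can be emulated in plain shell. Prometheus anchors relabel regexes at both ends, so the trailing `$` in `.*-epp$` is redundant and is dropped inside the anchored pattern below. The helper name `keeps_service` and the sample service names are hypothetical.

```shell
#!/bin/sh
# Emulates the `action: keep` rule on __meta_kubernetes_service_name.
# Prometheus fully anchors relabel regexes, hence the ^( ... )$ wrapper.
# keeps_service is a hypothetical helper, not a Prometheus tool.
keeps_service() {
  printf '%s\n' "$1" | grep -Eq '^(.*-epp)$'
}

# Sample service names (made up for illustration).
for svc in vllm-llama3-8b-instruct-epp epp-gateway vllm-llama3-8b-instruct; do
  if keeps_service "$svc"; then
    echo "$svc: keep"
  else
    echo "$svc: drop"
  fi
done
```

Only names that end in `-epp` are kept; a name that merely contains `epp` elsewhere is dropped because the pattern must match the whole label value.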

site-src/guides/metrics-and-observability.md

Lines changed: 75 additions & 0 deletions
@@ -126,6 +126,81 @@ PROFILE_NAME=heap
curl -H "Authorization: Bearer $TOKEN" localhost:9090/debug/pprof/$PROFILE_NAME -o profile.out
go tool pprof -png profile.out
```
## Setting Up Grafana + Prometheus

### Grafana

A basic Grafana deployment can be installed with the following commands:

```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana --namespace monitoring --create-namespace
```

Forward the Grafana port to your local machine. The command keeps running, so use a separate shell for the following steps:

```bash
kubectl -n monitoring port-forward deploy/grafana 3000:3000
```

Get the generated password for the `admin` user:

```bash
kubectl -n monitoring get secret grafana \
  -o go-template='{% raw %}{{ index .data "admin-password" | base64decode }}{% endraw %}'
```
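
The go-template above reads the `admin-password` key from the Secret and base64-decodes it. A roughly equivalent sketch using `jsonpath` and the `base64` utility is shown below. The function name `grafana_admin_password` is hypothetical, and the sketch assumes the Secret is named `grafana` in the `monitoring` namespace, as in the command above.

```shell
#!/bin/sh
# Hypothetical helper: prints the decoded Grafana admin password.
# Assumes the Secret is named "grafana" in the "monitoring" namespace.
grafana_admin_password() {
  kubectl -n monitoring get secret grafana \
    -o jsonpath='{.data.admin-password}' | base64 --decode
}
```

Calling `grafana_admin_password` prints the decoded password for the `admin` user.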

You can now access the Grafana UI at [http://127.0.0.1:3000](http://127.0.0.1:3000)

### Prometheus

Two types of Prometheus deployment are currently documented:

1. Self-hosted, using the Prometheus Helm chart
2. Google Managed Prometheus

=== "Self-Hosted"

Create the necessary ServiceAccount and RBAC resources:

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/observability/prometheus/rbac.yaml
```

Patch the metrics reader ClusterRoleBinding to reference the new ServiceAccount:

```bash
kubectl patch clusterrolebinding inference-gateway-sa-metrics-reader-role-binding \
  --type='json' \
  -p='[{"op": "replace", "path": "/subjects/0/namespace", "value": "monitoring"}]'
```
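
The JSON Patch above replaces the `namespace` of the first entry in the binding's `subjects` list, so that the binding points at the ServiceAccount in `monitoring`. One way to confirm the result, sketched under the assumption that the ServiceAccount is the first subject, is to read that field back. `binding_subject_namespace` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints the namespace of the first subject of the
# metrics-reader ClusterRoleBinding patched above.
binding_subject_namespace() {
  kubectl get clusterrolebinding \
    inference-gateway-sa-metrics-reader-role-binding \
    -o jsonpath='{.subjects[0].namespace}'
}
```

After a successful patch, `binding_subject_namespace` should print `monitoring`.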

Add the prometheus-community Helm repository:

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
```

Deploy the Prometheus Helm chart using this command:

```bash
helm install prometheus prometheus-community/prometheus \
  --namespace monitoring \
  -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/observability/prometheus/values.yaml
```

Add Prometheus as a Grafana data source by following [this guide](https://grafana.com/docs/grafana/latest/administration/data-source-management/).
By default, the Prometheus server host is `http://prometheus-server`.

Note that the provided values file is intentionally minimal and works as-is after following the [Getting Started Guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/); you may need to modify it for other setups.
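
To check that the two scrape jobs from the values file are producing data, the Prometheus HTTP API can be queried for the built-in `up` metric. This is a sketch rather than part of the guide. It assumes the chart exposes the server as the `prometheus-server` Service on port 80 and that it has been forwarded to `localhost:9090` (for example with `kubectl -n monitoring port-forward svc/prometheus-server 9090:80`). `PROM_URL` and `job_up` are hypothetical names.

```shell
#!/bin/sh
# Base URL of the Prometheus server; assumes a local port-forward.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: asks Prometheus for up{job="<name>"} via the
# instant-query endpoint /api/v1/query and prints the raw JSON response.
job_up() {
  curl -sG "${PROM_URL}/api/v1/query" \
    --data-urlencode "query=up{job=\"$1\"}"
}
```

For example, `job_up inference-extension-epp` and `job_up vllm` query the two jobs defined in the values file. A sample value of `1` in the response means the target was scraped successfully.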

=== "Google Managed"

If you run the inference gateway with [Google Managed Prometheus](https://cloud.google.com/stackdriver/docs/managed-prometheus), please follow the [instructions](https://cloud.google.com/stackdriver/docs/managed-prometheus/query)
to configure Google Managed Prometheus as a data source for the Grafana dashboard.

## Load Inference Extension dashboard into Grafana

Please follow the [Grafana instructions](https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/) to load the dashboard JSON.
The dashboard can be found here: [Grafana Dashboard](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/tools/dashboards/inference_gateway.json)
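
Grafana's import dialog accepts an uploaded JSON file, so it can help to fetch the dashboard first. The sketch below rewrites the GitHub `blob` URL above into its `raw` form (the same `/raw/main/` style used for the manifests earlier in this guide) and wraps the download in a function. `raw_url`, `download_dashboard`, and the default output file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turns a GitHub "blob" page URL into a "raw" file URL.
raw_url() {
  printf '%s\n' "$1" | sed 's#/blob/#/raw/#'
}

DASHBOARD_PAGE="https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/tools/dashboards/inference_gateway.json"

# Hypothetical helper: downloads the dashboard JSON to the given file
# (default: inference_gateway.json), following redirects.
download_dashboard() {
  curl -sL "$(raw_url "$DASHBOARD_PAGE")" -o "${1:-inference_gateway.json}"
}
```

Running `download_dashboard` saves the file locally, ready to be uploaded through the Grafana import dialog.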

## Prometheus Alerts
