Commit 66150e5

open telemetry on kubernetes
1 parent a652cb6 commit 66150e5

7 files changed, +354 -0 lines changed

Lines changed: 225 additions & 0 deletions
@@ -0,0 +1,225 @@
# Introduction to OpenTelemetry Operator

OpenTelemetry Operator [documentation](https://opentelemetry.io/docs/platforms/kubernetes/operator/)

## We need a Kubernetes cluster

Let's create a Kubernetes cluster to play with using [kind](https://kind.sigs.k8s.io/docs/user/quick-start/):

```
kind create cluster --name otel --image kindest/node:v1.34.0
```

Test our cluster:

```
kubectl get nodes
NAME                 STATUS   ROLES           AGE   VERSION
otel-control-plane   Ready    control-plane   40s   v1.34.0
```

## Helm charts

### OpenTelemetry Operator chart
```
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update

helm search repo open-telemetry --versions
```

We'll install version `0.93.0`, the version used at the time of writing this guide.
Make sure the version you pick is compatible with the Kubernetes version you are running.

```
OTEL_VERSION=0.93.0
```
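
One way to check compatibility is to read the `kubeVersion` constraint from the chart metadata and compare it with the cluster version by hand. The sketch below assumes the chart declares a `kubeVersion` field (not every chart does); the helper name `chart_kube_constraint` and the `HELM` override variable are made up for illustration.

```shell
# Assumption: HELM defaults to the real helm CLI; override it to dry-run or test.
HELM="${HELM:-helm}"

# Print the kubeVersion constraint declared by a chart version.
# Prints nothing if the chart metadata has no kubeVersion field.
chart_kube_constraint() {
  chart="$1"; version="$2"
  "$HELM" show chart "$chart" --version "$version" \
    | sed -n 's/^kubeVersion:[[:space:]]*//p' | tr -d "\"'"
}

# Usage (needs the helm repo added above):
#   chart_kube_constraint open-telemetry/opentelemetry-operator "$OTEL_VERSION"
#   kubectl version   # compare the server version against the constraint
```

The function only reports the constraint; it does not evaluate it, so the final comparison is still a manual step.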

### Cert-manager chart

We need cert-manager deployed first. The OpenTelemetry Operator uses it to manage the TLS certificates for its admission webhooks.

```
helm repo add jetstack https://charts.jetstack.io

helm search repo jetstack --versions
```

We'll install cert-manager `v1.18.2`, the version used at the time of writing this guide, which is compatible with our Kubernetes version.

```
CERTMANAGER_VERSION=v1.18.2
```

### Install helm charts

Install cert-manager:

```
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version $CERTMANAGER_VERSION \
  --set crds.enabled=true \
  --set startupapicheck.timeout="5m"
```

Install OpenTelemetry Operator:

```
helm install opentelemetry-operator open-telemetry/opentelemetry-operator \
  --namespace opentelemetry-operator-system \
  --create-namespace \
  --version $OTEL_VERSION \
  --values=monitoring/opentelemetry/kubernetes/values.yaml
```

View our install:

```
kubectl get pods -n cert-manager
kubectl get pods -n opentelemetry-operator-system
```
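
Instead of polling `get pods` by eye, you can block until the deployments are ready. This is a minimal sketch built on `kubectl wait`; the function name `wait_for_deployments` and the `KUBECTL` override variable are my own additions, not part of the original guide.

```shell
# Assumption: KUBECTL defaults to the real kubectl CLI; override it to dry-run or test.
KUBECTL="${KUBECTL:-kubectl}"

# Block until every Deployment in a namespace reports Available, or time out.
wait_for_deployments() {
  namespace="$1"; timeout="${2:-300s}"
  "$KUBECTL" wait --for=condition=Available deployment --all \
    --namespace "$namespace" --timeout="$timeout"
}

# Usage:
#   wait_for_deployments cert-manager
#   wait_for_deployments opentelemetry-operator-system
```

Waiting for cert-manager before installing the operator avoids webhook certificate errors caused by installing the two charts back to back.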

## Create a Collector

For this guide, I copied a starter collector from the [official documentation](https://opentelemetry.io/docs/platforms/kubernetes/operator/#getting-started) and changed some of the values.

To split out the guides, we'll have separate collectors for logs, metrics and traces.

Let's deploy our collectors in a central `monitoring` namespace:

```
kubectl create namespace monitoring

kubectl apply -n monitoring -f monitoring/opentelemetry/kubernetes/collector-tracing.yaml
```

View our collector:

```
kubectl -n monitoring get pods
kubectl -n monitoring get svc
```
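
The service list is worth a look because the operator derives the Service name from the collector name. The sketch below assumes the `<name>-collector` naming pattern, which is consistent with the `trace-collector-collector.monitoring.svc.cluster.local` endpoint used in `instrumentation.yaml`. The helper name `collector_service_host` is hypothetical.

```shell
# Build the in-cluster DNS name of the Service created for a collector.
# Assumption: the operator names the Service "<collector-name>-collector".
collector_service_host() {
  name="$1"; namespace="$2"
  printf '%s-collector.%s.svc.cluster.local\n' "$name" "$namespace"
}

# Usage:
#   collector_service_host trace-collector monitoring
```

If the name printed here does not match what `kubectl -n monitoring get svc` shows, trust the cluster and update the exporter endpoint accordingly.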

## Tracing

### Instrumentation

To start receiving traces, we'll use OpenTelemetry [auto instrumentation](https://opentelemetry.io/docs/platforms/kubernetes/operator/automatic/) to inject instrumentation into our Kubernetes pods.

Auto-injection for OpenTelemetry is configured through an `Instrumentation` resource.

Note that I am creating this in the `default` namespace, next to where my applications are.

```
kubectl apply -f monitoring/opentelemetry/kubernetes/instrumentation.yaml
```
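
The `Instrumentation` resource alone does not change any pod. Workloads opt in through an `instrumentation.opentelemetry.io/inject-<language>` annotation on the pod template. The application manifests in this repo may already carry that annotation, so treat the following as a sketch for workloads that do not. The function name `enable_auto_instrumentation` and the deployment name in the usage line are hypothetical.

```shell
# Assumption: KUBECTL defaults to the real kubectl CLI; override it to dry-run or test.
KUBECTL="${KUBECTL:-kubectl}"

# Add the operator's injection annotation to a Deployment's pod template.
# language is the annotation suffix, for example dotnet, java, python or nodejs.
# The value "true" selects the Instrumentation resource in the pod's own namespace.
enable_auto_instrumentation() {
  deployment="$1"; language="$2"
  patch=$(printf '{"spec":{"template":{"metadata":{"annotations":{"instrumentation.opentelemetry.io/inject-%s":"true"}}}}}' "$language")
  "$KUBECTL" patch deployment "$deployment" --type merge -p "$patch"
}

# Usage (deployment name is an assumption):
#   enable_auto_instrumentation videos-api dotnet
```

Patching the pod template triggers a rollout, which matters because injection happens when a pod is created, not on running pods.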

### Deploy Microservices

Build applications:

```
docker compose --file monitoring/opentelemetry/docker-compose.yaml build
```

Load images into `kind`:

```
kind load docker-image aimvector/service-mesh:videos-web-1.0.0 --name otel
kind load docker-image aimvector/service-mesh:playlists-api-1.0.0 --name otel
kind load docker-image aimvector/service-mesh:videos-api-1.0.0 --name otel
```

Deploy the applications and their databases:

```
kubectl apply -f monitoring/opentelemetry/applications/playlists-api/
kubectl apply -f monitoring/opentelemetry/applications/playlists-db/
kubectl apply -f monitoring/opentelemetry/applications/videos-web/
kubectl apply -f monitoring/opentelemetry/applications/videos-db/
kubectl apply -f monitoring/opentelemetry/applications/videos-api/
```

Generate some traffic with `port-forward`. Each command blocks, so run it in its own terminal:

```
kubectl port-forward svc/videos-web 80:80
kubectl port-forward svc/playlists-api 81:80
```
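
With the port-forwards running, a small loop produces enough requests to see traces show up. This is a sketch only: the helper name `generate_traffic`, the `CURL` override variable, and the root paths in the usage lines are assumptions, so swap in whichever routes your services expose.

```shell
# Assumption: CURL defaults to the real curl CLI; override it to dry-run or test.
CURL="${CURL:-curl}"

# Send `count` GET requests to `url`, printing one HTTP status code per request.
generate_traffic() {
  url="$1"; count="${2:-20}"
  i=0
  while [ "$i" -lt "$count" ]; do
    # Keep going even if a single request fails.
    "$CURL" --silent --output /dev/null --write-out '%{http_code}\n' "$url" || true
    i=$((i + 1))
  done
}

# Usage (paths are assumptions):
#   generate_traffic http://localhost:80/ 20
#   generate_traffic http://localhost:81/ 20
```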

### Tracing Data Store

In this guide, we'll use a Tempo database for tracing data:

```
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

helm search repo grafana/tempo

TEMPO_VERSION=1.23.3

# --install creates the release if it does not exist yet
helm upgrade --install tempo grafana/tempo \
  --create-namespace \
  --namespace grafana \
  --version $TEMPO_VERSION \
  --values monitoring/opentelemetry/kubernetes/tempo.yaml
```

### Metrics Data Store

In this guide, we'll use Prometheus for our metrics data. We install only the Prometheus Operator from the `kube-prometheus-stack` chart, then deploy a dedicated Prometheus instance:

```
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm search repo prometheus-community --versions

PROMETHEUS_STACK_VERSION=77.5.0

helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --version ${PROMETHEUS_STACK_VERSION} \
  --namespace prometheus-operator-system \
  --create-namespace \
  --set prometheusOperator.enabled=true \
  --set prometheusOperator.nodeSelector."kubernetes\.io/os"=linux \
  --set prometheusOperator.fullnameOverride="prometheus-operator" \
  --set prometheusOperator.manageCrds=true \
  --set alertmanager.enabled=false \
  --set grafana.enabled=false \
  --set prometheus-node-exporter.enabled=false \
  --set nodeExporter.enabled=false \
  --set kubeStateMetrics.enabled=false \
  --set prometheus.enabled=false

kubectl -n prometheus-operator-system get pods

# Deploy our dedicated Prometheus for metrics storage

kubectl apply -n monitoring -f monitoring/opentelemetry/kubernetes/prometheus.yaml
```

### Dashboards

In this guide, I use a simple Grafana for dashboards:

```
helm search repo grafana/grafana

GRAFANA_VERSION=9.4.4

helm install grafana grafana/grafana \
  --namespace grafana \
  --version $GRAFANA_VERSION \
  --values monitoring/opentelemetry/kubernetes/grafana.yaml
```

### Access Grafana

We can `port-forward` to Grafana and then open http://localhost:3000 in a browser:

```
kubectl -n grafana port-forward svc/grafana 3000:80
```
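
To log in you need the admin password. The sketch below assumes the chart defaults, where the password is stored in a Secret named after the release under the `admin-password` key. The `grafana.yaml` values in this guide only set datasources, so that default should apply, but verify it against your install. The function name `grafana_admin_password` is hypothetical.

```shell
# Assumption: KUBECTL defaults to the real kubectl CLI; override it to dry-run or test.
KUBECTL="${KUBECTL:-kubectl}"

# Print the Grafana admin password stored by the chart.
# Assumption: Secret "<release>" in "<namespace>" with key "admin-password".
grafana_admin_password() {
  namespace="${1:-grafana}"; release="${2:-grafana}"
  "$KUBECTL" get secret "$release" --namespace "$namespace" \
    -o jsonpath='{.data.admin-password}' | base64 --decode
}

# Usage:
#   grafana_admin_password          # defaults: namespace grafana, release grafana
```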
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: trace-collector
spec:
  image: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.131.1
  config:
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 15
      batch:
        send_batch_size: 10000
        timeout: 10s

    exporters:
      # NOTE: Prior to v0.86.0 use `logging` instead of `debug`.
      debug: {}
      otlp/tempo:
        endpoint: "http://tempo.grafana.svc.cluster.local:4317"
        tls:
          insecure: true
      prometheusremotewrite:
        endpoint: "http://prometheus-operated:9090/api/v1/write"
        tls:
          insecure: true
          insecure_skip_verify: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [debug, otlp/tempo]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [prometheusremotewrite]
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-operated.monitoring.svc.cluster.local:9090
        access: proxy
        isDefault: true
        jsonData:
          httpMethod: GET
      - name: Tempo
        type: tempo
        url: http://tempo.grafana.svc.cluster.local:3200
        access: proxy
        isDefault: false
        jsonData:
          httpMethod: GET
          serviceMap:
            datasourceUid: 'Prometheus'
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: app-instrumentation
spec:
  exporter:
    endpoint: http://trace-collector-collector.monitoring.svc.cluster.local:4318  # 4318 is the collector's OTLP/HTTP port; its gRPC receiver listens on 4317
  propagators:
    - tracecontext
    - baggage
  sampler:
    type: parentbased_traceidratio
    argument: '1'
  dotnet:
    env:
      - name: OTEL_EXPORTER_OTLP_TRACES_PROTOCOL
        value: "grpc"
      - name: OTEL_LOG_LEVEL
        value: "error"
      - name: OTEL_DOTNET_AUTO_LOG_DIRECTORY
        value: /tmp
      - name: "OTEL_DOTNET_AUTO_METRICS_INSTRUMENTATION_ENABLED"
        value: "true"
      - name: "OTEL_DOTNET_AUTO_METRICS_INSTRUMENTATION_ASPNETCORE_ENABLED"
        value: "true"
      - name: "OTEL_DOTNET_AUTO_METRICS_INSTRUMENTATION_HTTPCLIENT_ENABLED"
        value: "true"
  go:
    env:
      - name: OTEL_EXPORTER_OTLP_PROTOCOL
        value: "grpc"
      - name: OTEL_PROPAGATORS
        value: "tracecontext,baggage"
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  enableRemoteWriteReceiver: true
  enableFeatures:
    - remote-write-receiver
  evaluationInterval: 30s
  imagePullPolicy: IfNotPresent
  nodeSelector:
    kubernetes.io/os: linux
  replicas: 1
  retention: 10d
  scrapeInterval: 30s
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
server:
  http_listen_port: 3200
tempo:
  metricsGenerator:
    # The metricsGenerator component converts traces into Prometheus metrics.
    enabled: true
    remoteWriteUrl: "http://prometheus-operated.monitoring.svc.cluster.local:9090/api/v1/write"
  overrides:
    defaults:
      metrics_generator:
        processors:
          - service-graphs
          - span-metrics
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
# https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-operator
nodeSelector:
  "kubernetes.io/os": linux
