Commit 4effca3

Julien Pivotto authored
Docker Swarm guide (prometheus#1696)

* Docker Swarm guide

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
1 parent a6d8366 commit 4effca3

1 file changed

+230
-0
lines changed

content/docs/guides/dockerswarm.md

Lines changed: 230 additions & 0 deletions
@@ -0,0 +1,230 @@
---
title: Docker Swarm
sort_rank: 1
---

# Docker Swarm

Prometheus can discover targets in a [Docker Swarm][swarm] cluster, as of
v2.20.0. This guide demonstrates how to use that service discovery mechanism.

## Docker Swarm service discovery architecture

The [Docker Swarm service discovery][swarmsd] supports three roles: nodes,
services, and tasks.

The first role, **nodes**, represents the hosts that are part of the Swarm. It
can be used to automatically monitor the Docker daemons or the Node Exporters
that run on the Swarm hosts.

The second role, **tasks**, represents any individual container deployed in the
Swarm. Each task gets its associated service labels. One service can be backed by
one or multiple tasks.

The third role, **services**, discovers the services deployed in the Swarm,
along with the ports they expose. Usually you will want to use the tasks role
instead of this one.

Prometheus will only discover tasks and services that expose ports.
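As a rough sketch of how the three roles map to configuration, each role is
selected with the `role` field of a `dockerswarm_sd_configs` entry. The job
names below are placeholders chosen for illustration, and the example assumes
that Prometheus can reach the Docker socket at `/var/run/docker.sock`.

```yaml
scrape_configs:
  # Hypothetical job names; only `host` and `role` matter here.
  - job_name: 'swarm-nodes'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: nodes      # one target per Swarm host

  - job_name: 'swarm-tasks'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: tasks      # one target per exposed port of each task

  - job_name: 'swarm-services'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: services   # one target per port exposed by each service
```

In practice you would rarely need all three jobs at once. They are shown side
by side only to make the difference between the roles visible.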

NOTE: The rest of this guide assumes that you have a Swarm running.

## Setting up Prometheus

For this guide, you need to [set up Prometheus][setup]. We will assume that
Prometheus runs on a Docker Swarm manager node and has access to the Docker
socket at `/var/run/docker.sock`.
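One possible way to satisfy that assumption is to run Prometheus itself as a
Swarm service pinned to a manager node, with the Docker socket mounted into the
container. The stack file below is only a sketch: the service name, the config
name, and the choice to run as `root` are assumptions made for illustration and
are not part of this guide's required setup.

```yaml
# Hypothetical stack file, for example deployed with `docker stack deploy`.
version: "3.8"
services:
  prometheus:
    image: prom/prometheus
    # Assumption: the image's default user cannot read the Docker socket,
    # so this sketch runs as root. Adjust to your security requirements.
    user: root
    ports:
      - "9090:9090"
    volumes:
      # Give Prometheus access to the Docker API of the manager node.
      - /var/run/docker.sock:/var/run/docker.sock:ro
    configs:
      - source: prometheus_config
        target: /etc/prometheus/prometheus.yml
    deploy:
      placement:
        constraints:
          # Only manager nodes can answer Swarm API queries.
          - node.role == manager
configs:
  prometheus_config:
    file: ./prometheus.yml
```

Running Prometheus outside of Swarm on a manager host works just as well, as
long as the process can read the socket.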

## Monitoring Docker daemons

Let's dive into the service discovery itself.

Docker itself, as a daemon, exposes [metrics][dockermetrics] that can be
ingested by a Prometheus server.

You can enable them by editing `/etc/docker/daemon.json` and setting the
following properties:

```json
{
  "metrics-addr" : "0.0.0.0:9323",
  "experimental" : true
}
```

Instead of `0.0.0.0`, you can set the IP of the Docker Swarm node.

A restart of the daemon is required to take the new configuration into account.

The [Docker documentation][dockermetrics] contains more info about this.

Then, you can configure Prometheus to scrape the Docker daemon, by providing the
following `prometheus.yml` file:

```yaml
scrape_configs:
  # Make Prometheus scrape itself for metrics.
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']

  # Create a job for Docker daemons.
  - job_name: 'docker'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: nodes
    relabel_configs:
      # Fetch metrics on port 9323.
      - source_labels: [__meta_dockerswarm_node_address]
        target_label: __address__
        replacement: $1:9323
      # Set hostname as instance label
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: instance
```

For the nodes role, you can also use the `port` parameter of
`dockerswarm_sd_configs`. However, using `relabel_configs` is recommended as it
enables Prometheus to reuse the same API calls across identical Docker Swarm
configurations.
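For comparison, a minimal sketch of the `port` variant could look like the
following. It is meant to produce the same scrape address as the relabeling
rule above, under the assumption that the daemons listen on port 9323 as
configured earlier.

```yaml
scrape_configs:
  - job_name: 'docker'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: nodes
        # Port used to build __address__ for the discovered nodes.
        port: 9323
    relabel_configs:
      # Set hostname as instance label
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: instance
```

The trade-off is that a service discovery entry with its own `port` no longer
has the same configuration as entries in other jobs, so it cannot share their
API calls.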

## Monitoring Containers

Let's now deploy a service in our Swarm. We will deploy [cadvisor][cad], which
exposes container resources metrics:

```shell
docker service create --name cadvisor -l prometheus-job=cadvisor \
    --mode=global --publish target=8080,mode=host \
    --mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock,ro \
    --mount type=bind,src=/,dst=/rootfs,ro \
    --mount type=bind,src=/var/run,dst=/var/run \
    --mount type=bind,src=/sys,dst=/sys,ro \
    --mount type=bind,src=/var/lib/docker,dst=/var/lib/docker,ro \
    google/cadvisor -docker_only
```

This is a minimal `prometheus.yml` file to monitor it:

```yaml
scrape_configs:
  # Make Prometheus scrape itself for metrics.
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']

  # Create a job for Docker Swarm containers.
  - job_name: 'dockerswarm'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: tasks
    relabel_configs:
      # Only keep containers that should be running.
      - source_labels: [__meta_dockerswarm_task_desired_state]
        regex: running
        action: keep
      # Only keep containers that have a `prometheus-job` label.
      - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
        regex: .+
        action: keep
      # Use the prometheus-job Swarm label as Prometheus job label.
      - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
        target_label: job
```

Let's analyze each part of the [relabel configuration][rela].

```yaml
- source_labels: [__meta_dockerswarm_task_desired_state]
  regex: running
  action: keep
```

Docker Swarm exposes the desired [state of the tasks][state] over the API. In
our example, we only **keep** the targets that should be running. This prevents
monitoring tasks that should be shut down.

```yaml
- source_labels: [__meta_dockerswarm_service_label_prometheus_job]
  regex: .+
  action: keep
```

When we deployed cadvisor, we added the label `prometheus-job=cadvisor` to the
service. Because each task gets its service labels, we can instruct Prometheus
to **only** keep the targets that have a `prometheus-job` label.

```yaml
- source_labels: [__meta_dockerswarm_service_label_prometheus_job]
  target_label: job
```

That last part takes the `prometheus-job` label attached to the task and turns
it into a target label, overwriting the default `dockerswarm` job label that
comes from the scrape config.
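The last rule is so short because it relies on the default values of a relabel
configuration. Written out in full, with every default made explicit, it should
be equivalent to the following sketch:

```yaml
- source_labels: [__meta_dockerswarm_service_label_prometheus_job]
  separator: ';'      # default: joins multiple source labels
  regex: (.*)         # default: match the whole value
  replacement: $1     # default: first capture group
  action: replace     # default action
  target_label: job
```

The same defaults explain the `replacement: $1:9323` rule in the Docker daemon
job: the default regex captures the whole node address, and the replacement
appends the port to it.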

## Discovered labels

The [Prometheus Documentation][swarmsd] contains the full list of labels, but
here are other relabel configs that you might find useful.

### Scraping metrics via a certain network only

```yaml
- source_labels: [__meta_dockerswarm_network_name]
  regex: ingress
  action: keep
```

### Scraping global tasks only

Global tasks run on every daemon.

```yaml
- source_labels: [__meta_dockerswarm_service_mode]
  regex: global
  action: keep
- source_labels: [__meta_dockerswarm_task_port_publish_mode]
  regex: host
  action: keep
```

### Adding a docker_node label to the targets

```yaml
- source_labels: [__meta_dockerswarm_node_hostname]
  target_label: docker_node
```
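These fragments are entries of a job's `relabel_configs` list. As an
illustration of where they go, the sketch below combines two of them in a
single tasks job. The job name is reused from the earlier example, and the
selection of rules is only an assumption about what a setup with global,
host-published services might want.

```yaml
scrape_configs:
  - job_name: 'dockerswarm'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: tasks
    relabel_configs:
      # Only keep tasks of global services.
      - source_labels: [__meta_dockerswarm_service_mode]
        regex: global
        action: keep
      # Only keep ports published directly on the host.
      - source_labels: [__meta_dockerswarm_task_port_publish_mode]
        regex: host
        action: keep
      # Record which Swarm node runs the task.
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: docker_node
```

Rules are applied in order, so `keep` rules placed first drop unwanted targets
before any labels are rewritten.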

## Connecting to the Docker Swarm

The above `dockerswarm_sd_configs` entries have a `host` field:

```yaml
host: unix:///var/run/docker.sock
```

This connects through the Docker socket. Prometheus offers [additional
configuration options][swarmsd] to connect to Swarm using HTTP and HTTPS, if
you prefer that over the unix socket.
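A minimal sketch of an HTTPS connection might look like the following. The
manager address and the certificate paths are hypothetical and need to be
replaced with values from your own environment. The sketch also assumes that
the Docker daemon on that host has been configured to listen on TCP with TLS.

```yaml
dockerswarm_sd_configs:
  # Hypothetical manager address; 2376 is the conventional Docker TLS port.
  - host: https://swarm-manager.example.com:2376
    role: tasks
    tls_config:
      # Hypothetical paths to the CA and client certificate files.
      ca_file: /etc/prometheus/docker/ca.pem
      cert_file: /etc/prometheus/docker/cert.pem
      key_file: /etc/prometheus/docker/key.pem
```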

## Conclusion

There are many discovery labels you can play with to better determine which
targets to monitor and how. For the tasks role alone, more than 25 labels are
available. Don't hesitate to look at the "Service Discovery" page of your
Prometheus server (under the "Status" menu) to see all the discovered labels.

The service discovery makes no assumptions about your Swarm stack, so with
proper configuration it should be pluggable into any existing stack.

[state]: https://docs.docker.com/engine/swarm/how-swarm-mode-works/swarm-task-states/
[rela]: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
[swarm]: https://docs.docker.com/engine/swarm/
[swarmsd]: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config
[dockermetrics]: https://docs.docker.com/config/daemon/prometheus/
[cad]: https://github.com/google/cadvisor
[setup]: https://prometheus.io/docs/prometheus/latest/getting_started/
