hercwey/prometheus-service-discovery


Overview

Automatically discover monitoring nodes such as the Prometheus node exporter, HAProxy exporter, mysqld exporter, etc., then add them to Prometheus target groups.

This application consists of two parts: ZkHttpServer and ServiceDiscoveryClient.

ZkHttpServer

When a node exporter starts, it sends an HTTP POST to ZkHttpServer, which persists the posted content to ZooKeeper. The post content contains two fields: port and data. The ZooKeeper node path is named in the format [srcip]:[port] (e.g. 1.1.1.1:9100), and the node's data field is set to the data field of the posted content.
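
The actual persistence logic lives inside ZkHttpServer; the following is only a minimal sketch of what that step could look like using Apache Curator (the /exporters root path, the class name, and the use of Curator are illustrative assumptions, not necessarily what the real server does):

// Hypothetical sketch of the ZooKeeper persistence step in ZkHttpServer.
// Assumptions: Apache Curator client, /exporters as the root path.
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.CreateMode;

public class ZkRegistrar {
    private final CuratorFramework client;

    public ZkRegistrar(String zkConnectString) {
        client = CuratorFrameworkFactory.newClient(
                zkConnectString, new ExponentialBackoffRetry(1000, 3));
        client.start();
    }

    /** Persist the exporter as /exporters/[srcip]:[port] with `data` as the node payload. */
    public void register(String srcIp, String port, String data) throws Exception {
        String path = "/exporters/" + srcIp + ":" + port;  // e.g. /exporters/1.1.1.1:9100
        byte[] payload = data.getBytes("UTF-8");
        if (client.checkExists().forPath(path) == null) {
            client.create().creatingParentsIfNeeded()
                  .withMode(CreateMode.PERSISTENT)
                  .forPath(path, payload);
        } else {
            client.setData().forPath(path, payload);  // re-registration refreshes the data
        }
    }

    /** Remove the node when the exporter sends an HTTP DELETE. */
    public void unregister(String srcIp, String port) throws Exception {
        client.delete().forPath("/exporters/" + srcIp + ":" + port);
    }
}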

ServiceDiscoveryClient

ServiceDiscoveryClient scans ZooKeeper every 5 minutes (driven by crontab). If a node path (in the format ip:port, e.g. 1.1.1.1:9100) is TCP-reachable, the target is added to the Prometheus tgroup JSON files; a sketch of the reachability probe follows below. Prometheus watches these JSON files, which define target groups. Whenever any of those files changes, the list of target groups is re-read from the files and the scrape targets are extracted.
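
A minimal sketch of such a TCP probe in Java (the class and method names are illustrative, not taken from the repository):

// Hypothetical sketch of the TCP reachability check in ServiceDiscoveryClient.
import java.net.InetSocketAddress;
import java.net.Socket;

public class Reachability {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isTcpReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;  // the exporter is listening, keep it as a scrape target
        } catch (Exception e) {
            return false; // unreachable, drop it from the generated tgroup files
        }
    }
}

Reachable targets are then grouped by their registered data value (which appears as the job label in the sample JSON output below) and serialized to the tgroups/*.json files.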

Here we use Prometheus's custom service discovery interface (file-based service discovery), as described in the Prometheus documentation.

Flowchart

(Flowchart image: MonitoringArchitecture.png)

Prometheus configuration

# my global config
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Scrape targets from this job every 15 seconds (same as the global default).
    scrape_interval: 15s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    #
    static_configs:
      - targets: ['localhost:9090']

    file_sd_configs:
      - files: ['tgroups/*.json']

Node exporter startup script

#!/bin/bash
# start node_exporter for prometheus with certain modules enabled

pid=$(pidof node_exporter)
if [ -z "$pid" ]; then
    echo "node_exporter not running"
else
    echo "killing node_exporter ..."
    kill -9 "$pid"
fi
# run from the script's directory so ./node_exporter resolves
workdir=$(cd "$(dirname "$0")" && pwd)
cd "$workdir"
nohup ./node_exporter -collectors.enabled conntrack,diskstats,entropy,filefd,filesystem,loadavg,mdadm,meminfo,netdev,netstat,sockstat,stat,textfile,time,uname,vmstat,tcpstat > /dev/null 2>&1 &
# register this exporter with ZkHttpServer so it is added to the target groups
curl -i -X POST -d '{"port": "9100", "data": "linux"}' -H "Content-Type: application/json" http://172.16.10.50/zk

Node exporter shutdown script

#!/bin/bash
# stop node_exporter for prometheus

pid=$(pidof node_exporter)
if [ -z "$pid" ]; then
    echo "node_exporter not running"
else
    echo "killing node_exporter ..."
    kill -9 "$pid"
fi
# unregister this exporter from ZkHttpServer so it is removed from the target groups
curl -i -X DELETE -d '{"port": "9100", "data": "linux"}' -H "Content-Type: application/json" http://172.16.10.50/zk

JSON output by ServiceDiscoveryClient

Note that the data value sent at registration time becomes the job label of its target group:

[ {
  "targets" : [ "172.16.6.102:9101", "172.16.11.151:9101", "172.16.11.152:9101", "172.16.6.101:9101" ],
  "labels" : {
    "job" : "haproxy"
  }
}, {
  "targets" : [ "172.16.11.151:9100", "172.16.6.112:9100", "172.16.11.152:9100", "172.16.6.116:9100", "172.16.11.184:9100", "172.16.6.102:9100", "172.16.11.182:9100", "172.16.11.197:9100", "172.16.6.115:9100", "172.16.6.113:9100", "172.16.6.109:9100", "172.16.11.181:9100", "172.16.11.194:9100", "172.16.6.104:9100", "172.16.6.151:9100", "172.16.6.101:9100", "172.16.11.183:9100", "172.16.11.185:9100", "172.16.6.103:9100", "172.16.6.152:9100", "172.16.11.3:9100", "172.16.6.110:9100", "172.16.6.111:9100" ],
  "labels" : {
    "job" : "linux"
  }
}, {
  "targets" : [ "172.16.6.112:9104", "172.16.11.184:9104", "172.16.11.183:9104", "172.16.6.111:9104" ],
  "labels" : {
    "job" : "mysql"
  }
} ]

Setup

ZkHttpServer

Build

Run mvn clean package to build the jar file. Before building, please change application.properties to suit your environment.

Target machine

You can run ZkHttpServer on any machine with Oracle Java 1.7 or above. Running ZkHttpServer on multiple machines behind a load balancer is also supported.

cd /home/deploy
nohup /usr/bin/java -jar prometheus-zk-httpserver-0.0.1-RELEASE.jar  &

ServiceDiscoveryClient

Build

Run mvn clean package to build the jar file. Before building, please change application.properties to suit your environment.

Target machine

Oracle Java 1.7 or above is needed on the target machine. Put the runnable jar file on the machine where the Prometheus server is installed. A typical setup is to write a shell script and run it through crontab every 5 minutes; a sample crontab entry follows the commands below.

cd /opt/prometheus-0.20.0.linux-amd64/tgroups
nohup /usr/bin/java -jar prometheus-service-discovery-client-0.0.1-RELEASE.jar  &
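
If the client runs once and exits, a crontab entry along these lines drives the 5-minute scan (the wrapper script name and log path are hypothetical):

# regenerate the tgroup files every 5 minutes
*/5 * * * * /home/deploy/run-sd-client.sh >> /tmp/sd-client.log 2>&1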
