|
1 | 1 |
|
2 |
| -Related |
3 |
| -------- |
| 2 | +============= |
| 3 | +RED in Python |
| 4 | +============= |
4 | 5 |
|
| 6 | +How-to |
| 7 | +====== |
| 8 | + |
| 9 | +1. Start the application, prometheus, alertmanager, and grafana with docker-compose: |
| 10 | + |
| 11 | + :: |
| 12 | + |
| 13 | + make docker_build |
| 14 | + make start |
| 15 | + |
| 16 | +2. Access the services: |
| 17 | + |
| 18 | + - 8080 - our application |
| 19 | + - 9090 - prometheus webgui |
| 20 | + - 9093 - alertmanager |
| 21 | + - 3000 - grafana |
| 22 | + |
| 23 | +3. Single calls: see the targets with the prefix *srv_* in `<Makefile>`_ |
| 24 | + |
| 25 | +4. Generate traffic (open the grafana dashboard to see the metrics): |
| 26 | + |
| 27 | + :: |
| 28 | + |
| 29 | + make srv_wrk_random |
| 30 | + |
| 31 | +5. Questions: |
| 32 | + |
| 33 | + - What can we learn from the graphs? |
| 34 | + - Can we say something about our random calls? |
| 35 | + - Naming? Is it good? |
| 36 | + |
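Before opening the dashboards, it can help to confirm that the four ports from step 2 are reachable. The following is a minimal sketch, not part of the repository: the port numbers come from the list above, while the host ``127.0.0.1`` and the helper names ``is_port_open`` and ``check_services`` are assumptions made for illustration.

```python
import socket

# Ports are taken from step 2 of the how-to. The host is an assumption:
# docker-compose is expected to publish the ports on the local machine.
SERVICES = {
    "application": 8080,
    "prometheus": 9090,
    "alertmanager": 9093,
    "grafana": 3000,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host: str = "127.0.0.1") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name:<13} {'up' if up else 'DOWN'}")
```

An open port only shows that something is listening; it does not prove the service behind it is healthy.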
| 37 | +Modifying the default configuration |
| 38 | +=================================== |
| 39 | + |
| 40 | +Docker-compose mounts all configuration from the git repo. You can change it locally on your laptop. |
| 41 | + |
| 42 | +1. To reload prometheus configuration after changes: |
| 43 | + |
| 44 | + :: |
| 45 | + |
| 46 | + make prometheus_reload_config |
| 47 | + |
| 48 | +2. To reload grafana configuration, restart the grafana container: |
| 49 | + |
| 50 | + :: |
| 51 | + |
| 52 | + docker restart talk-wrocpy-prom-flask_grafana_1 |
| 53 | + |
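What ``make prometheus_reload_config`` does internally is not shown here. One common way to reload Prometheus is its HTTP lifecycle endpoint, which only responds when the server is started with ``--web.enable-lifecycle``. The sketch below assumes that flag is set and that Prometheus listens on ``localhost:9090``; the function names are hypothetical.

```python
import urllib.request

# Assumption: Prometheus is reachable on the port listed in the how-to.
PROMETHEUS_URL = "http://localhost:9090"


def build_reload_request(base_url: str = PROMETHEUS_URL) -> urllib.request.Request:
    """Build the POST request for Prometheus' /-/reload lifecycle endpoint."""
    return urllib.request.Request(f"{base_url}/-/reload", data=b"", method="POST")


def reload_prometheus(base_url: str = PROMETHEUS_URL) -> bool:
    """Ask Prometheus to re-read its configuration; True on HTTP 200.

    Raises urllib.error.URLError (or HTTPError) if the server is down
    or the lifecycle endpoint is not enabled.
    """
    with urllib.request.urlopen(build_reload_request(base_url), timeout=5) as resp:
        return resp.status == 200


if __name__ == "__main__":
    print("reloaded" if reload_prometheus() else "reload failed")
```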
| 54 | +Development |
| 55 | +=========== |
| 56 | + |
| 57 | +- start the app and prometheus stack with docker-compose: |
| 58 | + |
| 59 | + :: |
| 60 | + |
| 61 | + make start |
| 62 | + |
| 63 | +- check the Makefile for example calls |
| 64 | + |
| 65 | +- to use the traffic generator, you first need to install *wrk*: |
| 66 | + |
| 67 | + :: |
| 68 | + |
| 69 | + make srv_wrk_random |
| 70 | + |
| 71 | +Example Prometheus Queries |
| 72 | +========================== |
| 73 | + |
| 74 | +- simple: |
| 75 | + |
| 76 | + :: |
| 77 | + |
| 78 | + order_mgmt_duration_seconds_sum{status_code='200'} |
| 79 | + |
| 80 | + :: |
| 81 | + |
| 82 | + order_mgmt_duration_seconds_sum{job=~".*"} |
| 83 | + or |
| 84 | + order_mgmt_database_duration_seconds_sum{job=~".*"} |
| 85 | + or |
| 86 | + order_mgmt_audit_duration_seconds_sum{job=~".*"} |
| 87 | + |
| 88 | +- based on the Weaveworks blog (https://www.weave.works/blog/of-metrics-and-middleware/): |
| 89 | + |
| 90 | + - QPS: |
| 91 | + |
| 92 | + :: |
| 93 | + |
| 94 | + sum(irate(order_mgmt_duration_seconds_count{job=~".*"}[1m])) by (status_code) |
| 95 | + |
| 96 | + - rate of requests returning 5xx status codes: |
| 97 | + |
| 98 | + :: |
| 99 | + |
| 100 | + sum(irate(order_mgmt_duration_seconds_count{job=~".*", status_code=~"5.."}[1m])) |
| 101 | + |
| 102 | + - by status_code: |
| 103 | + |
| 104 | + :: |
| 105 | + |
| 106 | + sum(irate(order_mgmt_duration_seconds_count{job=~".*"}[1m])) by (status_code) |
| 107 | + |
| 108 | + - 500s: |
| 109 | + |
| 110 | + :: |
| 111 | + |
| 112 | + sum(irate(order_mgmt_duration_seconds_count{job=~".*", status_code=~"5.."}[1m])) |
| 113 | + |
| 114 | + - 5-minute moving 99th percentile request latency (``histogram_quantile`` needs the ``_bucket`` series, which carry the ``le`` label): |
| 115 | + |
| 116 | + :: |
| 117 | + |
| 118 | + histogram_quantile(0.99, sum(rate(order_mgmt_duration_seconds_bucket{job=~".*",ws="false"}[5m])) by (le)) |
| 119 | + |
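The rate and error queries above can also be run outside the web GUI through the Prometheus HTTP API (``/api/v1/query``). This is a minimal sketch using only the standard library; the base URL ``http://localhost:9090`` is an assumption based on the port list in the how-to, and the names ``RED_QUERIES``, ``build_query_url``, and ``run_query`` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus is reachable on the port listed in the how-to.
PROMETHEUS_URL = "http://localhost:9090"

# Queries copied from the examples above (rate and errors of RED).
RED_QUERIES = {
    "rate": 'sum(irate(order_mgmt_duration_seconds_count{job=~".*"}[1m])) by (status_code)',
    "errors": 'sum(irate(order_mgmt_duration_seconds_count{job=~".*", status_code=~"5.."}[1m]))',
}


def build_query_url(promql: str, base_url: str = PROMETHEUS_URL) -> str:
    """Return the instant-query URL with the PromQL expression URL-encoded."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def run_query(promql: str, base_url: str = PROMETHEUS_URL) -> list:
    """Run an instant query and return the list of result series."""
    with urllib.request.urlopen(build_query_url(promql, base_url), timeout=5) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


if __name__ == "__main__":
    for name, promql in RED_QUERIES.items():
        print(name, run_query(promql))
```

Each returned series is a dictionary with a ``metric`` label set and a ``value`` pair of timestamp and sample, which is usually enough for a quick sanity check against the Grafana panels.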
| 120 | +Related Work |
| 121 | +============ |
| 122 | + |
| 123 | +- https://prometheus.io/docs/prometheus/latest/querying/functions/ |
| 124 | +- https://www.robustperception.io/combining-alert-conditions/ |
5 | 125 | - https://github.com/prometheus/jmx_exporter
|
6 | 126 | - https://www.callicoder.com/spring-boot-actuator/
|
7 | 127 | - https://blog.kubernauts.io/https-blog-kubernauts-io-monitoring-java-spring-boot-applications-with-prometheus-part-1-c0512f2acd7b
|