
Commit ef841d5

Snmp observ lib (#1382)
* Add snmp-observ-lib (init)
* Add system
* Add packets
* Add irate for interface
* Update dashboards
* Add main
* Add clamp
* Errors and drops should be > 0
* Add nameShort for legends
* Move legend to the right for traffic
* Make snmp host single
* Add aggKeepLabels
* Add interfaceTable
* Update packet panel types
* Add systemname
* Add new interface signals
* Add row
* Update interface signals and panels
* Add cpu signals for multiple vendors
* Add memory signals for multiple vendors
* Remove txt file
* Add system signals
* Add README
* Add rows/dashboards
* Add config
* Add fleet dashboard
* Add withTopK to panels
* Simplify clampQuery
* Add links between dashboards: from table, as datalinks, global
* add mininterval option
* Update README
* Fixed ifLastChange
* Fmt
* Update cpu/memory to avg
* Add SNMP alerts
* Add alerts thresholds
* Add SNMPInterfaceIsFlapping alert
* Add snmp-exporter-alerts
* add metricsSource
* Update counters for ubiquiti airos
* Update README
* update 0 PDU alert
* Rename arista to arista_Sw
* Fix add cpu/memory signals for generic (and arista)
* Fix mikrotik memory
* Fix selector
* Update default group/instance labels
* Fix lint
* Update juniper cpu/mem
* rename to dell_network
* Add sensors signals
* Add FibreChannel alerts
* Update readme
* Add logs signals, alerts signals and logs dashboard
* Fix logs lib import
* Updated readme
* Update readme
* Add comments
* Updated to templated interval
* whitespace fmt
* Add all errors and drops aggregation
* Drop columns from interface table
* Update alerts
* Provide readme instructions
* Update Makefile
* Move SNMP to default prefix
* Update prefix
* Update README and instructions. Add screenshots
* Fix screenshot
1 parent 393630c commit ef841d5

36 files changed: 3487 additions and 2 deletions

README.md

Lines changed: 1 addition & 2 deletions
@@ -48,12 +48,11 @@ More examples:
 Examples:
 - [kafka-observ-lib](kafka-observ-lib/)
 - [jvm-observ-lib](jvm-observ-lib/)
+- [snmp-observ-lib](snmp-observ-lib/)
 - [process-observ-lib](process-observ-lib/)
 - [golang-observ-lib](golang-observ-lib/)
 - [csp-mixin](csp-mixin/)
 
-
-
 ## LICENSE
 
 [Apache-2.0](LICENSE)

snmp-observ-lib/.lint

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
exclusions:
  template-instance-rule:
    reason: "These dashboards are designed to be single instance"
    entries:
      - dashboard: SNMP overview

snmp-observ-lib/Makefile

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
JSONNET_FMT := jsonnetfmt -n 2 --max-blank-lines 1 --string-style s --comment-style s

.PHONY: all
all: build dashboards_out prometheus_alerts.yaml

vendor: jsonnetfile.json
	jb install

.PHONY: build
build: vendor

.PHONY: fmt
fmt:
	find . -name 'vendor' -prune -o -name '*.libsonnet' -print -o -name '*.jsonnet' -print | \
		xargs -n 1 -- $(JSONNET_FMT) -i

.PHONY: lint
lint: build
	find . -name 'vendor' -prune -o -name '*.libsonnet' -print -o -name '*.jsonnet' -print | \
		while read f; do \
			$(JSONNET_FMT) "$$f" | diff -u "$$f" -; \
		done
	mixtool lint mixin.libsonnet

dashboards_out: mixin.libsonnet main.libsonnet config.libsonnet $(wildcard panels/*) $(wildcard signals/*) dashboards.libsonnet annotations.libsonnet
	@mkdir -p dashboards_out
	mixtool generate dashboards mixin.libsonnet -d dashboards_out

prometheus_alerts.yaml: mixin.libsonnet alerts.libsonnet $(wildcard signals/*)
	mixtool generate alerts mixin.libsonnet -a prometheus_alerts.yaml


.PHONY: deploy deploy_rules deploy_dashboards
deploy: deploy_rules deploy_dashboards

deploy_dashboards:
ifdef GRAFANA_URL
	grr apply mixin.libsonnet --target "Dashboard.*" --target "DashboardFolder.*"
else
	$(warning GRAFANA_URL is not set, skipping grafana dashboards deployment)
endif

deploy_rules:
ifdef CORTEX_ADDRESS
	grr apply mixin.libsonnet --target "PrometheusRuleGroup.*"
else
	$(warning CORTEX_ADDRESS is not set, skipping prometheus alerts deployment)
endif

.PHONY: clean
clean:
	rm -rf dashboards_out prometheus_alerts.yaml

snmp-observ-lib/README.md

Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
# SNMP observability library

This library can be used to generate dashboards, rows, and panels for SNMP devices.

The library supports multiple metrics sources that correspond to different network vendors.

### Supported sources

|metricsSource|Description|MIBs|Known devices|Links|snmp_exporter modules|
|-|-|-|-|-|-|
|generic | Generic SNMP device | IF-MIB,SNMPv2-MIB | default choice ||system,if_mib,hrDevice,hrStorage|
|cisco | Cisco IOS devices | IF-MIB,SNMPv2-MIB,Cisco private MIBs | Cisco C2900, Cisco C7600, Cisco MDS |-|system,if_mib,cisco_device,cisco_fc_fe|
|arista_sw | Arista devices | IF-MIB,SNMPv2-MIB,HOST-RESOURCES-MIB | - ||system,if_mib,hrDevice,hrStorage,arista_sw|
|brocade_fcs | Brocade | IF-MIB,SNMPv2-MIB,SW-MIB | Brocade 6520 v7.4.1c, Brocade 300 v7.0.0c, Brocade BL 5480 v6.3.1c |https://techdocs.broadcom.com/us/en/fibre-channel-networking/fabric-os/fabric-os-mib/9-1-x/understanding-brocade-snmp/loading-brocade-mibs/brocade-mib-files.html|system,if_mib|
|brocade_foundry | Brocade Foundry | FOUNDRY-SN-AGENT-MIB | Brocade MLXe (System Mode: MLX), IronWare Version V5.4.0eT163, Foundry FLS648 Foundry Networks, Inc. FLS648, IronWare Version 04.1.00bT7e1, Foundry FWSX424 Foundry Networks, Inc. FWSX424, IronWare Version 02.0.00aT1e0 ||system,if_mib|
|dell_network | Dell Force S-Series, Dell Force10 MXL 10 | IF-MIB,SNMPv2-MIB,DELL-NETWORKING-CHASSIS-MIB | Dell Force S-Series |https://www.dell.com/support/kbdoc/en-us/000181922/dell-networking-mibs|system,if_mib,dell_network|
|dlink_des | D-Link DES series | IF-MIB,SNMPv2-MIB,AGENT-GENERAL-MIB | DGS-3420-26SC Gigabit Ethernet ||system,if_mib,dlink|
|eltex_mes | Eltex MES | IF-MIB,SNMPv2-MIB,ELTEX-MES-ISS-CPU-UTIL-MIB,ARICENT-ISS-MIB | MES 2448P ||system,if_mib,eltex_mes|
|extreme | ExtremeXOS | IF-MIB,SNMPv2-MIB,EXTREME-SYSTEM-MIB,EXTREME-SOFTWARE-MONITOR-MIB | - ||system,if_mib|
|f5_bigip | F5 BigIP | IF-MIB,SNMPv2-MIB,F5-BIGIP-SYSTEM-MIB | - |https://my.f5.com/manage/s/article/K13322|system,if_mib|
|fortigate | Fortinet Fortigate | IF-MIB,SNMPv2-MIB,FORTINET-FORTIGATE-MIB,ENTITY-MIB | v7.2.5 ||system,if_mib,hrDevice,hrStorage|
|hpe | HP Enterprise Switches | IF-MIB,SNMPv2-MIB,STATISTICS-MIB,NETSWITCH-MIB | HP ProCurve J4900B, HP J9728A 2920-48G |https://support.hpe.com/hpesc/public/docDisplay?sp4ts.oid=51079&docId=emr_na-c02597344|system,if_mib|
|huawei | Huawei VRP | IF-MIB,SNMPv2-MIB,HUAWEI-ENTITY-EXTENT-MIB | - |https://support.huawei.com/enterprise/en/doc/EDOC1000178181/2f6c0513/mib-overview|system,if_mib|
|juniper | Juniper MX, Juniper SRX | IF-MIB,SNMPv2-MIB,JUNIPER-MIB,JUNIPER-ALARM-MIB | Juniper MX204 Edge Router, JUNOS 24.2R1-S1.10, Juniper SRX, Juniper EX4200-24 |https://www.juniper.net/documentation/us/en/software/nce/nce-srx-cluster-management-best/topics/concept/chassis-cluster-performance-monitoring.html|system,if_mib|
|mikrotik | Mikrotik OS | HOST-RESOURCES-MIB,SNMPv2-MIB,MIKROTIK-MIB,IF-MIB | Router OS 7.3: 912UAG-5HPnD, 941-2nD, 1100ahx2, CCR1016-12G, CCR1036-12G-4S, rb2011ua, mikrotik450g, mikrotikrb1100ah ||system,if_mib,mikrotik,hrStorage,hrDevice|
|netgear | Netgear FastPath switches | SNMPv2-MIB,FASTPATH-SWITCHING-MIB,FASTPATH-BOXSERVICES-PRIVATE-MIB,IF-MIB | Netgear M5300-28G |https://kb.netgear.com/24352/MIBs-for-Smart-switches|system,if_mib,netgear|
|qtech | QTech | QTECH-MIB,EtherLike-MIB,HOST-RESOURCES-MIB,SNMPv2-MIB,ENTITY-MIB,IF-MIB | ||system,if_mib|
|tplink | TP-LINK | TPLINK-SYSINFO-MIB,HOST-RESOURCES-MIB,SNMPv2-MIB,TPLINK-SYSMONITOR-MIB,IF-MIB | T2600G-28TS |https://www.tp-link.com/en/support/download/t2600g-28ts/#MIBs_Files https://www.tp-link.com/ru/support/faq/1330/|system,if_mib|
|ubiquiti_airos | Ubiquiti AirOS | FROGFOOT-RESOURCES-MIB,HOST-RESOURCES-MIB,SNMPv2-MIB,IEEE802dot11-MIB,IF-MIB | NanoStation M5, UAP-LR ||system,if_mib,ubiquiti_airos|

## Usage

For detailed usage examples, see the [helloworld-observ-lib README](../helloworld-observ-lib/README.md).

### Import as a library

Import into another library or mixin:

```sh
jb init
jb install https://github.com/grafana/jsonnet-libs/snmp-observ-lib
```

Then add a jsonnet file:
```
local snmplib = import 'snmp-observ-lib/main.libsonnet';
local snmp =
  snmplib.new()
  + snmplib.withConfigMixin(
    {
      // override default configs:
      filteringSelector: 'job!=""',
      groupLabels: ['zone'],
      instanceLabels: ['target'],
      uid: 'snmp-sample',
      dashboardNamePrefix: 'Network',
      dashboardTags: ['networking'],
      // pick the vendors you have:
      metricsSource: ['juniper', 'mikrotik'],
      enableLokiLogs: false,
    }
  );
snmp.asMonitoringMixin()
```

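The `metricsSource` values picked in the config also determine which snmp_exporter modules the devices need to be scraped with, according to the "Supported sources" table. The following Python sketch shows that lookup for a few rows of the table. The helper name `modules_for` and the dictionary are hypothetical illustrations and are not part of the library.

```python
# Subset of the "Supported sources" table: metricsSource -> snmp_exporter modules.
# Only a few rows are copied here for brevity; extend as needed.
SNMP_EXPORTER_MODULES = {
    "generic": ["system", "if_mib", "hrDevice", "hrStorage"],
    "cisco": ["system", "if_mib", "cisco_device", "cisco_fc_fe"],
    "juniper": ["system", "if_mib"],
    "mikrotik": ["system", "if_mib", "mikrotik", "hrStorage", "hrDevice"],
}


def modules_for(metrics_sources):
    """Return the de-duplicated modules for the given sources, in first-seen order."""
    seen = []
    for source in metrics_sources:
        for module in SNMP_EXPORTER_MODULES[source]:
            if module not in seen:
                seen.append(module)
    return seen


# Matches the metricsSource list used in the jsonnet example.
print(modules_for(["juniper", "mikrotik"]))
# ['system', 'if_mib', 'mikrotik', 'hrStorage', 'hrDevice']
```

An unknown source raises `KeyError`, which is a deliberate choice in this sketch so that a typo in a source name is noticed early.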
### As monitoring-mixin

You can quickly generate dashboards and alerts by using the monitoring-mixin entry point, `mixin.libsonnet`:

- Adjust `config.libsonnet` to your needs. For example, set `metricsSource` to the sources that correspond to the network vendors you use on your network.
- Run:
```
make dashboards_out
make prometheus_alerts.yaml
```

### Logs support

Note: Logs support is enabled by default. To opt out, set `enableLokiLogs: false` in the config before generating dashboards from this library.

This SNMP observability library can show syslog messages collected by Grafana Alloy and rsyslog agents and stored in Grafana Loki.

To collect syslog messages, do the following (the example is for Cisco syslog):

1. Set up the rsyslog agent with the following rsyslog.conf:

```
module(load="imudp")
#https://www.rsyslog.com/doc/master/configuration/modules/pmciscoios.html
module(load="pmciscoios")
# Pick your port to taste
input(type="imudp" port="30514" ruleset="withOrigin")
timezone(id="<yourtimezone>" offset="00:00")
# instead of -x
global(net.enableDNS="off")

$template raw,"%msg:2:2048%\n"

ruleset(name="common") {
    # Forward everything
    if ($fromhost-ip != "127.0.0.1" ) then action(type="omfwd"
        protocol="tcp" target="localhost" port="30514"
        Template="RSYSLOG_SyslogProtocol23Format"
        TCP_Framing="octet-counted" KeepAlive="on"
        action.resumeRetryCount="-1"
        queue.type="linkedlist" queue.size="50000")
    *.* /dev/stdout; raw
}

ruleset(name="withoutOrigin" parser="rsyslog.ciscoios") {
    /* this ruleset uses the default parser which was
     * created during module load
     */
    call common
}

parser(name="custom.ciscoios.withOrigin" type="pmciscoios"
       present.origin="on")
ruleset(name="withOrigin" parser="custom.ciscoios.withOrigin") {
    /* this ruleset uses the parser defined immediately above */
    call common
}
```

2. Set up the Alloy agent with the following snippet (adjust to your setup):

```
// LOGS
loki.write "default" {
  endpoint {
    // loki.write expects the full push API URL, not just host:port
    url = "http://loki:3100/loki/api/v1/push"
  }
}

loki.source.api "default" {
  http {
    listen_address = "0.0.0.0"
    listen_port    = 3500
  }
  forward_to = [
    loki.process.limit.receiver,
  ]
}
loki.process "limit" {
  stage.limit {
    rate          = 10000
    burst         = 20000
    // drop lines over the limit instead of delaying them
    drop          = true
    by_label_name = "hostname"
  }
  forward_to = [
    loki.write.default.receiver,
  ]
}


// SYSLOG specific:
loki.source.syslog "default" {
  listener {
    address                = "0.0.0.0:30514"
    protocol               = "tcp"
    use_incoming_timestamp = true
    labels                 = { job = "syslog" }
  }

  forward_to    = [loki.process.syslog.receiver]
  relabel_rules = loki.relabel.syslog.rules
}

loki.relabel "syslog" {
  forward_to = []

  rule {
    source_labels = ["__syslog_message_hostname"]
    target_label  = "sysname"
  }
  rule {
    source_labels = ["__syslog_message_hostname"]
    target_label  = "instance"
  }
  rule {
    source_labels = ["__syslog_message_app_name"]
    target_label  = "syslog_app_name"
  }
  rule {
    source_labels = ["__syslog_message_severity"]
    target_label  = "level"
  }
  rule {
    source_labels = ["__syslog_message_facility"]
    target_label  = "facility"
  }
  rule {
    source_labels = ["__syslog_message_msg_id"]
    target_label  = "syslog_msg_id"
  }
}
//cisco_rfc3164_logs
loki.process "syslog" {
  stage.match {
    // match only cisco unparsed logs like https://regex101.com/r/v0MyiB/6
    // from ASA or NX-OS
    selector = `{instance!=""} |~ "<\\d+>.+%.+"`
    stage.regex {
      expression = `<\d+?>((?P<sysname>[a-zA-Z0-9\-\.]+):)?(?P<date_and_other>.+): (?P<appname>%.+?): (?P<msg>.+)`
    }
    stage.labels {
      values = {
        sysname         = "",
        syslog_app_name = "appname",
      }
    }
    stage.output {
      source = "msg"
    }
  }

  forward_to = [loki.process.limit.receiver]
}

```

3. Set up syslog on the device side according to the vendor documentation.

For Cisco devices, set the origin option: `logging origin-id hostname`.

4. View syslog messages on the separate logs dashboard. Critical events that were collected also appear as dashboard annotations.

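To see what the `stage.regex` expression in step 2 extracts, the same pattern can be tried outside Alloy. The sketch below uses Python's `re` module, which accepts the same `(?P<name>...)` named-group syntax. The sample log lines and the function name `parse_cisco_line` are made up for illustration, and Python's engine is only assumed to behave like Alloy's regex engine for this particular pattern.

```python
import re

# Same expression as in the stage.regex block of the Alloy snippet.
CISCO_RE = re.compile(
    r"<\d+?>((?P<sysname>[a-zA-Z0-9\-\.]+):)?"
    r"(?P<date_and_other>.+): (?P<appname>%.+?): (?P<msg>.+)"
)


def parse_cisco_line(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = CISCO_RE.search(line)
    return match.groupdict() if match else None


# Hypothetical line as sent by a device configured with `logging origin-id hostname`.
with_origin = (
    "<189>sw-core-1: *Mar  1 00:01:02.345: %LINK-3-UPDOWN: "
    "Interface GigabitEthernet0/1, changed state to up"
)
fields = parse_cisco_line(with_origin)
print(fields["sysname"])  # sw-core-1
print(fields["appname"])  # %LINK-3-UPDOWN
print(fields["msg"])      # Interface GigabitEthernet0/1, changed state to up
```

In the Alloy pipeline, `sysname` and `appname` become the `sysname` and `syslog_app_name` labels, and `msg` replaces the log line through `stage.output`. When the origin prefix is missing, the optional `sysname` group simply stays empty.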
## Examples

SNMP fleet:
![fleet](snmp_fleet.png)

SNMP overview:
![overview](snmp_overview.png)

SNMP logs:
![logs](snmp_logs.png)
