@levindecaro Your expected metric would be inaccurate, because it's not the whole md125 array that has been removed, but rather just one of the component devices. From the output of your mdadm command, the md125 array is still functioning (and would continue to do so, since it's raid1 and still has one leg working).
What you instead need is a metric for the state of individual component devices if you want to see if they have been removed.
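However, you could have alerted on the condition you encountered with a node_md_disks{state="failed"} > 0 alerting rule. Alternatively, node_md_disks_required - ignoring (state) node_md_disks{state="active"} > 0 would probably also do the trick; the ignoring (state) modifier is needed because node_md_disks_required carries no state label, so the two series would not otherwise match.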
Having said that, the existing implementation of the procfs library's parsing of /proc/mdstat masks some of the low-level details and this is why I have proposed a new direction with prometheus/procfs#509.
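For reference, the two alerting expressions mentioned above could be written as a Prometheus rule file roughly like the sketch below. The group name, alert names, `for` durations, severity labels and annotation text are all assumptions chosen for illustration; only the metric names and the `state` label values come from this thread. The second expression uses `ignoring (state)` because `node_md_disks_required` has no `state` label, so a plain subtraction would find no matching series.

```yaml
groups:
  - name: md-raid            # hypothetical group name
    rules:
      # Fires when any component device of an array is reported as failed.
      - alert: MDRaidDiskFailed          # hypothetical alert name
        expr: node_md_disks{state="failed"} > 0
        for: 5m                          # assumed delay, tune as needed
        labels:
          severity: warning              # assumed severity scheme
        annotations:
          summary: 'MD array {{ $labels.device }} on {{ $labels.instance }} has failed component devices.'

      # Fires when fewer devices are active than the array requires,
      # which also covers a component that was removed rather than failed.
      - alert: MDRaidDegraded            # hypothetical alert name
        expr: node_md_disks_required - ignoring (state) node_md_disks{state="active"} > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'MD array {{ $labels.device }} on {{ $labels.instance }} is missing {{ $value }} required device(s).'
```

Neither rule identifies which component device is affected. They only report that the array as a whole is short of healthy members, which is the limitation described above.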
Host operating system: output of
uname -a
Linux sds-3 4.18.0-305.7.1.el8_4.x86_64 #1 SMP Tue Jun 29 21:55:12 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
node_exporter version: output of
node_exporter --version
node_exporter, version 1.3.1 (branch: HEAD, revision: a2321e7)
build user: root@243aafa5525c
build date: 20211205-11:09:49
go version: go1.17.3
platform: linux/amd64
node_exporter command line flags
Are you running node_exporter in Docker?
no
What did you do that produced an error?
mdadm -D output
What did you expect to see?
node_md_state{device="md125", instance="sds-3", job="sds-nodes", state="removed"}
What did you see instead?
"removed" state metric not yet implemented in node_md_state