-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: consider adopting procfs lib FS.Meminfo() for memory collector #2957
Comments
I don't think we should marshal them to a map though, we should finally make the meminfo metrics follow more the best pratices. E.g using labels for metrics that can be summed up. For that I'd suggest creating a new meminfo collector and deprecate the old one, then in a next major release enabled the new one by default and disable the deprecated one. @SuperQ wdyt? |
Thanks @rexagod! I was mostly waiting on the green light to proceed, I'm still willing to take this on. However, I would be very happy/grateful if you would be willing to help review the PR once it's pushed and/or PR against my branch if you want to collaborate more. Initial thoughts/questions for feedback:
(sorry for the stream of consciousness, like I said, initial thoughts 🙃 ) |
If it's a new collector, it can be disabled/enabled - so no 'feature flag' specifically
If we can support the other OSes with the new collector, cool - if not, we can add support for that later.
Yes, I'd say we make the collectors mutually exclusive so you can use the same metric names where it makes sense
The general best practices apply, so yeah we shouldn't mix counters and gauges. Only things where sum() makes sense should be labels in the same metric.
Thats why I suggest a new collector (and mark the old one deprecated eventually), downstream projects can still use the old one but get warnings that it is deprecated |
I'd be happy to review your PR, @tjhop! Feel free to tag me there once its up! Godspeed! 👋🏼 |
Part of prometheus#2957 Adds a new collector named `meminfo_procfs` that exposes memory metrics in a format that attemps to be more inline with upstream conventions -- memory metrics are now exposed under a single metric named `node_memory_bytes` that has a single label called `field`, corresponding to the name of the field in `/proc/meminfo` that value represents. Label values for the `field` label are named according to the struct field in the procfs.Meminfo struct, and the values always use the byte-normalized counterpart fields in the procfs.Meminfo struct, resulting in a transition such as the following: `node_memory_Active_anon_bytes -> node_memory_bytes{field="ActiveAnon"}` Notes: currently linux only, as that is the focus of the procfs lib. once consensus has been reached here on new metric name/labels/format, I can expand coverage for darwin/openbsd/netbsd and forward-port the existing meminfo collector for those platforms to use the updated format. Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
This is actually quite intentional, and the recommended way to do things in Go. Struct breakage is explicit at compile time, so it's quite stable. Dynamic mapping, while common and convenient for the developer, is fragile. I much prefer explicit struct-to-metric mapping like is done in other collectors. For example, take a look at the I don't see a major need to create a new collector. Just convert the existing dynamic mapping to an explicit mapping. |
Depends on whether we want to fix/change the metric names |
@SuperQ I've grown to agree with you since my last comment, re: explicit struct mapping and have taken that approach in the PR. I'm happy to re-scope #3043 to just refactoring the existing |
Yea, let's just do the minimal migration and do any metric renaming as a separate task. Thanks! |
Part of prometheus#2957 Prometheus' procfs lib supports collecting memory info and we're using a new enough version of the lib that has it available, so this converts the meminfo collector for Linux to use data from procfs lib instead. The bits I've touched for darwin/openbsd/netbsd are with intent to preserve the original struct implementation/backwards compatibility. Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Part of prometheus#2957 Prometheus' procfs lib supports collecting memory info and we're using a new enough version of the lib that has it available, so this converts the meminfo collector for Linux to use data from procfs lib instead. The bits I've touched for darwin/openbsd/netbsd are with intent to preserve the original struct implementation/backwards compatibility. Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
* ref!: convert linux meminfo implementation to use procfs lib Part of #2957 Prometheus' procfs lib supports collecting memory info and we're using a new enough version of the lib that has it available, so this converts the meminfo collector for Linux to use data from procfs lib instead. The bits I've touched for darwin/openbsd/netbsd are with intent to preserve the original struct implementation/backwards compatibility. Signed-off-by: TJ Hoplock <t.hoplock@gmail.com> * fix: meminfo debug log unsupported value Fixes: ``` ts=2024-06-11T19:04:55.591Z caller=meminfo.go:44 level=debug collector=meminfo msg="Set node_mem" memInfo="unsupported value type" ``` Signed-off-by: TJ Hoplock <t.hoplock@gmail.com> * fix: don't coerce nil Meminfo entries to 0, leave out if nil Nil entries in procfs.Meminfo fields indicate that the value isn't present on the system. Coercing those nil values to `0` introduces new metrics on systems that should not be present and can break some queries. Addresses PR feedback: #3049 (comment) #3049 (comment) Signed-off-by: TJ Hoplock <t.hoplock@gmail.com> --------- Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
Circling back to this -- the node exporter has been updated to use procfs lib for the meminfo collector, so I believe the core of this issue is complete. Are we ok with opening a new issue if/when it's time to discuss renaming the metrics? |
…eus#3049) * ref!: convert linux meminfo implementation to use procfs lib Part of prometheus#2957 Prometheus' procfs lib supports collecting memory info and we're using a new enough version of the lib that has it available, so this converts the meminfo collector for Linux to use data from procfs lib instead. The bits I've touched for darwin/openbsd/netbsd are with intent to preserve the original struct implementation/backwards compatibility. Signed-off-by: TJ Hoplock <t.hoplock@gmail.com> * fix: meminfo debug log unsupported value Fixes: ``` ts=2024-06-11T19:04:55.591Z caller=meminfo.go:44 level=debug collector=meminfo msg="Set node_mem" memInfo="unsupported value type" ``` Signed-off-by: TJ Hoplock <t.hoplock@gmail.com> * fix: don't coerce nil Meminfo entries to 0, leave out if nil Nil entries in procfs.Meminfo fields indicate that the value isn't present on the system. Coercing those nil values to `0` introduces new metrics on systems that should not be present and can break some queries. Addresses PR feedback: prometheus#3049 (comment) prometheus#3049 (comment) Signed-off-by: TJ Hoplock <t.hoplock@gmail.com> --------- Signed-off-by: TJ Hoplock <t.hoplock@gmail.com> Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
As of #2952, the node exporter has been bumped to use procfs lib v0.13.0, which has a fix for safer meminfo parsing from
/proc/meminfo
. This means it's possible to move away from the custom meminfo parsing the node exporter currently does and use the updated library's parsing instead.Considerations:
The node exporter memory collector's
Update()
func uses and expects memory info to be returned as amap[string]float64
from the various platform implementations, which means that even if we adopt the library's updated memory info parsing, we would then need to convert the struct into the expected map type. This can be done with a quick json Marshal/Unmarshal dance playground, if we're willing to pullencoding/json
in as a dependency. I'd really rather avoid manually/explicitly parsing out the struct fields as it feels fragile and prone to breakage on procfs updates, so ideas welcome.I'm willing to implement the changes if the concepts here are accepted 👍
The text was updated successfully, but these errors were encountered: