-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collect CPU temperatures on FreeBSD #397
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also needs rebasing on the ConstMetrics change
@dominikh ping |
Pong. PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added two comments. Ideally we should also have tests for these, but that'll require a better test suite first with OS specific tests.
temp, err := unix.SysctlUint32(fmt.Sprintf("dev.cpu.%d.temperature", cpu)) | ||
if err == nil { | ||
ftemp := float64(temp-2732) / 10 | ||
c.temp.mustNewConstMetric(ftemp, lcpu) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result needs to be sent to the channel.
ch <- c.cpu.mustNewConstMetric(float64(t.idle), lcpu, "idle") | ||
|
||
temp, err := unix.SysctlUint32(fmt.Sprintf("dev.cpu.%d.temperature", cpu)) | ||
if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the error case, I think we should still set the metric, probably best to NaN
and log the error at least at debug level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most common error here will be that there's no temperature at all, either because the CPU isn't supported or because the correct driver (coretemp or k8temp) wasn't loaded. Should that still be logged at debug level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I'd expect such information as debug output to understand why metrics are missing.
Ideally we could distinguish temporary errors from unsupported errors. Exporting NaN would make the first case a lot clearer, but not setting the metric at all makes probably more sense in the second case.
PTAL. We're logging errors now, setting NaN for unexpected errors, and actually sending the metric to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes! LGTM once the right log package is used.
"strconv" | ||
"unsafe" | ||
|
||
"github.com/prometheus/client_golang/prometheus" | ||
"github.com/prometheus/log" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prometheus/log is deprecated and has been moved to prometheus/common/log. Only the new package has been vendored in node_exporter.
Log why we couldn't collect the temperature, and set metric to NaN if the CPU should support temperature collection but had an error.
Fixed and squashed, and given goimports a stern look. |
Nice, thanks! |
Prometheus node exporter checks the returned error variable of `NVMeClass` for being `os.ErrNotExist`. This check will never match, because `NVMeClass` uses `fmt.Errorf` to convert the error. So wrap the error result using `%w` to allow checking for `os.ErrNotExist`. Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
No description provided.