Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hwmon: Provide annotation metric to link chip sysfs paths to human-readable chip types #359

Merged

Conversation

horazont
Copy link
Contributor

@horazont horazont commented Nov 29, 2016

The chip label generation has been changed in #334 to prefer the unique device path (e.g. the location on the PCI bus) due to #333.

After discussion in #355, we learnt that the way forward is to provide an annotation metric (instead of a new label on the existing metrics). The annotation metric allows to link the unique chip sysfs path to a human-readable chip type which may not be unique among chip sysfs paths (for example, dual-slot systems have multiple chipType="coretemp" sensors).

This allows to mitigate the downsides of the solution to #333 (namely that the device path may not be stable across kernels and reboots) for cases where it does not matter that multiple devices may have the same human-readable type (e.g. aggregation or where at most one device of a type is present).

For cases where no human-readable type can be derived, the annotation metric is not emitted.

(Note: The below examples have been modified to match the most recent commits in this pull request.)

The annotation metric can be used in queries like this:

  1. Find the voltages of a nouveau-driven GPU:

     node_hwmon_in_volts and on (chip) node_hwmon_chip_names{chip_name="nouveau"}
    
  2. Find the temperatures of a GPU using the amdgpu driver:

     node_hwmon_temp_celsius and on (chip) node_hwmon_chip_names{chip_name="amdgpu"}
    
  3. Find all CPU temperatures:

     node_hwmon_temp_celsius and on (chip) node_hwmon_chip_names{chip_name="coretemp"}
    

Copy link
Member

@SuperQ SuperQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

@brian-brazil
Copy link
Contributor

brian-brazil commented Nov 29, 2016 via email

@SuperQ
Copy link
Member

SuperQ commented Nov 29, 2016

Do these map in any way to kernel module names?

@horazont
Copy link
Contributor Author

horazont commented Nov 29, 2016

I am not invested in any naming here, I just found that "type" seems more fitting than "name" (it is not the name of the chip, but some kind of type is fitting).

I’m not sure how those relate to kernel modules. It might be that those are kernel module names, at least I can find "coretemp" and "nouveau" loaded on my machine and "amdgpu" on another.

I’m happy to adjust the commits when a consensus is reached. I don’t have a strong opinion in any direction.

The sysfs documentation states:

name            The chip name.
                This should be a short, lowercase string, not containing
                spaces nor dashes, representing the chip name. This is
                the only mandatory attribute.
                I2C devices get this attribute created automatically.
                RO

@SuperQ
Copy link
Member

SuperQ commented Nov 29, 2016

chip_name works for me.

The chip label generation has been changed in prometheus#334 to prefer the
unique device path (e.g. the location on the PCI bus) due to prometheus#333.

Here, a new annotation metric ``node_hwmon_chip_names`` is
introduced which allows to link the unique chip sysfs path to a
human-readable chip name which may not be unique among chip sysfs
paths (for example, dual-slot systems have multiple
chipType="coretemp" sensors).

This allows to mitigate the downsides of the solution to prometheus#333
(namely that the device path may not be stable across kernels and
reboots) for cases where it does not matter that multiple devices
may have the same human-readable name (e.g. aggregation or where
at most one device with a common chip name is present).

For cases where no human-readable name can be derived, the
annotation metric is not emitted.
@horazont horazont force-pushed the feature/hwmon_chip_name_metric branch from 11020d1 to 3efaa1a Compare December 1, 2016 09:01
@horazont
Copy link
Contributor Author

horazont commented Dec 1, 2016

@brian-brazil @SuperQ Done! It’s chip_name / chip_names now.

Copy link
Member

@discordianfish discordianfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small nitpicks, beside that it looks good.

@@ -143,6 +144,26 @@ func (c *hwMonCollector) updateHwmon(ch chan<- prometheus.Metric, dir string) (e
}
}

hwmonChipName, err := c.hwmonHumanReadableChipName(dir)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove newline, it's uncommon to have a newline between returning and error and checking it.

@@ -351,6 +372,27 @@ func (c *hwMonCollector) hwmonName(dir string) (string, error) {
return "", errors.New("Could not derive a monitoring name for " + dir)
}

func (c *hwMonCollector) hwmonHumanReadableChipName(dir string) (string, error) {
// this is similar to the methods in hwmonName, but with different
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this above the function and start with "hwmonHumanReadableChipName is similar to ...."?

@discordianfish
Copy link
Member

I'll just fix the comments etc myself. Thanks for your contribution!

@discordianfish discordianfish merged commit b68a9ec into prometheus:master Jan 3, 2017
@SuperQ SuperQ mentioned this pull request Jan 15, 2017
@grobie grobie mentioned this pull request Mar 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants