Add Support for GPU Clock Frequencies (sclk and mclk) in hwmon Collector #3093

Open · wants to merge 6 commits into master

@zxhdaze zxhdaze commented Aug 22, 2024

Hi team,

I have added a new feature to the hwmon collector that enables the monitoring of GPU clock frequencies, specifically sclk (Shader Clock) and mclk (Memory Clock).

The hwmon collector now reads the following data from the /sys/class/hwmon directory:

/sys/class/hwmon/hwmon0/freq1_input:214000000
/sys/class/hwmon/hwmon0/freq1_label:sclk
/sys/class/hwmon/hwmon0/freq2_input:300000000
/sys/class/hwmon/hwmon0/freq2_label:mclk

and generates the following metrics:

# HELP node_hwmon_freq_freq_mhz Hardware monitor for GPU frequency in MHz
# TYPE node_hwmon_freq_freq_mhz gauge
node_hwmon_freq_freq_mhz{chip="0000:9d:00_0_0000:9e:00_0",sensor="mclk"} 300
node_hwmon_freq_freq_mhz{chip="0000:9d:00_0_0000:9e:00_0",sensor="sclk"} 214
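
For reviewers, here is a minimal sketch of the conversion shown above, assuming `freqN_input` is reported in Hz (as the sample values suggest) and the metric is exposed in MHz. The helper names `hzToMHz` and `readFreqSensors` are hypothetical and are not taken from the patch; the real collector code may be structured differently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// hzToMHz converts a raw freqN_input value (assumed to be Hz) to MHz.
func hzToMHz(hz float64) float64 {
	return hz / 1e6
}

// readFreqSensors is a hypothetical helper, not the collector's actual code.
// It scans dir for freqN_input files and returns a map of label -> MHz.
func readFreqSensors(dir string) (map[string]float64, error) {
	inputs, err := filepath.Glob(filepath.Join(dir, "freq*_input"))
	if err != nil {
		return nil, err
	}
	out := make(map[string]float64, len(inputs))
	for _, input := range inputs {
		raw, err := os.ReadFile(input)
		if err != nil {
			return nil, err
		}
		hz, err := strconv.ParseFloat(strings.TrimSpace(string(raw)), 64)
		if err != nil {
			return nil, fmt.Errorf("parse %s: %w", input, err)
		}
		// Fall back to the sensor name (e.g. "freq1") when no label file exists.
		base := strings.TrimSuffix(filepath.Base(input), "_input")
		label := base
		if b, err := os.ReadFile(filepath.Join(dir, base+"_label")); err == nil {
			label = strings.TrimSpace(string(b))
		}
		out[label] = hzToMHz(hz)
	}
	return out, nil
}

func main() {
	// Build a throwaway directory mirroring the sysfs files quoted above.
	dir, err := os.MkdirTemp("", "hwmon")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	files := map[string]string{
		"freq1_input": "214000000",
		"freq1_label": "sclk",
		"freq2_input": "300000000",
		"freq2_label": "mclk",
	}
	for name, content := range files {
		if err := os.WriteFile(filepath.Join(dir, name), []byte(content+"\n"), 0o644); err != nil {
			panic(err)
		}
	}

	sensors, err := readFreqSensors(dir)
	if err != nil {
		panic(err)
	}
	fmt.Println(sensors["sclk"], sensors["mclk"]) // prints: 214 300
}
```

The sketch reads from a temporary directory rather than `/sys/class/hwmon`, so it can be run on a machine without a GPU.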

Please let me know if there are any questions or if further adjustments are needed.

Thank you!

Closes #3092

@zxhdaze zxhdaze marked this pull request as ready for review August 22, 2024 21:01

aieri commented Aug 23, 2024

Here's the relevant kernel documentation reference: https://docs.kernel.org/gpu/amdgpu/thermal.html

jneo8 commented Aug 23, 2024

Hi, I know there is currently no unit test for hwmon_linux. Would it be possible to provide one?

SuperQ (Member) commented Aug 23, 2024

You can add a test file to the fixtures. This way it will be tested by the end-to-end-test.sh script.

zxhdaze (Author) commented Aug 23, 2024

We're happy to provide tests for this if you'd like, but we'll leave it up to you (upstream) to decide whether you want to include them or keep it as is.

Signed-off-by: Xuhui Zhu <simon.zhu@canonical.com>

discordianfish (Member) left a comment

LGTM, thanks!

aieri commented Sep 23, 2024

Thanks for reviewing! Could someone with write access run the workflows, please?

Deezzir commented Sep 24, 2024

> Thanks for reviewing! Could someone with write access run the workflows, please?

+1

6 participants