[inputs.systemd_units] cpu usage goes through the roof! #15104

Closed
electrofloat opened this issue Apr 4, 2024 · 9 comments · Fixed by #15108
Labels
bug: unexpected problem or unintended behavior
regression: something that used to work, but is now broken

Comments

@electrofloat

Relevant telegraf.conf

[[inputs.systemd_units]]

Logs from Telegraf

-

System info

Ubuntu 22.04

Docker

No response

Steps to reproduce

Start Telegraf with [[inputs.systemd_units]] enabled.

Expected behavior

Same cpu usage as before.

Actual behavior

With the new implementation of systemd_units, every time Telegraf collects metrics (every 10s), a systemd process consumes a lot of CPU.

Attached are 2 pictures of Grafana graphs; I think you can guess when I upgraded to 1.30.1. I've since reverted to 1.29.5-1.

Screenshot from 2024-04-04 16-35-19

Screenshot from 2024-04-04 16-35-41

Additional info

No response

@electrofloat added the bug label Apr 4, 2024
@srebhan
Member

srebhan commented Apr 4, 2024

@electrofloat thanks for reporting the issue! During the rewrite we also added some features like reporting disabled/unloaded units and my guess would be that this causes the issue. Could you assist in debugging the issue by testing some PR binaries?

@srebhan self-assigned this Apr 4, 2024
@srebhan added the regression label Apr 4, 2024
@electrofloat
Author

@srebhan I can try, yes.

@srebhan
Member

srebhan commented Apr 4, 2024

@electrofloat thanks! Can you please test the binary in PR #15108, available once CI has finished the tests? Please set the new option loaded_only to true and let me know if this helps!
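
For reference, a minimal sketch of the config this request implies. It assumes that loaded_only is a boolean set directly under the plugin table and that it exists only in the PR #15108 test binary at this point in the thread:

```toml
[[inputs.systemd_units]]
  ## Assumption: option name and value as stated in the comment above;
  ## only understood by the PR test binary at this stage.
  loaded_only = true
```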

@electrofloat
Author

electrofloat commented Apr 4, 2024

It is better. Instead of about 100%, it now only causes the PID 1 systemd process to use about 15% CPU every 10 seconds, but v1.29.5-1 uses 0% (I've just checked).

@srebhan
Member

srebhan commented Apr 9, 2024

@electrofloat I added another optimization. Can you please test the binary in the PR once CI has finished the tests? Let me know if this reduces the load further...

@electrofloat
Author

It seems much better now. The systemd process seems to stay at 0% with loaded_only set to true. I haven't tested whether it still provides exactly the same metrics as before; I only tested the CPU usage.

Also, shouldn't the new option default to true, so that it works exactly like before the rewrite and people don't have to change their config to keep the previous (pre-rewrite) behavior?

@srebhan
Member

srebhan commented Apr 10, 2024

@electrofloat I will discuss it with the team, as unfortunately we released v1.30.0 with the behavior of also reporting non-loaded units...

@srebhan
Member

srebhan commented Apr 11, 2024

@electrofloat we decided to revert to the pre-v1.30.0 behavior and I adapted the PR accordingly. Could you please test it again, just to be sure? Please note that the option is now inverted and is no longer called loaded_only but collect_disabled_units.
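
A minimal sketch of what the inverted option would look like in telegraf.conf. It assumes that collect_disabled_units is a boolean and that its default is false, which is inferred from the revert to the pre-v1.30.0 behavior rather than stated explicitly here:

```toml
[[inputs.systemd_units]]
  ## Assumed default after the revert: report loaded units only,
  ## i.e. the pre-v1.30.0 behavior. Omitting the line should be equivalent.
  collect_disabled_units = false

  ## Opt in to also report disabled/unloaded units. Per this thread, doing so
  ## can cause noticeable CPU load in the PID 1 systemd process.
  # collect_disabled_units = true
```

With the default, no config change should be needed to get the old behavior back.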

@electrofloat
Author

Yes, it seems to be working as expected now, without any config change.
