Telegraf should send stats about itself #1348

Closed
lswith opened this issue Jun 8, 2016 · 7 comments · Fixed by #2043

@lswith
Contributor

lswith commented Jun 8, 2016

Feature Request

Proposal:

I would like to see telegraf send metrics about its own internals, essentially what it already prints to its logs:

  • time taken to send to each output
  • how many metrics were sent to each output

It would also be nice for telegraf to count the number of errors it receives from a particular input and send that as well, i.e. count failures.

Use case:

At my company we use telegraf extensively and would like to make sure it is running under the expected load. Having telegraf report these statistics would let us monitor its performance more closely.
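
For illustration, a minimal Go sketch of what collecting these two numbers could look like, assuming hypothetical Output, Metric and WriteStats types rather than telegraf's actual internals:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// Metric and Output are placeholders for illustration; telegraf's real
// types live in its own packages and may look different.
type Metric struct{}

type Output interface {
	Write(metrics []Metric) error
}

// nullOutput is a dummy output that just sleeps briefly to simulate a flush.
type nullOutput struct{}

func (nullOutput) Write(metrics []Metric) error {
	time.Sleep(10 * time.Millisecond)
	return nil
}

// WriteStats holds the self-metrics the issue asks for: flush duration,
// metrics written, and write failures.
type WriteStats struct {
	MetricsWritten uint64
	WriteErrors    uint64
	LastWriteTime  time.Duration
}

// timedWrite wraps an output write, recording how long the flush took and
// how many metrics were sent, which is essentially what telegraf logs today.
func timedWrite(o Output, batch []Metric, stats *WriteStats) error {
	start := time.Now()
	err := o.Write(batch)
	stats.LastWriteTime = time.Since(start)
	if err != nil {
		atomic.AddUint64(&stats.WriteErrors, 1)
		return err
	}
	atomic.AddUint64(&stats.MetricsWritten, uint64(len(batch)))
	return nil
}

func main() {
	var stats WriteStats
	batch := make([]Metric, 100)
	_ = timedWrite(nullOutput{}, batch, &stats)
	fmt.Printf("wrote %d metrics in %s (errors: %d)\n",
		stats.MetricsWritten, stats.LastWriteTime, stats.WriteErrors)
}
```

An agent could then emit MetricsWritten, WriteErrors and LastWriteTime as regular metrics on each flush.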

@sparrc
Contributor

sparrc commented Jun 8, 2016

agreed, this would be a good idea 👍

@lswith lswith changed the title Telegraf send stats about itslef Telegraf send stats about itself Jun 10, 2016
@lswith lswith changed the title Telegraf send stats about itself Telegraf should send stats about itself Jun 10, 2016
@stuartbfox

Big +1 from me as well

@zp-markusp

+1

@toni-moreno
Contributor

I would also like to have basic system info about the telegraf process (to check its impact on the host system), something like:

  • % cpu
  • memory
  • open files

And also Go runtime related stats: heap / GC / goroutines (you can see how I did it in another InfluxDB agent):
https://github.com/toni-moreno/snmpcollector/blob/master/pkg/selfmon.go

Thank you very much for this great tool!
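
For the Go runtime part of this (heap / GC / goroutines), a minimal sketch using only the standard library; CPU %, resident memory and open file descriptors need OS-specific calls and are left out:

```go
package main

import (
	"fmt"
	"runtime"
)

// runtimeStats collects the process-level Go runtime numbers mentioned
// above: heap usage, GC activity, and goroutine count.
func runtimeStats() map[string]uint64 {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	return map[string]uint64{
		"heap_alloc_bytes": ms.HeapAlloc,     // bytes of allocated heap objects
		"heap_inuse_bytes": ms.HeapInuse,     // bytes in in-use heap spans
		"gc_runs":          uint64(ms.NumGC), // completed GC cycles
		"gc_pause_ns":      ms.PauseTotalNs,  // cumulative GC pause time
		"goroutines":       uint64(runtime.NumGoroutine()),
	}
}

func main() {
	for name, value := range runtimeStats() {
		fmt.Printf("%s=%d\n", name, value)
	}
}
```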

@zp-markusp

Since some telegraf checks seem to be quite expensive when the system is under heavy load, this issue is getting more and more important for me.
I really need to be able to see how long certain checks took in the past so that, in the worst case, I can disable those checks on mission-critical systems.

So for me the most critical metrics would be:

  • total execution time of all checks (being able to alert if execution time > execution interval)
  • execution time of individual checks (being able to identify which checks are causing a high execution time)
  • peak mem usage during execution
  • total cpu usage

regards, Markus
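
A rough sketch of the "execution time > execution interval" check, using a hypothetical Input interface rather than telegraf's real plugin API:

```go
package main

import (
	"log"
	"time"
)

// Input stands in for a single check/input plugin; the names are hypothetical.
type Input interface {
	Name() string
	Gather() error
}

// slowInput simulates a check that takes longer than its interval.
type slowInput struct{}

func (slowInput) Name() string { return "slow_check" }
func (slowInput) Gather() error {
	time.Sleep(150 * time.Millisecond)
	return nil
}

// gatherWithTiming runs one check, records its duration, and flags it when
// the check takes longer than the configured collection interval.
func gatherWithTiming(in Input, interval time.Duration) time.Duration {
	start := time.Now()
	if err := in.Gather(); err != nil {
		log.Printf("input %q failed: %v", in.Name(), err)
	}
	elapsed := time.Since(start)
	if elapsed > interval {
		// This is the condition to alert on.
		log.Printf("input %q took %s, longer than the %s interval",
			in.Name(), elapsed, interval)
	}
	return elapsed
}

func main() {
	gatherWithTiming(slowInput{}, 100*time.Millisecond)
}
```

Emitting the elapsed value per input would also cover the "which checks are slow" case.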

@kostasb

kostasb commented Sep 13, 2016

  • number of ingested points, per input and in total

@johnrengelman
Contributor

johnrengelman commented Sep 24, 2016

Other items to report:

  • metric buffer usage
  • number of dropped metrics
  • number of reported metrics

These are necessary when running a telegraf relay layer using http_listener.
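
A rough illustration of how buffer usage and drop counts could be tracked, assuming a simple fixed-capacity buffer rather than telegraf's actual metric buffer:

```go
package main

import "fmt"

// Metric is a placeholder type for illustration.
type Metric struct{}

// MetricBuffer is a bounded buffer that counts the numbers a relay would
// need to report: buffer usage, dropped metrics, and reported metrics.
type MetricBuffer struct {
	buf      []Metric
	capacity int
	Added    uint64 // metrics accepted into the buffer
	Dropped  uint64 // metrics rejected because the buffer was full
	Reported uint64 // metrics handed off to an output
}

func NewMetricBuffer(capacity int) *MetricBuffer {
	return &MetricBuffer{capacity: capacity}
}

// Add appends a metric, dropping it when the buffer is full.
func (b *MetricBuffer) Add(m Metric) {
	if len(b.buf) >= b.capacity {
		b.Dropped++
		return
	}
	b.buf = append(b.buf, m)
	b.Added++
}

// Flush drains the buffer and counts the drained metrics as reported.
func (b *MetricBuffer) Flush() []Metric {
	out := b.buf
	b.buf = nil
	b.Reported += uint64(len(out))
	return out
}

// Usage returns how full the buffer currently is, as a fraction.
func (b *MetricBuffer) Usage() float64 {
	return float64(len(b.buf)) / float64(b.capacity)
}

func main() {
	b := NewMetricBuffer(2)
	for i := 0; i < 3; i++ {
		b.Add(Metric{})
	}
	fmt.Printf("usage=%.2f added=%d dropped=%d\n", b.Usage(), b.Added, b.Dropped)
	b.Flush()
	fmt.Printf("reported=%d\n", b.Reported)
}
```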

@sparrc sparrc added this to the 1.3.0 milestone Nov 4, 2016
sparrc added commits that referenced this issue between Nov 7 and Nov 8, 2016
@sparrc sparrc modified the milestones: 1.2.0, 1.3.0 Nov 14, 2016
sparrc added commits that referenced this issue between Nov 15 and Dec 5, 2016