Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metric tracking within external plugins #15572

Closed
Wimmelwaldi opened this issue Jun 27, 2024 · 2 comments · Fixed by #15636
Closed

Metric tracking within external plugins #15572

Wimmelwaldi opened this issue Jun 27, 2024 · 2 comments · Fixed by #15636
Labels
feature request Requests for new plugin and for new features to existing plugins waiting for response waiting for response from contributor

Comments

@Wimmelwaldi
Copy link
Contributor

Relevant telegraf.conf

[agent]
  interval = "10s"
  debug = true
  logtarget = "stderr"
  logfile = ""

[[inputs.execd]]
  command = ["/etc/telegraf/extern_plugin/mqtt_consumer", "-config", "/etc/telegraf/extern_plugin/plugin.conf"]
  signal = "none"
  data_format = "influx"
  name_override = "device"
  tags = {device = "default_device"}

[[outputs.influxdb_v2]]
  urls = ["http://influxdb:8086"]
  token = "xxx"
  organization = "yyy"
  bucket = "zzz"
  tagexclude = ["device", "asset"]
  
  data_format = "json"
  
  [outputs.influxdb_v2.tagpass]
    device = ["default_device"]

Logs from Telegraf

telegraf | 2024-06-25T12:36:32Z D! [outputs.influxdb_v2] Wrote batch of 2 metrics in 13.371321ms
telegraf | 2024-06-25T12:36:32Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
telegraf | 2024-06-25T12:36:42Z D! [outputs.influxdb_v2] Wrote batch of 2 metrics in 5.369446ms
telegraf | 2024-06-25T12:36:42Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
telegraf | 2024-06-25T12:36:52Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
telegraf | 2024-06-25T12:37:02Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
telegraf | 2024-06-25T12:37:12Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
telegraf | 2024-06-25T12:37:22Z D! [outputs.influxdb_v2] Buffer fullness: 0 / 10000 metrics
...

System info

Telegraf 1.31, Windows 10

Docker

No response

Steps to reproduce

Probably the easiest way to reproduce this:

  1. Convert the input plugin mqtt_consumer into an external plugin

Expected behavior

Receive feedback when a metric has been successfully written to stdout so that the plugin does not get stuck.

Actual behavior

The maximum amount of processed messages will be equal to max_undelivered_messages because the metrics are never getting acknowledged.

Additional info

I had already created a post about this in the community forum: https://community.influxdata.com/t/metric-tracking-with-external-plugin/34710

I think it is a serious bug that the methods for metric tracking in an external plugin do not work at all, because the accumulator is provided by shim and is processed by it as well. The metrics were simply neglected to be acknowledged under certain circumstances.

Hence my suggestion, to use the two methods m.Drop() and m.Reject() after the call in this line
https://github.com/influxdata/telegraf/blob/v1.31.0/plugins/common/shim/goshim.go#L121

Yes, the functionality of metric tracking when used in an external plugin would then differ somewhat from the usual application, but it would still provide useful functionality under these circumstances.

@Wimmelwaldi Wimmelwaldi added the bug unexpected problem or unintended behavior label Jun 27, 2024
@powersj
Copy link
Contributor

powersj commented Jun 27, 2024

Hi,

I haven't responded to the forums as I need to dig in and understand the implications of making this change to existing users. In general, no one has ever mentioned tracking metrics with the shim before, so it is unlikely it would impact anyone, but we need to ensure that how we acknowledge the metrics and when is done correctly.

Let me talk to folks on Monday and get back to you.

@powersj powersj added feature request Requests for new plugin and for new features to existing plugins and removed bug unexpected problem or unintended behavior labels Jun 27, 2024
@powersj
Copy link
Contributor

powersj commented Jul 8, 2024

Hi,

My apologies for the delay, I got sick last week and then we had some holiday time.

After chatting with the team, I think we agree that it makes sense to make a change to Telegraf to call m.Accept on the metric. This way the metrics are always considered accepted, internal tracking metrics will show a correct count, rather than a large number of dropped metrics, and it ensures that the max_undelivered_messages will correctly decrement as well.

An additional clear statement to the readme would be needed to ensure that users understand that the plugin calls accepted no matter what.

Are you willing to put up a PR to make this change?

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Requests for new plugin and for new features to existing plugins waiting for response waiting for response from contributor
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants