Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Naming plugins & improve log verbosity #1815

Closed
kostasb opened this issue Sep 27, 2016 · 12 comments · Fixed by #6207
Closed

Naming plugins & improve log verbosity #1815

kostasb opened this issue Sep 27, 2016 · 12 comments · Fixed by #6207
Labels
area/agent feature request Requests for new plugin and for new features to existing plugins
Milestone

Comments

@kostasb
Copy link

kostasb commented Sep 27, 2016

The agent currently throws an error log with the plugin's name in case of a collection interval timeout.

When using multiple input plugins of the same type (e.g. multiple inputs.snmp instances which is a usual case) more verbosity is required to determine which one times out.

https://github.com/influxdata/telegraf/blob/master/agent/agent.go#L174

Since each plugin has different configuration structure they can't all handled the same way for verbosity.

A workaround would be to support a user_comment configuration option for each plugin.

The user_comment message would be thrown at timeout (or by other plugin-level errors as well) to help the user identify the problematic plugin.

If there is no user_comment defined, this section would be omitted.

E.g.

[[inputs.snmp]]
  user_comment = "localhost_polling"
  agents = [ "127.0.0.1:161" ]
  version = 2
  community = "public"

  name = "system"
  [[inputs.snmp.field]]
    name = "hostname"
    oid = ".1.0.0.1.1"
    is_tag = true

ERROR: input snmp: localhost_polling: took longer to collect than collection interval 10s

@sparrc sparrc added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Sep 27, 2016
@sparrc
Copy link
Contributor

sparrc commented Sep 27, 2016

This is actually related to #1778

I think we could solve both issues by providing a way to name plugins, which could be used both for debug information and for routing.

@kostasb
Copy link
Author

kostasb commented Sep 27, 2016

Agreed, naming plugins would be useful for multiple other features as well like routing.

E.g. Instead of "user_comment" there could be a "plugin_alias".

This is a much smaller feature than #1778 (per-plugin routing), however implementing it will significantly help troubleshooting, so it might make sense to consider them separately.

@sparrc
Copy link
Contributor

sparrc commented Sep 27, 2016

agreed, this issue should be implemented as a pre-cursor to #1778

@kostasb kostasb changed the title Improve agent verbosity for plugin timeout Naming plugins & improve log verbosity Sep 27, 2016
@nhaugo nhaugo added this to the 1.2.0 milestone Sep 28, 2016
@sparrc sparrc modified the milestones: Future Milestone, 1.2.0 Dec 5, 2016
@M-M-M-M
Copy link

M-M-M-M commented Mar 9, 2017

The logger should, at least, use the "name_suffix" variable in order to help find which input is failing.

For time being I have

2017-03-09T08:56:40Z E! ERROR: input [inputs.exec] took longer to collect than collection interval (10s)
2017-03-09T08:56:59Z E! ERROR: input [inputs.exec] took longer to collect than collection interval (10s)

As I have several [[inputs.exec]], I am not able to find which one is failing.

As a workaround I have setup under each [[inputs.exec]] different intervals such as:

  • interval = "9s"
  • interval = "11s"
  • interval = "12s"

This is giving me the opportunity to link interval value to the buggy plugin...
But getting a name in the log line would be better for debugging.

Listing the commands might also be useful even if it may be too long. [[inputs.http_response]] is logging the full GET request so on this plugin it is easy to find which one is failing.

@oplehto
Copy link
Contributor

oplehto commented Sep 29, 2017

+1 for having a user-defineable plugin alias name.

We are deploying several input.exec plugins managed by different groups with log monitoring and alerting done via a log analytics system.

Thus it's really becoming a necessity to be able to differentiate between these input plugins so that the groups get just their own plugins' log messages and alerts.

@nikolay-t
Copy link

+1 for having a user-defineable plugin alias name from me as well.

@glinton
Copy link
Contributor

glinton commented Dec 13, 2018

When defining multiple plugins of the same name, do you specify the name_override to override the metric name from the default name?

If so, a quick change to the Name function could be sufficient.

New Name function:

func (r *RunningInput) Name() string {
	if len(r.Config.NameOverride) > 0 {
		return "inputs." + r.Config.NameOverride
	}
	return "inputs." + r.Config.Name
}

Would output the following:

...
2018-12-13T21:17:49Z W! [agent] input "inputs.exec1" did not complete within its interval
2018-12-13T21:17:49Z W! [agent] input "inputs.exec2" did not complete within its interval
2018-12-13T21:17:49Z E! [inputs.exec2]: Error in plugin: exec: signal: killed for command 'sleep 2s'
2018-12-13T21:17:49Z E! [inputs.exec1]: Error in plugin: exec: signal: killed for command 'sleep 2s'
...

With the config:

[[inputs.exec]]
  name_override="exec1"
  timeout="1s"
  command = "sleep 2s"

[[inputs.exec]]
  name_override="exec2"
  timeout="1s"
  command = "sleep 2s"

[[outputs.file]]

Edit
More simple Name function:

func (r *RunningInput) Name() string {
	return "inputs." + r.Config.Name + r.Config.MeasurementSuffix
}

@danielnelson
Copy link
Contributor

That would be pretty good, I like how it would integrate well with the current logging, but it does have a couple weaknesses. I think normally you would want to leave the default measurement name while still being able to change the plugin name. Also when it comes to the internal plugin measurements it would be nice to have it as a separate tag so you can group by all plugins of a type:

internal_gather,input=exec,alias=foo
internal_gather,input=exec,alias=bar

I guess also it would end up breaking some queries to the internal plugin.

@glinton
Copy link
Contributor

glinton commented Dec 13, 2018

Good point, so for the sake of logging, it may be better to create an Alias function on MetricMakers and call that in agent/accumulator.go:114. The logs would then contain the aliased input, but the metric in the database would be as defined in the config.

Edit
This proposal would only solve for the logging use case.

@danielnelson
Copy link
Contributor

I think Accumulator/MetricMaker might be the wrong place as it could only work with inputs/aggregators. What about adding a Alias function/field to all the Running* types?

@danielnelson danielnelson added feature request Requests for new plugin and for new features to existing plugins area/agent area/logging and removed feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin labels Mar 28, 2019
@rbkasat
Copy link

rbkasat commented May 24, 2019

@danielnelson any update in this? This is a very crucial feature for us now.

@danielnelson danielnelson added this to the 1.12.0 milestone May 29, 2019
@danielnelson
Copy link
Contributor

@rbkasat No update, but I think we know approximately how we will address it. I won't be able to complete it for the upcoming 1.11 release but I will add it for 1.12.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/agent feature request Requests for new plugin and for new features to existing plugins
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants