Prometheus style histogram metrics from statsd input plugin #8572

Open
lahsivjar opened this issue Dec 16, 2020 · 7 comments
Labels
area/statsd feature request Requests for new plugin and for new features to existing plugins

Comments

@lahsivjar

Feature Request

A way to generate Prometheus-style histogram metrics from the statsd input plugin (something similar to https://github.com/atlassian/gostatsd#timer-histograms-experimental-feature).

Proposal:

The statsd input plugin receives raw data from clients, so it should be possible to maintain counters for a user-defined set of le buckets (similar to Prometheus). The le labels can be added as tags.
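As a rough illustration of the proposal, the sketch below keeps cumulative counters for a user-defined set of upper bounds and exposes each bound as an le tag value. It is not taken from the PoC or from Telegraf: the Histogram type, its methods, and the timing_bucket metric name are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// Histogram keeps Prometheus-style cumulative counts for a fixed set of
// upper bounds. Hypothetical type for illustration, not a Telegraf API.
type Histogram struct {
	bounds []float64 // sorted finite upper bounds
	counts []uint64  // counts[i] = observations <= bounds[i]; last entry is +Inf
}

// NewHistogram copies and sorts the user-defined bounds.
func NewHistogram(bounds []float64) *Histogram {
	b := append([]float64(nil), bounds...)
	sort.Float64s(b)
	return &Histogram{bounds: b, counts: make([]uint64, len(b)+1)}
}

// Observe records one raw value in every bucket whose bound is >= v,
// which always includes the final +Inf bucket.
func (h *Histogram) Observe(v float64) {
	// SearchFloat64s returns the first index whose bound is >= v.
	for i := sort.SearchFloat64s(h.bounds, v); i < len(h.counts); i++ {
		h.counts[i]++
	}
}

// LeTags returns the "le" tag value for each bucket, in bucket order.
func (h *Histogram) LeTags() []string {
	tags := make([]string, 0, len(h.counts))
	for _, b := range h.bounds {
		tags = append(tags, strconv.FormatFloat(b, 'g', -1, 64))
	}
	return append(tags, "+Inf")
}

func main() {
	h := NewHistogram([]float64{50, 100, 500})
	for _, v := range []float64{12, 80, 250, 7000} {
		h.Observe(v)
	}
	// "timing_bucket" is a made-up metric name.
	for i, le := range h.LeTags() {
		fmt.Printf("timing_bucket{le=%q} %d\n", le, h.counts[i])
	}
}
```

Because the counts are cumulative, each observation increments its own bucket and every larger one, which is the shape Prometheus expects for histogram buckets.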

Current behavior:

It is not possible to generate Prometheus-style histogram metrics with the statsd input plugin.

Desired behavior:

Generate Prometheus-style histogram metrics with the statsd input plugin.

Use case:

This will add flexibility for conversion from statsd to Prometheus type data.

@lahsivjar lahsivjar added the feature request Requests for new plugin and for new features to existing plugins label Dec 16, 2020
@lahsivjar
Author

lahsivjar@43ea032

A PoC for the proposal.

@lahsivjar
Author

@danielnelson It would be great if you could take a look at it and give some feedback.

@lahsivjar
Author

lahsivjar commented Dec 23, 2020

For some other input plugins, this can already be achieved with aggregators. The statsd input plugin, however, does the parsing and aggregation itself, so the raw data is lost.

One way to fix this would be to overhaul the statsd input plugin and create a statsd parser that generates Telegraf metrics from the raw statsd data. For aggregation, the end user can then define aggregators that produce the same cumulative effect as the current statsd input plugin.
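To make the parser idea concrete, here is a minimal sketch that turns one statsd line of the form name:value|type[|@rate] into a raw, unaggregated sample. The Sample type and ParseLine function are hypothetical and do not exist in Telegraf; gauge deltas, sets with non-numeric values, and tag extensions are deliberately not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Sample is one raw statsd data point before any aggregation.
// Hypothetical type for illustration, not part of Telegraf.
type Sample struct {
	Name  string
	Value float64
	Type  string  // "c", "g", "ms", "h", ...
	Rate  float64 // sample rate, 1 when absent
}

// ParseLine parses a single "name:value|type[|@rate]" line.
func ParseLine(line string) (Sample, error) {
	name, rest, ok := strings.Cut(line, ":")
	if !ok || name == "" {
		return Sample{}, fmt.Errorf("missing name or value in %q", line)
	}
	parts := strings.Split(rest, "|")
	if len(parts) < 2 || parts[1] == "" {
		return Sample{}, fmt.Errorf("missing type in %q", line)
	}
	value, err := strconv.ParseFloat(parts[0], 64)
	if err != nil {
		return Sample{}, fmt.Errorf("bad value in %q: %w", line, err)
	}
	s := Sample{Name: name, Value: value, Type: parts[1], Rate: 1}
	// Optional trailing sections; only the "@rate" section is understood here.
	for _, p := range parts[2:] {
		if strings.HasPrefix(p, "@") {
			rate, err := strconv.ParseFloat(p[1:], 64)
			if err != nil {
				return Sample{}, fmt.Errorf("bad sample rate in %q: %w", line, err)
			}
			s.Rate = rate
		}
	}
	return s, nil
}

func main() {
	s, err := ParseLine("api.latency:320|ms|@0.1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", s)
}
```

A parser along these lines would emit one metric per received value, leaving counting, percentiles, or histogram bucketing to whichever aggregator the user configures.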

@danielnelson WDYT?

@philomory
Contributor

@lahsivjar Would you be willing to turn your POC into an actual PR (a draft one if you think it isn't ready to be merged as-is)? I think with an actual PR it'd be much more likely to get feedback on it.

Personally, I'd love to see this, whether we get histogram support baked into the statsd plugin alongside all of its existing built-in aggregation, or we simply add a statsd parser that can be used with socket_listener to get raw data to pass to ad-hoc aggregators manually, or both.

@lahsivjar
Author

@philomory Thanks for the ping. I have been away from this project for quite some time. I have created a PR from the PoC commit to initiate conversation; I hope the approach is not completely outdated by now 🤞

@jacobstr

I'm thinking about this feature myself because it makes me wonder how to get accurate percentiles if we were to scale out telegraf horizontally.

E.g. we could run multiple telegraf replicas, each maintaining a p90 of, say, a latency measurement from statsd. But a p90 of the p90s across n replicas is kind of a meaningless value.

But if each replica exposes histogram buckets, then you can compute statistically meaningful percentiles, specifically for flows with input: statsd and output: prometheus.
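The reason buckets scale out where percentiles do not is that cumulative bucket counts are plain counters, so they can be summed across replicas. A minimal sketch, assuming every replica uses the same bucket bounds (the Buckets type, Merge function, and all numbers are made up for illustration):

```go
package main

import "fmt"

// Buckets maps an "le" upper-bound label to a cumulative count.
// Hypothetical type for illustration.
type Buckets map[string]uint64

// Merge sums cumulative counts bucket by bucket. It assumes all
// replicas were configured with the same set of bounds.
func Merge(replicas ...Buckets) Buckets {
	total := Buckets{}
	for _, r := range replicas {
		for le, n := range r {
			total[le] += n
		}
	}
	return total
}

func main() {
	// Two replicas with very different latency distributions.
	a := Buckets{"100": 90, "500": 99, "+Inf": 100}
	b := Buckets{"100": 10, "500": 60, "+Inf": 100}

	m := Merge(a, b)
	fmt.Println(m["100"], m["500"], m["+Inf"]) // 100 159 200
}
```

In this made-up data, replica a's p90 is at most 100 while replica b's is above 500, and no combination of those two numbers recovers the overall p90. The merged buckets show that only 159 of 200 observations were at or below 500, so the overall p90 (rank 180) lies above 500.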

@bbkfhq

bbkfhq commented Sep 13, 2024

I'd also love this feature. Without it, as far as I know, it's not possible to "aggregate" percentile values from multiple series.

That is, each unique combination of metric field values produces a separate series in Prometheus. Right now you can get a percentile value for each series, but you can't combine them to get an aggregate value (using the existing percentiles feature of the statsd input).

Example:

api_response_time_ms{endpoint="list_books", server="server1"}

api_response_time_ms{endpoint="add_book", server="server2"}

You'd need the "bucket" data to be able to use histogram_quantile() to get an overall percentile value for api_response_time_ms.


5 participants