Parse out namespace #4863

Open
ktff opened this issue Nov 4, 2020 · 4 comments
Labels
needs: approval Needs review & approval before work can begin. needs: requirements Needs a list of requirements before work can begin. source: prometheus_scrape Anything `prometheus_scrape` source related source: statsd Anything `statsd` source related type: enhancement A value-adding code change that enhances its existing functionality.

Comments

@ktff
Contributor

ktff commented Nov 4, 2020

With #4833, all sources except prometheus and statsd set a namespace. For those two we can parse the namespace out of the metric name, but an audit of all official Prometheus exporters is needed to verify that the naming conventions are being upheld. Similar research should be done for statsd.

We should also watch out for multi-word namespaces if they show up in the audit.

This feature should be toggleable, with the default state depending on how consistently the convention is upheld.
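
To make the proposal concrete, here is a minimal sketch of the single-word case, assuming the namespace is everything before the first separator (`_` for Prometheus-style names, `.` for dotted statsd-style names). The function name `split_namespace` and the example metric names are hypothetical, not existing Vector code.

```python
from typing import Optional, Tuple


def split_namespace(name: str, sep: str = "_") -> Tuple[Optional[str], str]:
    """Split a metric name into (namespace, remaining name) at the first separator.

    Assumption: the first word is the namespace. Names with no separator,
    or with an empty part on either side of it, are returned unchanged
    with no namespace.
    """
    head, found, tail = name.partition(sep)
    if not found or not head or not tail:
        return None, name
    return head, tail


if __name__ == "__main__":
    print(split_namespace("node_cpu_seconds_total"))
    print(split_namespace("app.requests.count", sep="."))
    print(split_namespace("up"))
```

A name without a separator keeps its full name and gets no namespace, which is the conservative choice if the audit finds exporters that do not follow the convention.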

@ktff ktff added type: enhancement A value-adding code change that enhances its existing functionality. source: statsd Anything `statsd` source related source: prometheus labels Nov 4, 2020
@binarylogic
Contributor

@ktff yeah, I don't know how we'd detect multi-word namespaces. Do you have any good ideas that would prevent false positives?

@binarylogic binarylogic added needs: approval Needs review & approval before work can begin. needs: requirements Needs a list of requirements before work can begin. labels Nov 27, 2020
@ktff
Contributor Author

ktff commented Nov 28, 2020

I don't know, and I don't think there is a 100% correct automatic way to detect that. So if they are used, the best we can do is give users the tools to handle it: something like a global option in which users could list such multi-word namespaces, which we would then parse out as such. That would ensure there are no false positives, just false negatives.
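
A rough sketch of that idea, assuming the option is simply a list of namespace strings and that the longest listed match wins. The function name `split_with_known` and the namespaces in the example are made up for illustration.

```python
from typing import Iterable, Optional, Tuple


def split_with_known(
    name: str, known_namespaces: Iterable[str], sep: str = "_"
) -> Tuple[Optional[str], str]:
    """Split a metric name, preferring user-listed multi-word namespaces.

    Assumption: `known_namespaces` comes from a (hypothetical) global option.
    Listed namespaces are tried longest first; if none matches, fall back
    to treating the first word as the namespace.
    """
    for ns in sorted(known_namespaces, key=len, reverse=True):
        prefix = ns + sep
        # Require something after the prefix so the metric keeps a name.
        if name.startswith(prefix) and len(name) > len(prefix):
            return ns, name[len(prefix):]

    head, found, tail = name.partition(sep)
    if found and head and tail:
        return head, tail
    return None, name


if __name__ == "__main__":
    listed = ["my_app"]  # hypothetical user configuration
    print(split_with_known("my_app_requests_total", listed))
    print(split_with_known("my_other_requests_total", listed))
```

Unlisted multi-word namespaces still fall back to the first word, so the only errors are the false negatives mentioned above.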

@binarylogic
Contributor

That makes sense. I'm also wondering if we could survey the batch of scraped metrics to make better guesses, for example when multiple metrics share the same multi-word prefix. That can be a follow-up enhancement though.

@ktff
Contributor Author

ktff commented Nov 28, 2020

There are multiple approaches we can take. The simplest one I see: assume the first word is the namespace. Then, within a batch (or after we have observed some number of metrics), if the combination of first word + second word occurs the same number of times as the first word alone, we can assume the second word is part of the namespace. The same applies to subsequent words. After that point we can keep tracking whether the condition still holds.

The number of false positives can then be controlled by the number of metrics that need to be observed before we make the decision.

This won't catch two or more namespaces that share their first and/or subsequent words, though.
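
A minimal sketch of the counting heuristic described above, under a few assumptions: words are separated by `_`, a batch is just a list of names, and at least one word is always left over as the metric name. The function name `infer_namespaces` and the `min_observed` parameter are hypothetical.

```python
from typing import Dict, Iterable, List


def infer_namespaces(
    names: Iterable[str], sep: str = "_", min_observed: int = 3
) -> Dict[str, str]:
    """Map each first word to the namespace inferred from a batch of names.

    The namespace starts as the first word and is extended by one word
    whenever every observed name with the current prefix also shares the
    next word (prefix + next word occurs as often as the prefix alone).
    Extension is only attempted once `min_observed` names share the first word.
    """
    groups: Dict[str, List[List[str]]] = {}
    for name in names:
        words = name.split(sep)
        if len(words) >= 2:
            groups.setdefault(words[0], []).append(words)

    inferred: Dict[str, str] = {}
    for first, group in groups.items():
        depth = 1  # number of leading words currently treated as namespace
        if len(group) >= min_observed:
            while True:
                # Stop if extending would leave some metric without a name.
                if any(len(words) < depth + 2 for words in group):
                    break
                # Stop if the next word is not shared by every name.
                if len({words[depth] for words in group}) != 1:
                    break
                depth += 1
        inferred[first] = sep.join(group[0][:depth])
    return inferred


if __name__ == "__main__":
    batch = [
        "foo_bar_requests_total",
        "foo_bar_errors_total",
        "foo_bar_latency_seconds",
        "node_cpu_seconds",
        "node_memory_bytes",
        "node_disk_bytes",
    ]
    print(infer_namespaces(batch))
```

Raising `min_observed` trades false positives for false negatives. As noted, a batch containing both `foo_bar_*` and `foo_baz_*` names would resolve only to `foo`.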
