Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to Prometheus decoders for Prometheus custom metric endpoint parsing #1473

Merged
merged 2 commits into from
Sep 26, 2016

Conversation

jimmidyson
Copy link
Collaborator

Fixes #1472

@fabxc This is a hand-rolled parser to collect application metrics from containers. Forgetting the question or application metrics & cadvisor, I assume the better approach for this would be to use Prometheus' own parser? If so, can you point me to it please?

@k8s-bot
Copy link
Collaborator

k8s-bot commented Sep 20, 2016

Jenkins GCE e2e

Build/test failed for commit e3f72e9.

@jimmidyson
Copy link
Collaborator Author

mkdir: cannot create directory ‘/tmp/cadvisor-768’: No space left on device - @timstclair any chance you could sort that out?

@fabxc
Copy link

fabxc commented Sep 20, 2016

For reference, the parser is located in https://github.com/prometheus/common/tree/master/expfmt

@jimmidyson jimmidyson force-pushed the prom-collector-label-spaces branch from e3f72e9 to 80e3f7b Compare September 20, 2016 16:50
@jimmidyson
Copy link
Collaborator Author

@fabxc Would be you be able to take a look if you have some time? Switched to expfmt decoders & seems to work nicely (few tweaks to tests but nothing fundamental changing in functionality).

@k8s-bot
Copy link
Collaborator

k8s-bot commented Sep 20, 2016

Jenkins GCE e2e

Build/test failed for commit 80e3f7b.

@timstclair
Copy link
Contributor

@jimmidyson I'm aware of the issue. I'll try to have it sorted out by EOD

@timstclair
Copy link
Contributor

@k8s-bot test this

@k8s-bot
Copy link
Collaborator

k8s-bot commented Sep 20, 2016

Jenkins GCE e2e

Build/test failed for commit 80e3f7b.

@jimmidyson jimmidyson force-pushed the prom-collector-label-spaces branch from 80e3f7b to d5c95e3 Compare September 20, 2016 20:14
@k8s-bot
Copy link
Collaborator

k8s-bot commented Sep 20, 2016

Jenkins GCE e2e

Build/test passed for commit d5c95e3.

@jimmidyson jimmidyson changed the title Handle spaces in Prometheus labels Switch to Prometheus decoders for Prometheus custom metric endpoint parsing Sep 21, 2016
FloatValue: metVal,
Timestamp: currentTime,
FloatValue: float64(sample.Value),
Timestamp: sample.Timestamp.Time(),
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I entirely understand this. We seem to ignore the labels of the samples entire and just append timestamp/value pairs for the metric name?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah I forgot to ask what the purpose of this label was. Considering there are likely to be multiple labels on each metric, what should this be set to? @timstclair?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timstclair @DirectXMan12 What should the value of label be in this case? A concatenation of all Prometheus labels?

Copy link
Contributor

@DirectXMan12 DirectXMan12 Sep 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A concatenation of all Prometheus labels?

Yeah, seems like it. That seems... poor. It works ok for one label, but multiple labels could come in any order. If we wanted to make sure this was comparable, we could sort them first. Doesn't look like the label is actually carried through to the Kube summary API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree this is not ideal. Unfortunately this requires an API change to fix properly. Can you file an issue? Sorting + concatenating sounds good for now. Check the allowed symbols on prometheus labels to decide how to concatenate.

Copy link

@fabxc fabxc Sep 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In label values all unicode runes are allowed. In Prometheus itself we hence us \0xff as an invalid rune as a separator for unambiguous concatenation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hope what I've done it right... always confused by unicode stuff like this. Added tests too.

"time"

dto "github.com/prometheus/client_model/go"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does dto stand for?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data transfer object. Kinda nasty, but better than go and kinda what we use everywhere else in Prometheus land.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

none-the-less, something more expressive might be in order (perhaps rawmodel or rawtypes?)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

for _, line := range lines {
if line == "" {
var (
decSamples = make(model.Vector, 0, 50)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How'd you pick 50? What happens if this overflows?

Copy link

@fabxc fabxc Sep 21, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just the initial capacity. The slice is appended to, so it just allocates more if we go beyond.
50 SGTM. It's a number of metrics we can expect from virtually any endpoint, grow allocations beyond that are fine. But we basically save 6 re-allocations for the general case.

Plus, the slice is just reset at the end and re-used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should probably add a comment noting that it's just a guess at a useful initial capacity

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@DirectXMan12 DirectXMan12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while you're at it, the affected functions should probably have proper godoc.

case dto.MetricType_GAUGE:
return v1.MetricGauge
default:
return v1.MetricType(t.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhh..., if people are expecting this to be only one of these set values, this is going to throw them for a loop.
v1.MetricType seems to be an implicit enumeration (since Go doesn't have explicit enumerations :-/), so I really don't think it's a good idea to arbitrarily add extra values to the enum at runtime.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Types aren't arbitrary, it's just that cadvisor doesn't support the same metric types as Prometheus & there's no way to map e.g. a Prometheus histogram to a cadvisor metric type.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not sure how to handle this nicely... This PR doesn't fix the existing problem, but doesn't make it worse so will raise a separate issue to figure out how to handle it, but assume that to be in a separate PR.

}
if _, ok := collector.metricsSet[name]; collector.metricsSet != nil && !ok {
Copy link
Contributor

@DirectXMan12 DirectXMan12 Sep 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could really use a comment to indicate why it's checking this (e.g. // skip any metrics not on our list to collect). I realize that the old code doesn't have a comment, but we should fix that (I blame myself :-P).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

for _, line := range lines {
if line == "" {
var (
decSamples = make(model.Vector, 0, 50)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should probably add a comment noting that it's just a guess at a useful initial capacity

FloatValue: metVal,
Timestamp: currentTime,
FloatValue: float64(sample.Value),
Timestamp: sample.Timestamp.Time(),
}
Copy link
Contributor

@DirectXMan12 DirectXMan12 Sep 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A concatenation of all Prometheus labels?

Yeah, seems like it. That seems... poor. It works ok for one label, but multiple labels could come in any order. If we wanted to make sure this was comparable, we could sort them first. Doesn't look like the label is actually carried through to the Kube summary API.

"time"

dto "github.com/prometheus/client_model/go"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

none-the-less, something more expressive might be in order (perhaps rawmodel or rawtypes?)

@jimmidyson jimmidyson force-pushed the prom-collector-label-spaces branch from d5c95e3 to 078eed5 Compare September 23, 2016 11:55
@k8s-bot
Copy link
Collaborator

k8s-bot commented Sep 23, 2016

Jenkins GCE e2e

Build/test passed for commit 078eed5.

@jimmidyson jimmidyson force-pushed the prom-collector-label-spaces branch from 078eed5 to 923dbc5 Compare September 23, 2016 12:06
@k8s-bot
Copy link
Collaborator

k8s-bot commented Sep 23, 2016

Jenkins GCE e2e

Build/test passed for commit 923dbc5.

@jimmidyson
Copy link
Collaborator Author

Ready for a second round review.

Copy link
Contributor

@DirectXMan12 DirectXMan12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this LGTM.

Some of the work that this PR is doing is reversing or working around the work done by the SampleDecoder (e.g. the SampleDecoder converts labels from pairs to a map, and then we convert back to pairs, and the whole "placing samples into a slice and then into a map" is a bit of extra overhead). Unfortunately, just using the Decoder directly requires a decent bit of boilerplate, so the code in this PR seems reasonable.

case rawmodel.MetricType_GAUGE:
return v1.MetricGauge
default:
return v1.MetricType(t.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please file a follow-up issue about issue about this, because people probably won't expect to see any values beyond what's defined in v1 (whether that means defining more types, or an "other" type, or something else)

@jimmidyson
Copy link
Collaborator Author

@timstclair What's left to get this merged?

@timstclair
Copy link
Contributor

LGTM, feel free to merge.

@jimmidyson jimmidyson merged commit 0c6b72d into google:master Sep 26, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants