Skip to content
This repository has been archived by the owner on Mar 5, 2024. It is now read-only.

Replace metrics with native Prometheus #35

Closed
pingles opened this issue Feb 6, 2018 · 5 comments
Closed

Replace metrics with native Prometheus #35

pingles opened this issue Feb 6, 2018 · 5 comments
Milestone

Comments

@pingles
Copy link
Contributor

pingles commented Feb 6, 2018

Given Prometheus as the standard means for collecting custom metrics within Kubernetes clusters, and the issues around exporting the go-metrics ones, it'd be prudent to overhaul and replace them with native prom equivalents.

I don't think we'd aim for naming parity so probably sensible to just replace according to the naming scheme the Prometheus authors recommend and fold into a major version release.

@idiamond-stripe
Copy link
Contributor

If possible, can we continue to have the option to provide statsd metrics, at least for latency metrics.

It's useful for us to ship these to our upstream metrics provider and do aggregation there, rather than in the application, like either prometheus or go-metrics do. I have a branch that adds native Prometheus metrics and individual statsd metrics for latency. Would it be useful to open a PR for it?

@pingles
Copy link
Contributor Author

pingles commented Jul 5, 2018

@idiamond-stripe yep, that's ok. Sticking to go-metrics with StatsD sounds ok then?

I'd definitely still like to have native Prometheus metrics, and I think I'd probably rename a bunch of them/change what gets captured to be a little more useful.

I'd love a PR to see what you've done so far (and take on to close this :)

@pingles
Copy link
Contributor Author

pingles commented Jul 17, 2018

#114 should help replace a lot of the gRPC client/server boilerplate.

I'd suggest therefore that we integrate #114 and then the focus of the changes in this issue would be around native prometheus metrics for the HTTP handler. Again, maybe there's a way we can just use the built in integrations to provide telemetry rather than recording/reporting within each of the handlers.

The final thing we'd need to instrument would be timings around the AssumeRole operation call.

@pingles
Copy link
Contributor Author

pingles commented Aug 2, 2018

This is addressed in #131- once that's merged we should have equivalent metrics captured.

@pingles
Copy link
Contributor Author

pingles commented Aug 3, 2018

#131 has been merged!

@pingles pingles closed this as completed Aug 3, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants