Metrics stdout export pipeline #265
Conversation
Thanks @robskillington
```go
	Descriptor() *Descriptor

// Exporter handles presentation of the checkpoint of aggregate
// metrics. This is the final stage of a metrics export pipeline,
// where metric data are formatted for a specific system.
```
I think you're right, in general. There are several points where an alternate implementation could swap in. For the example you gave ("last N minutes"), the best solution would be to provide an alternate Aggregator implementation. The Aggregators are called synchronously from the user code, which would allow this kind of customization without needing to replace the Meter or Batcher implementations.

The current set of two Batcher implementations will probably be extended with a more configurable version -- currently we're using fixed policies to decide which kind of aggregator to use, and there are choices that ought to be configurable (especially for measures). The "AggregationSelector" interface is provided to allow a configuration where you can re-use the Batcher but assign different kinds of aggregator. Ultimately, these choices are all inter-related -- certain exporters support certain aggregators and dimensionality by their nature, all of which is controlled by the Batcher but dictated by the Exporter.
Possibly the current Batcher/Exporter model is only useful for relatively simple exporters, like stdout and statsd. For example, I wrote into the SDK specification "Metric exporters that wish to pull metric updates are likely to integrate a controller directly into the exporter itself."
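To make that customization point concrete, here is a self-contained sketch of a selector that swaps in a windowed aggregator for measures only. The types below are simplified stand-ins invented for illustration -- not the SDK's actual `sdk/export/metric` interfaces -- and a real "last N minutes" aggregator would need windowing and expiry logic that is omitted here:

```go
package main

import "fmt"

// Simplified stand-ins for SDK types; the real interfaces in
// sdk/export/metric differ in detail.
type Kind int

const (
	CounterKind Kind = iota
	MeasureKind
)

type Descriptor struct {
	Name string
	Kind Kind
}

// Aggregator is called synchronously from user code, which is what
// makes per-instrument customization possible without replacing the
// Meter or Batcher.
type Aggregator interface{ Update(v float64) }

// AggregationSelector assigns an Aggregator to each instrument.
type AggregationSelector interface {
	AggregatorFor(d *Descriptor) Aggregator
}

type sumAggregator struct{ sum float64 }

func (a *sumAggregator) Update(v float64) { a.sum += v }

// windowAggregator stands in for a "last N minutes" implementation;
// real windowing logic is omitted.
type windowAggregator struct{ recent []float64 }

func (a *windowAggregator) Update(v float64) { a.recent = append(a.recent, v) }

// customSelector swaps in the windowed aggregator for measures only.
type customSelector struct{}

func (customSelector) AggregatorFor(d *Descriptor) Aggregator {
	if d.Kind == MeasureKind {
		return &windowAggregator{}
	}
	return &sumAggregator{}
}

func main() {
	var sel AggregationSelector = customSelector{}
	agg := sel.AggregatorFor(&Descriptor{Name: "latency", Kind: MeasureKind})
	agg.Update(3.5)
	fmt.Printf("%T\n", agg) // *main.windowAggregator
}
```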
@rghetia @lizthegrey @paivagustavo FYI, I am still working through feedback. I will continue later today.
Addresses feedback from @rghetia and @lizthegrey; I still have a few more things and some feedback from @paivagustavo to go through, but this is getting close.
Tests should be reviewed just as critically as any other changes in a PR. I agree with @bogdandrutu: this PR is extremely difficult to review. Creating smaller, modular PRs reduces the time between reviews, enables more community members to take part in the code review process (working toward reducing the burden on the few approvers there are at the moment), and allows for scoped changes (easier to roll back and confirm correct implementation).
Addresses the remaining feedback by @paivagustavo. Thanks, these were really good suggestions.
Reviewers, I made one more test-related change. I was unsatisfied to introduce a metric SDK dependency on "github.com/benbjohnson/clock". It's a good library for testing with a mock clock (see …
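The usual pattern for avoiding a hard dependency on a mock-clock library is a small clock interface owned by the SDK itself. This is only a generic sketch of that pattern, not the actual test code in this PR:

```go
package main

import (
	"fmt"
	"time"
)

// Clock abstracts time.Now so tests can control the passage of time.
type Clock interface {
	Now() time.Time
}

// realClock is what production code uses.
type realClock struct{}

func (realClock) Now() time.Time { return time.Now() }

// mockClock is advanced manually by tests.
type mockClock struct{ now time.Time }

func (m *mockClock) Now() time.Time      { return m.now }
func (m *mockClock) Add(d time.Duration) { m.now = m.now.Add(d) }

func main() {
	mock := &mockClock{now: time.Unix(0, 0)}
	mock.Add(10 * time.Second)
	fmt.Println(mock.Now().Unix()) // 10
	_ = Clock(realClock{})         // both types satisfy Clock
}
```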
@ccaraman I am sympathetic to your and Bogdan's complaint. I did not mean to imply that the tests should not be reviewed -- my impression was that Bogdan (who removed himself as CODEOWNER) was interested in reviewing the design and architecture of this code, not the tests. The reviewers have definitely been poking around in the tests.

I also stand by my earlier statements. I've been working on this metrics project for months now, and I've consistently had trouble getting reviewers to review my incremental progress. First there were a number of OTEPs, and I had trouble getting timely reviews. The picture wasn't clear, so I charged ahead with a specification. The specification made the big picture a lot clearer, and the specification was accepted. There were still unaccepted OTEPs -- and I'm still having trouble getting those reviewed -- so I proceeded with an implementation, and the implementation has helped get those OTEPs reviewed, because code can be a lot clearer than text.

I continued to have trouble getting reviews for incremental progress, such as #172, #253, and #282, and my impression was that reviewers were not comfortable reviewing an incomplete SDK. The big picture was not clear. So again, I charged ahead, and this PR was the result. My current understanding of the problem is that metrics export is extremely complicated: the Meter, the Batcher, the Aggregators, and the Exporter are all related to each other in various ways, depending on how they are going to be used. A Prometheus exporter is wildly different from a statsd exporter, and both are significantly different from what OpenCensus did or what OpenTelemetry wants.

This also feels like a double standard. When we began working on the tracing SDK, we merged a bunch of code without tests, because it helped kickstart development, and we revised and added tests incrementally. The metrics SDK is significantly more complicated than the trace SDK, and a lot changed as I wrote these tests, but I still see this as a prototype. We need to continue working on the specification and gaining experience with more sophisticated Batcher implementations. I believe delaying this PR in order to split it up and review incrementally will delay us until January, and I don't think it's worth the trouble.

Still, I agree that this PR is unfortunately large. Some of the comments by @paivagustavo made me realize that it's too big for me at this point (e.g., pointing out two places where an …

Thanks to the reviewers @paivagustavo @rghetia @lizthegrey @krnowak @robskillington! I left unresolved conversations where I think a good question has been raised that can't be directly addressed in this PR -- they will continue to be good discussion topics. I think we should merge this and continue reviewing and revising this code (and its tests) as we begin to review the pending Prometheus and Dogstatsd exporter PRs.
This completes an end-to-end metrics export pipeline.
This introduces new interfaces in the `sdk/export/metric` package and new package-level documentation in `sdk/metric/doc.go`.
Two batcher implementations are provided, `sdk/metric/batcher/defaultkeys` and `sdk/metric/batcher/ungrouped`, both of which may be stateful (e.g., report sums) or stateless (e.g., report deltas). The `defaultkeys` batcher groups metrics by their recommended keys. The `ungrouped` batcher groups metrics at full dimensionality.
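As a rough illustration of that grouping difference, the sketch below reduces a label set to a group key; the encoding is invented for the example, and the SDK's label encoder differs in detail:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// groupKey renders the subset of labels named in keys (the
// defaultkeys behavior); passing nil keys keeps every label
// (the ungrouped, full-dimensionality behavior).
func groupKey(labels map[string]string, keys []string) string {
	if keys == nil {
		for k := range labels {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	var parts []string
	for _, k := range keys {
		if v, ok := labels[k]; ok {
			parts = append(parts, k+"="+v)
		}
	}
	return strings.Join(parts, ",")
}

func main() {
	labels := map[string]string{"service": "checkout", "host": "h1"}
	fmt.Println(groupKey(labels, []string{"service"})) // service=checkout
	fmt.Println(groupKey(labels, nil))                 // host=h1,service=checkout
}
```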
Aggregator interfaces are provided in `sdk/metric/aggregator` to abstract the concrete aggregator types, with interfaces like `Sum`, `LastValue`, `Distribution`, and `MaxSumCount`. Currently we have one implementation for counter and gauge, and three choices for measure-kind metrics. Three standard "simple" selectors are provided in `sdk/metric/selector/simple`, which implement three policies for measure metrics (maxsumcount, ddsketch, and array aggregators).
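To illustrate how an exporter can consume a checkpoint through these small interfaces without knowing the concrete aggregator type, here is a self-contained sketch; the interface shapes are simplified stand-ins, not the real `sdk/metric/aggregator` definitions:

```go
package main

import "fmt"

// Simplified stand-ins for the aggregator interfaces named above.
type Sum interface{ Sum() float64 }

type MaxSumCount interface {
	Max() float64
	Sum() float64
	Count() int64
}

// counter is a concrete aggregator exposing only Sum.
type counter struct{ total float64 }

func (c counter) Sum() float64 { return c.total }

// measure is a concrete aggregator exposing MaxSumCount.
type measure struct {
	max, sum float64
	count    int64
}

func (m measure) Max() float64 { return m.max }
func (m measure) Sum() float64 { return m.sum }
func (m measure) Count() int64 { return m.count }

// format type-switches on the interfaces; MaxSumCount must be
// checked first because any MaxSumCount also satisfies Sum.
func format(agg interface{}) string {
	switch a := agg.(type) {
	case MaxSumCount:
		return fmt.Sprintf("max=%g sum=%g count=%d", a.Max(), a.Sum(), a.Count())
	case Sum:
		return fmt.Sprintf("sum=%g", a.Sum())
	default:
		return "unknown aggregation"
	}
}

func main() {
	fmt.Println(format(counter{total: 4}))
	fmt.Println(format(measure{max: 3, sum: 10, count: 5}))
}
```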
A new `sdk/metric/controller/push` package provides the glue to assemble a push-oriented metrics export pipeline. Controllers are responsible for deciding when to initiate collection, calling the exporter, and flushing at exit. The controller implements `metric.Provider`.
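Assembling the pieces might look roughly like the following. Every import path, constructor name, and signature here (`simple.NewWithExactMeasure`, `ungrouped.New`, `stdout.New`, `push.New`) is an assumption based on the package layout described above, not verified against the PR:

```go
// Hypothetical assembly of the pipeline; all names below are
// assumptions inferred from this description.
import (
	"log"
	"time"

	"go.opentelemetry.io/otel/exporter/metric/stdout"       // assumed path
	"go.opentelemetry.io/otel/sdk/metric/batcher/ungrouped" // assumed path
	"go.opentelemetry.io/otel/sdk/metric/controller/push"   // assumed path
	"go.opentelemetry.io/otel/sdk/metric/selector/simple"   // assumed path
)

func initMetrics() *push.Controller {
	selector := simple.NewWithExactMeasure()      // assumed: array-aggregator measure policy
	batcher := ungrouped.New(selector, true)      // assumed: stateful ungrouped batcher
	exporter, err := stdout.New(stdout.Options{}) // assumed constructor
	if err != nil {
		log.Fatal(err)
	}
	controller := push.New(batcher, exporter, time.Second) // collect once per second
	controller.Start()
	return controller // implements metric.Provider; Stop() flushes at exit
}
```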
A new stdout exporter, `exporter/metric/stdout`, formats metric updates (either stateful or stateless, depending on the configured batcher).
The `example/basic` main function has been modified to demonstrate how this all comes together. Note that we are squarely looking at open-telemetry/oteps#45: you can't get the global meter provider until it's been set, or else you get no-op metric instruments.
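That ordering pitfall can be shown with a self-contained sketch (the types are invented for illustration, not the otel API): a global provider defaults to a no-op, so instruments obtained before the provider is set never record anything.

```go
package main

import "fmt"

// Meter stands in for a metric instrument factory.
type Meter interface{ RecordFloat(v float64) }

type noopMeter struct{}

func (noopMeter) RecordFloat(float64) {} // silently drops data

type realMeter struct{ name string }

func (m realMeter) RecordFloat(v float64) { fmt.Printf("%s: %g\n", m.name, v) }

// provider defaults to a no-op until SetProvider is called.
var provider func() Meter = func() Meter { return noopMeter{} }

func SetProvider(p func() Meter) { provider = p }

func main() {
	early := provider() // fetched before SetProvider: a no-op forever
	SetProvider(func() Meter { return realMeter{name: "sdk"} })
	late := provider()

	early.RecordFloat(1) // dropped
	late.RecordFloat(1)  // printed: "sdk: 1"
}
```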