Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch Observer API #634

Closed
wants to merge 9 commits into from
Closed

Conversation

jmacd
Copy link
Contributor

@jmacd jmacd commented Apr 12, 2020

This is a proof-of-concept for #633. The topic has been discussed in open-telemetry/opentelemetry-specification#549 and the 4/9/2020 Metrics SIG meeting.

This PR is not complete, it updates only the metrics API, the mock test SDK, and one test.

The approach here makes use of interfaces with non-exported methods to accomplish its goal. The result is a slight API change for existing observers. The former:

observer, err := meter.RegisterInt64Observer("observer.name", func(value int64, labels ...core.KeyValue) { ... }))

As in:

observer, err := meter.RegisterInt64Observer("observer.name", 
     metric.NewInt64ObserverCallback(func(value int64, labels ...core.KeyValue) { ... })))

Now, the function has to pass through metric.NewInt64ObserverCallback() or metric.NewBatchObserverCallback() to distinguish the kind of callback in use.

@jmacd jmacd added area:metrics Part of OpenTelemetry Metrics prototype Feature to prototype a spec-level decision labels Apr 12, 2020
@jmacd
Copy link
Contributor Author

jmacd commented Apr 12, 2020

@marwan-at-work

@jmacd
Copy link
Contributor Author

jmacd commented Apr 12, 2020

One of the goals, suggested by @bogdandrutu, was that the API should prevent an instrument from being reported in more than one callback, and that the SDK knows which callback to call for a given instrument, e.g., to support variable-size collection intervals on a per-instrument basis. This PR accomplishes that.

Copy link
Contributor

@marwan-at-work marwan-at-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your work on this 🙏 💯

api/metric/api_test.go Show resolved Hide resolved
)
},
)
obs1 = Must(meter).RegisterInt64Observer("test.observer.int", cb)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though it certainly accomplishes the task, I think the API for a batch observer is a bit too complex.

If I understand correctly, those are the steps you have to take in order to work with a batched observer:

  1. Instantiate the individual observers as closures (lines 226 && line 227)
  2. Instantiate the batch observer and pass the callback that uses the closures (lines 234:241)
  3. Register each observer individually using the batch observer (line 242)
  4. Make sure you reassign the closure in the result from step 3 (line 242)

Is there any way to simplify that?

One idea:

obs1 := Must(meter).NewInt64BatchObserver("obs1") 
obs2 := Must(meter).NewFloat64BatchObserver("obs2")
// both obs1 and obs2 satisfy some common interface, for type safety as well as making sure no one observer is passed into two different callbacks. 
Must(meter).RegisterBatchObserver(obs1, obs2, func() error {
  // do some combined work
  obs1.Observe(num)
  obs2.Observe(float)
  return nil
})

I have a few more in mind like using a map/slice that is passed into the callback but I'm still not too familiar with the codebase and architecture to design APIs for it 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I don't really like the API that I've created here, mostly because of the circularity you noticed. Your proposal eliminates the circularity but requires 2x the number of constructors (which will rise as we merge OTEPS 93 and 96).

One of the important things in my variant is that the labels only need to be encoded once for a batch of observations, thus the batch callback has a different Result type than the single callback. Your example would look like:

obs1 := Must(meter).NewInt64BatchObserver("obs1") 
obs2 := Must(meter).NewFloat64BatchObserver("obs2")
Must(meter).RegisterBatchObserver(func(result metric.BatchObserverResult) error {
   result.Observe([]core.KeyValue{...},
      obs1.Observation(num)
      obs2.Observation(float)
   }, obs1, obs2)

Others, please voice your opinion!

Copy link
Contributor

@MrAlias MrAlias Apr 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this API is more circular than the Go language itself. If you want to declare a recursive anonymous function you cannot just do the following:

recurse := func() {
    recurse()
}

Instead, similar to the prototyped API, you need to declare the value so the language has context:

var recurse func()
recurse := func() {
    recurse()
}

This, I agree, might seem a bit circular, but it is not unprecedented in the language itself. If we can build a better API to avoid this I'm +1.

As for the 2x constructor problem. I agree that it would not be desirable. Not only will it double the needed constructors, but it seems like it is just adding a distinct metric type instead of unifying multiple other metrics.

I apologize, as these are just opinions about what has already been offered and I don't have any suggested alternatives (at least not yet, I'm still thinking). @marwan-at-work what other ideas did you have for API changes?

Copy link
Contributor Author

@jmacd jmacd Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a nicer example of the current API, something like:

type BatchOfObservers struct {
     obs1 metric.Int64Observer
     obs2 metric.Int64Observer
}

func (b *BatchOfObservers) callback(result metric.BatchObserverResult) {
    result.Observe(..., b.obs1.Observe(1), b.obs2.Observe(2))
}

func NewBatchOfObservers(meter Meter) *BatchOfObservers {
  batch := &BatchOfObservers{}
  callback := metric.NewBatchObserverCallback(batch.callback)
  batch.obs1 = meter.RegisterInt64Observer(..., callback)
  batch.obs2 = meter.RegisterInt64Observer(..., callback)
  return batch
}

This feels pretty normal to me, I've certainly written that pattern above in a number of places recently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit confusing that you create two observers with callbacks and the callback will be invoked once. How about making the callback parameter in Register{IntFloat}64Observer optional? (Either by passing nil or replacing it with a With{IntFloat}64Callback option).

Also, do we need to restrict the observer to be either standalone or used in batch? I don't see why it couldn't be both.

m := metric.Must(meter)
// pass no callbacks
obs1 := m.RegisterInt64Observer("foo")
// or obs1 := m.RegisterInt64Observer("foo", nil)
obs2 := m.RegisterFloat64Observer("bar")
// or obs2 := m.RegisterFloat64Observer("bar", nil)
obs3cb := func(result Int64ObserverResult) {…}
obs3 := m.RegisterInt64Observer("quux", metric.WithInt64Callback(obs3cb))
// or obs3 := m.RegisterInt64Observer("quux", obs3cb)
obs4cb := func(result Float64ObserverResult) {…}
obs4 := m.RegisterFloat64Observer("blargh", metric.WithFloat64Callback(obs4cb))
// or obs4 := m.RegisterFloat64Observer("blargh", obs4cb)
obsbatchcb := func(result BatchObserverResult) {
	result.ObserveBatch(…,
		obs1.Observation(42),
		obs2.Observation(13.5),
		obs3.Observation(45),
		obs4.Observation(4.4),
	)
}
// WithBatchCallback probably makes no sense here.
m.RegisterBatchObserver("baz", obsbatchcb)

// with batch of observers struct:

type BatchOfObservers struct {
	obs1 metric.Int64Observer
	obs2 metric.Float64Observer
	obs3 metric.Int64Observer
	obs4 metric.Float64Observer
}

func (b *BatchOfObservers) callback(result BatchObserverResult) {
	result.ObserveBatch(…,
		b.obs1.Observation(42),
		b.obs2.Observation(13.5),
		b.obs3.Observation(45),
		b.obs4.Observation(4.4),
	)
}

func NewBatchOfObservers(m metric.MeterMust) *BatchOfObservers {
	obs3cb := func(result Int64ObserverResult) {…}
	obs4cb := func(result Float64ObserverResult) {…}
	b := &BatchOfObservers{
		obs1: m.RegisterInt64Observer("foo"),
		obs2: m.RegisterFloat64Observer("bar"),
		obs3: m.RegisterInt64Observer("quux", metric.WithInt64Callback(obs3cb)),
		obs4: m.RegisterFloat64Observer("blargh", metric.WithFloat64Callback(obs4cb)),
	}
	m.RegisterBatchObserver("baz", b.callback)
	return b
}

Copy link
Contributor Author

@jmacd jmacd Apr 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making the callback parameter in Register{IntFloat}64Observer optional?

Also, do we need to restrict the observer to be either standalone or used in batch? I don't see why it couldn't be both.

There was a discussion on this topic. @bogdandrutu wanted to ensure that each instrument has precisely one callback associated with it, and that the SDK knows this so that it can vary the collection interval for instruments. I think this answers both of your questions. We'd have to relax that sort of requirement to move toward a design like the one you proposed.

I can be convinced this is too cumbersome, what do others think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

each instrument has precisely one callback associated with it

I think this is the source of API complexity, is there no way around it?

Also, I do like the idea of putting your observers in a struct. Can we make an interface around it? Something like:

type BatchObserver interface {
  Observe(metric.BatchObserverResult)
}

If we don't need to register the individual observers, this is all we have to do:

type BatchOfObservers struct {
     obs1 metric.Int64Observer
     obs2 metric.Int64Observer
}

func (b *BatchOfObservers) Observe(result metric.BatchObserverResult) {
    result.Observe(..., b.obs1.Observe(1), b.obs2.Observe(2))
}

func NewBatchOfObservers(meter Meter) *BatchOfObservers {
  batch := &BatchOfObservers{
    obs1: meter.NewInt64Observer(),
    obs2: meter.NewInt64Observer(),
  }
  metric.RegisterBatchObserver(batch)
  return batch
}

And if we do need to register individual Observers, we can maybe add it to the interface:

type BatchObserver interface {
  Observe(metric.BatchObserverResult) <maybe returns something> 
  Register(...)
}

And have the Register method takes/returns whatever Otel needs to ensure correctness.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the source of API complexity, is there no way around it?

Things get very complicated with multiple callbacks:

  • Order of execution for the callbacks.
  • How do you deal with duplicate recordings?

One option is that we document in the API that only the last register callback for an instrument is "active" if ensuring that only one callback can be register causes so much trouble.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have a pretty good idea of what we want the specification to say about batch observations:

  • There must be a 1:1 association from instrument to callback
  • The SDK must be aware of the above association

For the Golang API, we have options,

  • either a complicated API that ensures the above constraints are met w/o possibility of error,
  • or a less-complicated but error-prone API that could more-easily result in lost observations or duplicate observations.

I don't have a strong preference in this API choice, mainly I want to keep moving. I see a slight preference in this conversation for @marwan-at-work's proposal, which still leaves options:

  • Allow passing nil as the singleton callback, then allow batch observers to Register() the instruments that will be used
  • Let batch observer callbacks implement async instrument constructors, making there two separate forms of instrument constructor.

An example of the second proposal here:

   batchcb := meter.RegisterBatchObserverCallback(func (...) { ... }) // batch
   observer1 := batchcb.RegsiterInt64SumObserver("...")
   observer2 := batchcb.RegsiterInt64UpDownSumObserver("...")

   observer3 := meter.RegisterValueObserver("...", func (...) { ... }) // singleton
}

This is less error prone, but more surface area. Thoughts?

In any case, the specification will have to say what happens when a batch observer tries to make an observation for an instrument that is bound to a different callback. I think it returns/reports an error.

The other approaches (e.g., the one by @krnowak above) seem to leave more of a possibility that instruments are registered w/o any callback. The SDK can detect this, but it won't know when to issue an error about it, since we don't have any API to tell the SDK "now start, you should have all your instruments correctly registered".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat related note: I'm tired of / confused by the distinction between "New" and "Register" for synchronous vs. asynchronous instruments. I'd like to eliminate "Register" and use "New" consistently in the Go API.

@jmacd
Copy link
Contributor Author

jmacd commented Apr 13, 2020

Note: lots of evidence that we need batch observations:
open-telemetry/opentelemetry-go-contrib#9

@jmacd jmacd changed the title Batch Observer API PoC Batch Observer API Apr 24, 2020
@jmacd jmacd removed the prototype Feature to prototype a spec-level decision label Apr 24, 2020
@jmacd
Copy link
Contributor Author

jmacd commented Apr 24, 2020

I've finished the API change initially proposed in this PR. The PR is complete, although we can still debate the API, it's ready for another look.

@jmacd
Copy link
Contributor Author

jmacd commented Apr 30, 2020

See here to motivate this change: https://github.com/open-telemetry/opentelemetry-go-contrib/pull/9/files

@jmacd jmacd requested a review from Aneurysm9 as a code owner May 8, 2020 16:33
@jmacd
Copy link
Contributor Author

jmacd commented May 8, 2020

I'm prototyping the new approach described here: #634 (comment)

You can read the preliminary diffs from this branch to that branch:
https://github.com/jmacd/opentelemetry-go/compare/jmacd/batch_obs...jmacd:jmacd/batch_obs_2?expand=1

@jmacd jmacd closed this May 8, 2020
@jmacd jmacd deleted the jmacd/batch_obs branch May 20, 2020 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:metrics Part of OpenTelemetry Metrics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants