Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Precomputed Sum aggregation should filter on collection #4217

Closed
MrAlias opened this issue Jun 12, 2023 · 2 comments · Fixed by #4289
Closed

Precomputed Sum aggregation should filter on collection #4217

MrAlias opened this issue Jun 12, 2023 · 2 comments · Fixed by #4289
Assignees
Labels
area:metrics Part of OpenTelemetry Metrics bug Something isn't working pkg:SDK Related to an SDK package

Comments

@MrAlias
Copy link
Contributor

MrAlias commented Jun 12, 2023

Description

Using an asynchronous instrument that records two measurements for the same attribute set will correctly only report the last value (the value is interpreted as the pre-computed sum value). However, if an attribute filter is applied to the SDK and it affects the measurements the values are then added.

Environment

  • go.opentelemetry.io/otel: v1.16.0
  • go.opentelemetry.io/otel/sdk/metric: v0.39.0

Steps To Reproduce

Using the following code:

package main

import (
	"context"
	"log"
	"os"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
	sdk "go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/metric/metricdata"
)

func main() {
	l := log.New(os.Stdout, "", 0)
	reader := sdk.NewManualReader()
	meter := sdk.NewMeterProvider(
		sdk.WithReader(reader),
	).Meter("test")

	alice := attribute.NewSet(
		attribute.String("user", "Alice"),
		attribute.Bool("admin", true),
	)
	counter, err := meter.Int64ObservableUpDownCounter(
		"logins",
		metric.WithInt64Callback(func(_ context.Context, o metric.Int64Observer) error {
			o.Observe(1, metric.WithAttributeSet(alice))
			return nil
		}),
	)
	if err != nil {
		l.Fatal(err)
	}

	_, err = meter.RegisterCallback(func(_ context.Context, o metric.Observer) error {
		o.ObserveInt64(counter, 2, metric.WithAttributeSet(alice))
		return nil
	}, counter)
	if err != nil {
		l.Fatal(err)
	}

	var out metricdata.ResourceMetrics
	reader.Collect(context.Background(), &out)
	m := out.ScopeMetrics[0].Metrics[0]
	dpt := m.Data.(metricdata.Sum[int64]).DataPoints[0]
	l.Printf(
		"%s: %d (%s)",
		m.Name,
		dpt.Value,
		dpt.Attributes.Encoded(attribute.DefaultEncoder()),
	)
}

and running ...

$ go run .
logins: 2 (admin=true,user=Alice)

Correctly produces a value of 2 for the full attribute set ((admin=true,user=Alice)).

If a view is added to filter out the attribute where the key is admin:

func main() {
	l := log.New(os.Stdout, "", 0)
	reader := sdk.NewManualReader()
	meter := sdk.NewMeterProvider(
		sdk.WithReader(reader),
		sdk.WithView(sdk.NewView(
			sdk.Instrument{Name: "logins"},
			sdk.Stream{
				AttributeFilter: func(kv attribute.KeyValue) bool {
					return kv.Key == "user"
				},
			},
		)),
	).Meter("test")

	alice := attribute.NewSet(
		attribute.String("user", "Alice"),
		attribute.Bool("admin", true),
	)
	counter, err := meter.Int64ObservableUpDownCounter(
		"logins",
		metric.WithInt64Callback(func(_ context.Context, o metric.Int64Observer) error {
			o.Observe(1, metric.WithAttributeSet(alice))
			return nil
		}),
	)
	if err != nil {
		l.Fatal(err)
	}

	_, err = meter.RegisterCallback(func(_ context.Context, o metric.Observer) error {
		o.ObserveInt64(counter, 2, metric.WithAttributeSet(alice))
		return nil
	}, counter)
	if err != nil {
		l.Fatal(err)
	}

	var out metricdata.ResourceMetrics
	reader.Collect(context.Background(), &out)
	m := out.ScopeMetrics[0].Metrics[0]
	dpt := m.Data.(metricdata.Sum[int64]).DataPoints[0]
	l.Printf(
		"%s: %d (%s)",
		m.Name,
		dpt.Value,
		dpt.Attributes.Encoded(attribute.DefaultEncoder()),
	)
}

and running:

$ go run .
logins: 3 (user=Alice)

Produces a value of 3 for the reduced attribute set ((user=Alice)).

Expected behavior

The second example with a view should produce the same value as the first: 2. E.g.

$ go run .
logins: 2 (user=Alice)

Proposal

This behavior results from different treatment of values measured by the pre-computed sum aggregator:

// Aggregate records value with the unfiltered attributes attr.
//
// If a previous measurement was made for the same attribute set:
//
// - If that measurement's attributes were not filtered, this value overwrite
// that value.
// - If that measurement's attributes were filtered, this value will be
// recorded along side that value.
func (s *precomputedMap[N]) Aggregate(value N, attr attribute.Set) {
s.Lock()
v := s.values[attr]
v.measured = value
s.values[attr] = v
s.Unlock()
}
// aggregateFiltered records value with the filtered attributes attr.
//
// If a previous measurement was made for the same attribute set:
//
// - If that measurement's attributes were not filtered, this value will be
// recorded along side that value.
// - If that measurement's attributes were filtered, this value will be
// added to it.
//
// This method should not be used if attr have not been reduced by an attribute
// filter.
func (s *precomputedMap[N]) aggregateFiltered(value N, attr attribute.Set) { // nolint: unused // Used to agg filtered.
s.Lock()
v := s.values[attr]
v.filtered += value
s.values[attr] = v
s.Unlock()
}

The non-filtered values are treated as pre-computed, but the filtered values are all added to the non-filtered. There is no state tracking for the filtered value.

The correct behavior is to set all values, filtered or non-filtered, as the last value measured and when they are finally collected add all filtered values together and add to the non-filtered value.

The can be done by updating the implementation to not filter on measurement aggregation, but instead filtering on the collection operation.

By filtering during collection, it will produce the correct data, but it should also reduce the existing complexity around the pre-computed sums and filtering.

Design Issues

This means that using an attribute filter to reduce unbounded cardinality of attributes will not be effective against pre-computed sums.

This flaw is going to be a consequence of needing to produce the correct data regardless of the algorithm used.

Alternatives

Alternatively the existing design on filtering on aggregation input can be preserved and the filtered map can track all the raw unfiltered attribute sets that will eventually be filtered.

This approach will produce the same end result, but it will persist or add to the complex filtering logic already associated with the pre-computed sum aggregation.

@MrAlias MrAlias added bug Something isn't working pkg:SDK Related to an SDK package area:metrics Part of OpenTelemetry Metrics labels Jun 12, 2023
@MrAlias
Copy link
Contributor Author

MrAlias commented Jul 6, 2023

From the SIG meeting:

Alternatively, we should verify if the specification requires the "last-value-wins" implementation we have for multiple observations. If it does not, we could instead also just add multiple observations for non-filtered values together. This would preserve the filter-on-input implementation and keep views being useful for limiting cardinality.

@MrAlias
Copy link
Contributor Author

MrAlias commented Jul 6, 2023

From the supplementary guidelines:

When considering this matter, note also that the metrics API has a recommendation for each asynchronous instrument: User code is recommended not to provide more than one Measurement with the same attributes in a single callback.. Consider whether the impact of user error in this regard will impact the correctness of the view. When maintaining per-stream state for the purpose of View correctness, SDK authors may want to consider detecting when the user makes duplicate measurements. Without checking for duplicate measurements, Views may be calculated incorrectly.

This suggests the users shouldn't do this, but it does not specify explicitly how the situation should be handled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:metrics Part of OpenTelemetry Metrics bug Something isn't working pkg:SDK Related to an SDK package
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

2 participants