
Add the histogram bucket bridge #3937

Merged: 16 commits, Aug 23, 2019
Handle arbitrary infinity buckets
mfpierre committed Aug 16, 2019
commit df2a290c0c3cfee756de5ad4c5dc1232f59e713b
23 changes: 17 additions & 6 deletions pkg/aggregator/check_sampler.go
@@ -6,6 +6,7 @@
package aggregator

import (
"math"
"time"

"github.com/DataDog/datadog-agent/pkg/aggregator/ckey"
@@ -64,6 +65,14 @@ func (cs *CheckSampler) newSketchSeries(ck ckey.ContextKey, points []metrics.Ske
}

func (cs *CheckSampler) addBucket(bucket *metrics.HistogramBucket) {
if bucket.Value < 0 {
log.Warnf("Negative bucket value %d for metric %s, discarding", bucket.Value, bucket.Name)
return
}
if bucket.Value == 0 {
// noop
return
}
contextKey := cs.contextResolver.trackContext(bucket, bucket.Timestamp)

// if we already saw the bucket we only send the delta
@@ -77,6 +86,10 @@ func (cs *CheckSampler) addBucket(bucket *metrics.HistogramBucket) {
cs.lastSeenBucket[contextKey] = time.Now()

// simple linear interpolation, TODO: optimize
if math.IsInf(bucket.UpperBound, 1) {
// Arbitrarily double the lower bucket value for interpolation over infinity bucket
Contributor:

This is likely going to require some documentation, is there any guideline or standard practice around this in the community?

mfpierre (Author), Aug 16, 2019:

I wasn't able to find any guideline beyond what the histogram_quantile function currently does: it requires the infinity bucket, but returns the closest bucket if the quantile falls into it:

> if the quantile falls into the highest bucket, the upper bound of the 2nd highest bucket is returned

ref: https://github.com/prometheus/prometheus/blob/41151ca8dc069448515f48893b8631b9a3ad8df8/promql/quantile.go#L49-L70

Contributor:

Ah. That's confusing and unfortunate, but at least it's deterministic. I'll never understand why they don't just add a max metric. But anyway, for now we should mimic this, I guess?

mfpierre (Author):

@jbarciauskas I'm not sure how to emulate this behavior with the sketch. Should we not try to interpolate anything with the infinity bucket (which would mess up the summary)? Or interpolate everything close to/at the value of "the upper bound of the 2nd highest" (close to what I'm currently doing)?

Contributor:

Yeah, I mean report the count of values in the infinity bucket as the last defined bucket value, essentially as if they were all = max. Either choice (2nd-to-last bucket = max, or 2 * 2nd-to-last bucket = max) is synthesizing data that doesn't exist; the first one at least means that query results should be similar.

mfpierre (Author):

Just changed the logic to insert everything at the lower_bound of the infinity bucket.

Contributor:

Choosing the lower_bound makes sense to me, but this needs to be very loudly documented since max is ~always going to be wrong in fairly surprising ways.

bucket.UpperBound = bucket.LowerBound * 2
}
bucketRange := bucket.UpperBound - bucket.LowerBound
if bucketRange < 0 {
log.Warnf(
@@ -85,18 +98,16 @@ func (cs *CheckSampler) addBucket(bucket *metrics.HistogramBucket) {
)
return
}
linearIncr := bucketRange / float64(bucket.Value)
currentVal := bucket.LowerBound
log.Tracef(
"Interpolating %d buckets over [%f-%f] with %f increment",
bucket.Value, bucket.LowerBound, bucket.UpperBound, linearIncr,
)
for i := 0; i < bucket.Value; i++ {
cs.sketchMap.insert(int64(bucket.Timestamp), contextKey, currentVal)
currentVal += linearIncr
}

log.Tracef("Adding bucket %v", bucket)
}
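As a rough illustration of the interpolation loop above (a standalone sketch, not the agent's code; `interpolateBucket` is a hypothetical helper):

```go
package main

import "fmt"

// interpolateBucket sketches the linear interpolation performed by addBucket:
// a bucket counting n values over [lower, upper) becomes n evenly spaced
// synthetic points, starting at the lower bound, that are fed to the sketch.
func interpolateBucket(lower, upper float64, n int) []float64 {
	incr := (upper - lower) / float64(n)
	points := make([]float64, 0, n)
	cur := lower
	for i := 0; i < n; i++ {
		points = append(points, cur)
		cur += incr
	}
	return points
}

func main() {
	// same shape as the first test case below: 4 values over [10, 20)
	fmt.Println(interpolateBucket(10, 20, 4)) // [10 12.5 15 17.5]
}
```

Note that the points start at the lower bound and stop one increment short of the upper bound, which is why the test expects 10, 12.5, 15 and 17.5 rather than values reaching 20.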

func (cs *CheckSampler) commitSeries(timestamp float64) {
71 changes: 71 additions & 0 deletions pkg/aggregator/check_sampler_test.go
@@ -9,12 +9,14 @@ import (
// stdlib
"sort"
"testing"
"time"

// 3p
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"

"github.com/DataDog/datadog-agent/pkg/metrics"
"github.com/DataDog/datadog-agent/pkg/quantile"
)

func TestCheckGaugeSampling(t *testing.T) {
@@ -191,3 +193,72 @@ func TestHistogramIntervalSampling(t *testing.T) {

assert.True(t, foundCount)
}

func TestCheckHistogramBucketSampling(t *testing.T) {
checkSampler := newCheckSampler()
checkSampler.bucketExpiry = 10 * time.Millisecond

bucket1 := &metrics.HistogramBucket{
Name: "my.histogram",
Value: 4.0,
LowerBound: 10.0,
UpperBound: 20.0,
Tags: []string{"foo", "bar"},
Timestamp: 12345.0,
}
checkSampler.addBucket(bucket1)
assert.Equal(t, len(checkSampler.lastBucketValue), 1)
assert.Equal(t, len(checkSampler.lastSeenBucket), 1)

checkSampler.commit(12349.0)
_, flushed := checkSampler.flush()

expSketch := &quantile.Sketch{}
// linear interpolated values
expSketch.Insert(quantile.Default(), 10.0, 12.5, 15.0, 17.5)

assert.Equal(t, 1, len(flushed))
metrics.AssertSketchSeriesEqual(t, metrics.SketchSeries{
Name: "my.histogram",
Tags: []string{"foo", "bar"},
Points: []metrics.SketchPoint{
{Ts: 12345.0, Sketch: expSketch},
},
ContextKey: generateContextKey(bucket1),
}, flushed[0])

bucket2 := &metrics.HistogramBucket{
Name: "my.histogram",
Value: 6.0,
LowerBound: 10.0,
UpperBound: 20.0,
Tags: []string{"foo", "bar"},
Timestamp: 12400.0,
}
checkSampler.addBucket(bucket2)
assert.Equal(t, len(checkSampler.lastBucketValue), 1)
assert.Equal(t, len(checkSampler.lastSeenBucket), 1)

checkSampler.commit(12401.0)
_, flushed = checkSampler.flush()

expSketch = &quantile.Sketch{}
// linear interpolated values (only 2 since we stored the delta)
expSketch.Insert(quantile.Default(), 10.0, 15.0)

assert.Equal(t, 1, len(flushed))
metrics.AssertSketchSeriesEqual(t, metrics.SketchSeries{
Name: "my.histogram",
Tags: []string{"foo", "bar"},
Points: []metrics.SketchPoint{
{Ts: 12400.0, Sketch: expSketch},
},
ContextKey: generateContextKey(bucket1),
}, flushed[0])

// garbage collection
time.Sleep(11 * time.Millisecond)
checkSampler.flush()
assert.Equal(t, len(checkSampler.lastBucketValue), 0)
assert.Equal(t, len(checkSampler.lastSeenBucket), 0)
}
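The second half of the test relies on the sampler's delta behavior: Prometheus bucket counts are cumulative, so only the observations new since the last flush are inserted (4 points first, then 2). A minimal sketch of that bookkeeping (`bucketDelta` is a hypothetical helper, and the counter-reset fallback is an assumption not shown in this diff):

```go
package main

import "fmt"

// bucketDelta sketches the "only send the delta" logic: lastSeen maps a
// context key to the previously observed cumulative count, and only the
// growth since then is reported.
func bucketDelta(lastSeen map[string]int64, key string, value int64) int64 {
	delta := value - lastSeen[key]
	lastSeen[key] = value
	if delta < 0 {
		// assumed counter-reset handling (e.g. the monitored process
		// restarted): fall back to the full new value
		return value
	}
	return delta
}

func main() {
	lastSeen := map[string]int64{}
	fmt.Println(bucketDelta(lastSeen, "my.histogram", 4)) // first scrape: 4
	fmt.Println(bucketDelta(lastSeen, "my.histogram", 6)) // next scrape: 2 new values
}
```

This mirrors why the second flush in the test interpolates only 2 points (6 - 4) over the same [10, 20) range.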