Skip to content

Remove sum from Summary#117

Merged
brian-brazil merged 2 commits intoprometheus:masterfrom
felixbarny:patch-1
Aug 10, 2016
Merged

Remove sum from Summary#117
brian-brazil merged 2 commits intoprometheus:masterfrom
felixbarny:patch-1

Conversation

@felixbarny
Copy link
Contributor

Just summing all snapshot values does not work with dropwizard metrics, because the values array does not hold all values but only a sample of recent values.

Just summing all snapshot values does not work with
dropwizard metrics, because the values array does not
hold all values but only a sample of recent values.
@brian-brazil
Copy link
Contributor

@dopuskh3

I could have sworn there was a way to get this data.

new MetricFamilySamples.Sample(name, Arrays.asList("quantile"), Arrays.asList("0.99"), snapshot.get99thPercentile() * factor),
new MetricFamilySamples.Sample(name, Arrays.asList("quantile"), Arrays.asList("0.999"), snapshot.get999thPercentile() * factor),
new MetricFamilySamples.Sample(name + "_count", new ArrayList<String>(), new ArrayList<String>(), count),
new MetricFamilySamples.Sample(name + "_sum", new ArrayList<String>(), new ArrayList<String>(), sum * factor)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing sum will make prometheus histogram useless. Don't we have a way to get this value elsewhere?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still have the quantiles, so it just stops us getting the average.

@felixbarny
Copy link
Contributor Author

Unfortunately, there actually isn't because the Timer only contains data from approximately the last 5 minutes.

One (very bad) estimation would be timer.getSnapshot().getMean() * timer.getCount(). But this value is equally incorrect as the mean is only representative for the last 5 minutes.

Do some prometheus query depend on the sum? Which and why?

@dopuskh3
Copy link

My understanding is that it will work with uniform reservoirs (http://metrics.dropwizard.io/3.1.0/manual/core/#uniform-reservoirs) but not with exponentially decaying reservoirs (defaults for histogram and timer in dropwizard). In this case (uniform reservoir), getValues(1.0) should work.

@felixbarny
Copy link
Contributor Author

felixbarny commented May 24, 2016

getValue(1.0) returns the max value (100th percentile). Summing the values returned by getValues() also does not work for the UniformReservoir, because it only holds 1028 representative values. This would work for UniformReservoirs: timer.getSnapshot().getMean() * timer.getCount(). But like you said, the default is the ExponentiallyDecayingReservoir where it is not possible to get the sum.

@brian-brazil
Copy link
Contributor

So shall we go ahead and remove this then?

@felixbarny
Copy link
Contributor Author

I can't say how this would affect prometheus histogram queries if there is no sum but the current sum value is just plain wrong.

@brian-brazil brian-brazil merged commit e9e5395 into prometheus:master Aug 10, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants