
metrics: wanted: histograms for skewed distributions #10015

Closed
knz opened this issue Oct 16, 2016 · 11 comments · Fixed by #85990
Labels
A-monitoring C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments

@knz
Contributor

knz commented Oct 16, 2016

Found in #9259: we'd like to keep histograms for the distribution of session memory sizes, which has a lot of small values but a few very large ones. Unfortunately, if one creates a histogram with a large maximum value (the maximum SQL memory size, which for the internal executor is MaxInt64 and for normal SQL sessions can run into gigabytes), two problems occur:

  1. a bug in the current implementation: the server hangs trying to create a very large array to support the histogram values.

  2. precision loss: all the small sizes end up in a single bucket because the range is very large.

Currently in #9259 the problem is worked around by tracking a histogram of the logarithm of the sizes, but it would be nice to have a new histogram type that does this natively.
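
For illustration, a minimal sketch of that workaround (hypothetical names, not the actual #9259 code): record the base-10 logarithm of each size into a small linear histogram, so each bucket covers a decade of byte sizes and small values stay distinguishable from gigabyte outliers.

```go
package main

import (
	"fmt"
	"math"
)

// logBuckets[i] counts sizes in [10^i, 10^(i+1)) bytes; 13 decades
// comfortably cover 1 byte through multiple terabytes.
var logBuckets [13]int64

func recordSize(sizeBytes int64) {
	if sizeBytes < 1 {
		sizeBytes = 1
	}
	i := int(math.Log10(float64(sizeBytes)))
	if i >= len(logBuckets) {
		i = len(logBuckets) - 1
	}
	logBuckets[i]++
}

func main() {
	for _, sz := range []int64{100, 4096, 1 << 30} {
		recordSize(sz)
	}
	// Small sizes and gigabyte outliers land in separate buckets.
	fmt.Println(logBuckets)
}
```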

Jira issue: CRDB-6145

@knz knz added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Oct 16, 2016
@a6802739
Contributor

@knz, I can take a look at it now.

@tbg
Member

tbg commented Oct 26, 2016

@knz I assume you tried using our metrics histogram type? I'm surprised that you would see 2), because the whole point of that histogram is that it spaces buckets relative to the size of the values they cover.
TBH, tracking the logarithm of the size seems pretty sane, and going off on a tangent to implement a new special-cased histogram type seems hardly justified.

@petermattis petermattis added this to the Later milestone Feb 23, 2017
@petermattis petermattis removed this from the Later milestone Oct 5, 2018
@ajwerner
Contributor

@knz this feels stale. Can it be closed?

@knz
Contributor Author

knz commented May 11, 2021

I feel the issue is still current but the problem is owned by Obs infra now.

@dhartunian what do you think?

@dhartunian
Collaborator

@knz yep, I'd like us to triage this and see what we can do. I'd be curious to replicate the issue.

@ajwerner
Contributor

ajwerner commented May 11, 2021

Ah right, so for our histograms we tend to use a hard-coded config that (as discussed in #64962) uses a very large number of buckets near zero and cannot really represent anything above hundreds of seconds, and even then barely. That's the thing with HDR histograms (roughly the only form of distribution supported by Prometheus and the like which can be combined to produce meaningful statistics; see https://prometheus.io/docs/practices/histograms/).

That issue and this one end up being quite similar. HDR histograms and their relatives (it doesn't really matter exactly how you pick the buckets -- it'll be somewhere between logarithmic and linear) can only cover a finite span at a given precision, and precision vs. size is a direct tradeoff. Histograms must be configured up front.

There are other approaches to distribution sketching (often called summaries). Generally these summaries can be more compact and dynamic. The problem with them tends to be that they either don't compose or, more to the point, that they do not have a decomposed storage format supported by the time-series databases we care about. Another knock against summaries is that they tend not to have a valid notion of subtraction (in fact, this feels like a philosophically deep property of the dynamism of these sketches). Prometheus, and tools like it generally, given monotonically increasing bucket counts, are able to provide an understanding of the distribution between any two samples; that's a very powerful property. Something has to give: you can be accurate, precise, and compact over a wide and dynamic range at the cost of less precise time ranges (accumulating t-digests over time windows), or you can have the histogram tradeoffs we have today (which would benefit from better tuning). You can bend the tradeoff curve a bit by using different time windowing schemes, spending more space to get more precise time windows. See discussion in tdunning/t-digest#127 and some exploration of the windowing ideas in https://pkg.go.dev/github.com/ajwerner/tdigest@v0.0.0-20190922175426-9165f388a6b2/windowed#TDigest.

Cockroach internally just stores the percentiles, but those cannot be aggregated across nodes (see #7896). They become even more meaningless in a serverless world.
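
As a concrete illustration of that subtraction property (a self-contained sketch, not CockroachDB code): because cumulative bucket counts only ever increase, the distribution of what happened between any two scrapes is just a per-bucket difference, which summaries and raw percentiles cannot offer.

```go
package main

import "fmt"

// A cumulative histogram snapshot: counts[i] is the number of
// observations <= upperBounds[i], monotonically non-decreasing
// across scrapes.
type snapshot struct {
	upperBounds []float64
	counts      []uint64
}

// between recovers the distribution of observations that occurred
// between two scrapes by subtracting per-bucket counts.
func between(t0, t1 snapshot) []uint64 {
	delta := make([]uint64, len(t1.counts))
	for i := range t1.counts {
		delta[i] = t1.counts[i] - t0.counts[i]
	}
	return delta
}

func main() {
	bounds := []float64{1, 10, 100, 1000}
	t0 := snapshot{bounds, []uint64{5, 9, 10, 10}}
	t1 := snapshot{bounds, []uint64{5, 30, 42, 43}}
	fmt.Println(between(t0, t1)) // [0 21 32 33]
}
```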

@knz
Contributor Author

knz commented May 11, 2021

Good point about the trade-offs.

In practice, though, skewed data tends to consist of one or two sub-distributions, so we could record the same metric using two or more histograms with different parameters to get precision in different areas.
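
A hypothetical sketch of that idea, using the hdrhistogram library CockroachDB used at the time (the 1 MiB threshold and precision parameters are made up for illustration):

```go
package main

import "github.com/codahale/hdrhistogram"

// The same metric recorded into two histograms with different
// parameters: one tuned for the dense low range, one for the
// sparse tail of very large values.
var (
	smallSizes = hdrhistogram.New(1, 1<<20, 2)     // up to 1 MiB, 2 sig figs
	largeSizes = hdrhistogram.New(1<<20, 1<<40, 1) // 1 MiB to 1 TiB, 1 sig fig
)

func record(size int64) {
	if size < 1<<20 {
		_ = smallSizes.RecordValue(size)
	} else {
		_ = largeSizes.RecordValue(size)
	}
}

func main() {
	for _, v := range []int64{512, 4096, 1 << 30} {
		record(v)
	}
}
```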

@knz
Contributor Author

knz commented May 11, 2021

The issue at the top was also me suggesting we implement a histogram type with a built-in logarithmic function, which would go 70% of the way.

@ajwerner
Contributor

Yeah, HDR histograms are already logarithmic. The library we're using is not super flexible to configure, but that's fine. I'd do something like `hdrhistogram.New(128, 1<<40 /* 1 TiB max value */, 1)`, which has fewer buckets than our existing latency histograms.

https://play.golang.org/p/txCn3RSj7Zk
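
In case the playground link rots, a sketch of what it presumably demonstrates (the comparison latency-histogram parameters here are assumed, not CockroachDB's actual values):

```go
package main

import (
	"fmt"

	"github.com/codahale/hdrhistogram"
)

func main() {
	// The proposed configuration: 128 bytes to 1 TiB at one
	// significant figure of precision.
	sizes := hdrhistogram.New(128, 1<<40, 1)
	// A typical latency-style configuration for comparison
	// (illustrative: 1ns to 10s at two significant figures).
	latency := hdrhistogram.New(1, int64(10e9), 2)
	// Distribution() returns one Bar per bucket, so its length
	// approximates the bucket count of each configuration.
	fmt.Println(len(sizes.Distribution()), len(latency.Distribution()))
}
```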

tbg added a commit to tbg/cockroach that referenced this issue May 11, 2022
Our current histogram is based on `hdrhistogram`. This tends to create
lots of buckets and is inflexible w.r.t. the bucket layout. In hindsight,
it was a poor choice of default implementation (I can say that since I
introduced it) and its cost is disincentivizing the introduction of
histograms that would be useful.

This commit introduces a histogram that is based on a completely vanilla
`prometheus.Histogram`. The only reason we need to wrap it is because we
want to export quantiles to CockroachDB's internal timeseries (it does
not support histograms) and this requires maintaining an internal windowed
histogram (on top of the cumulative histogram).

With this done, we can now introduce metrics with any kind of buckets we
want. Helpfully, we introduce two common kinds of buckets, suitable for
IO-type and RPC-type latencies. These are defined in a human-readable
format by explicitly listing out the buckets.

We can move existing metrics to HistogramV2 easily, assuming we are not
concerned with existing prometheus scrapers getting tripped up by the
changes in bucket boundaries. I assume this is not a big deal in
practice as long as it doesn't happen "all the time". In fact, I would
strongly suggest we move all metrics wholesale and remove the
hdrhistogram-based implementation. If this is not acceptable for some
reason, we ought to at least deprecate it.

We also slightly improve the existing `Histogram` code by unifying how
the windowed histograms are ticked and by making explicit where their
quantiles are recorded (this dependency was previously hidden via a
local interface assertion).

Resolves cockroachdb#10015.
Resolves cockroachdb#64962.
Alternative to dhartunian@eac3d06

TODO
- export quantiles (how to do this? Not clear, don't see the code
  lying around in prometheus, might have to hand-roll it but should
  be easy enough)

Release note: None
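
For reference, a minimal sketch of the approach the commit describes, using the upstream client library directly (the metric name and bucket values are illustrative, and this is not CockroachDB's actual HistogramV2 code; buckets here are in nanoseconds):

```go
package main

import (
	"fmt"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	dto "github.com/prometheus/client_model/go"
)

func main() {
	// A vanilla prometheus.Histogram with explicitly listed,
	// human-readable buckets, as suggested for IO-type latencies.
	h := prometheus.NewHistogram(prometheus.HistogramOpts{
		Name: "disk_write_latency",
		Help: "Latency of disk writes.",
		Buckets: []float64{
			float64(500 * time.Microsecond),
			float64(1 * time.Millisecond),
			float64(10 * time.Millisecond),
			float64(100 * time.Millisecond),
			float64(1 * time.Second),
			float64(10 * time.Second),
		},
	})
	h.Observe(float64(3 * time.Millisecond))

	// The cumulative state can be read back via the protobuf
	// representation; a windowed wrapper would diff two such
	// snapshots to approximate quantiles over a recent interval.
	var m dto.Metric
	_ = h.Write(&m)
	fmt.Println(m.Histogram.GetSampleCount())
}
```
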
@tbg
Member

tbg commented Jun 2, 2022

@lunevalex I'm not working on this.

@tbg tbg removed their assignment Jun 2, 2022
aadityasondhi pushed a commit to aadityasondhi/cockroach that referenced this issue Aug 11, 2022
aadityasondhi pushed a commit to aadityasondhi/cockroach that referenced this issue Aug 14, 2022
aadityasondhi pushed a commit to aadityasondhi/cockroach that referenced this issue Aug 16, 2022
aadityasondhi pushed a commit to aadityasondhi/cockroach that referenced this issue Aug 19, 2022
aadityasondhi pushed a commit to aadityasondhi/cockroach that referenced this issue Aug 24, 2022
craig bot pushed a commit that referenced this issue Aug 26, 2022
85990: metric: add prometheus-based histogram r=aadityasondhi a=aadityasondhi

From #81181.

Release justification: low risk, high benefit changes

86007: kvserver: log traces from replicate queue on errors or slow processing r=andreimatei a=AlexTalks

While we previously had some logging from the replicate queue as a
result of the standard queue logging, this change adds logging to the
replicate queue when there are errors in processing a replica, or when
processing a replica exceeds 50% of the timeout duration.
When there are errors or the duration threshold is exceeded,
any error messages are logged along with the collected tracing spans
from the operation.

Release note (ops change): Added logging on replicate queue processing
in the presence of errors or when the duration exceeds 50% of the
timeout.

Release justification: Low risk observability change.

86255: upgrades,upgrade,upgrades_test: Added an upgrade for updating invalid column ID in seq's back references r=Xiang-Gu a=Xiang-Gu

commit 1: a small refactoring of an existing function
commit 2: added a field to TenantDeps
commit 3: added a cluster version upgrade that attempts to update
invalid column IDs in sequences' back references due to bugs in prior versions.
See below for details.
commit 4: wrote a test for this newly added upgrade

Previously, in version 21.1 and prior, `ADD COLUMN DEFAULT nextval('s')`
would incorrectly store a 0-valued column ID in sequence `s`'s back reference
`dependedOnBy`, because we added this dependency before allocating an
ID to the newly added column. Customers ran into issues when upgrading
to v22.1, so we relaxed the validation logic as a quick short-term fix, as detailed
in #82859.

Now we want to give this issue a proper treatment by adding a cluster
upgrade (to v22.2) that attempts to detect such invalid column IDs and
update them with the correct values. This PR does exactly that.

Fixes: #83124

Release note: None

Release justification: fixes a release blocker by resolving invalid column IDs
appearing in sequences' back references, caused by bugs in older binaries.

Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Co-authored-by: Aaditya Sondhi <aadityas@cockroachlabs.com>
Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
Co-authored-by: Xiang Gu <xiang@cockroachlabs.com>
@craig craig bot closed this as completed in eb82a43 Aug 26, 2022
aadityasondhi pushed a commit to aadityasondhi/cockroach that referenced this issue Aug 26, 2022