
Configurable active user series #3153

Merged: pstibrany merged 24 commits into cortexproject:master on Sep 16, 2020

Conversation

@pstibrany (Contributor) commented Sep 10, 2020

What this PR does: This PR makes tracking of active series in the blocks storage configurable. There are three parameters: how often to purge old series from memory and update the metrics, how old a series must be to be considered inactive, and a flag to enable active series tracking (disabled by default).

This makes it possible to track active series more precisely, even for series that are still in the TSDB head but haven't had any samples appended recently.

In the current state of the PR, this is only wired into the blocks engine, not the chunks engine.
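For illustration, this is how the knobs discussed here could be set on the command line. The idle-timeout flag name below is my assumption for the "how old must a series be" knob; the other two flags appear verbatim later in this thread:

-ingester.active-series-enabled=true
-ingester.active-series-update-period=1m
-ingester.active-series-idle-timeout=10m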

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pracucci (Contributor) left a comment

Thanks for working on it! I have a few concerns:

  1. The new metric adds cognitive load for users trying to understand the differences between the chunks and blocks storage. I don't have an answer, but if it could work for the chunks storage too it would probably be easier to understand.
  2. I'm a bit scared of the lock contention in ActiveSeries (because it's an exclusive lock). I'm wondering whether, if we switch the entry timestamp to an atomic, a RWMutex could be used (most of the time we would only acquire the read lock, because we expect the series to already be in the map). A sketch of that approach follows below.
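A minimal sketch of the idea being discussed, assuming a striped map keyed by fingerprint (names here are illustrative, not the PR's exact code): the hot path takes only the read lock and bumps the timestamp atomically; the write lock is needed only when the series is missing.

package ingester

import (
	"sync"
	"time"

	"go.uber.org/atomic"
)

type seriesEntry struct {
	// Pointer, not value: entries may live in growable containers,
	// and the pointed-to integer must never move.
	nanos *atomic.Int64
}

type seriesStripe struct {
	mu     sync.RWMutex
	active map[uint64]*seriesEntry
}

func (s *seriesStripe) updateTimestamp(fp uint64, now time.Time) {
	s.mu.RLock()
	e := s.active[fp]
	s.mu.RUnlock()

	if e != nil {
		// Common case: the series is already tracked, so the read
		// lock was enough and the update is a single atomic store.
		e.nanos.Store(now.UnixNano())
		return
	}

	s.mu.Lock()
	defer s.mu.Unlock()
	if e := s.active[fp]; e != nil { // re-check: another goroutine may have added it
		e.nanos.Store(now.UnixNano())
		return
	}
	s.active[fp] = &seriesEntry{nanos: atomic.NewInt64(now.UnixNano())}
}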

@@ -103,6 +105,8 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
f.DurationVar(&cfg.MetadataRetainPeriod, "ingester.metadata-retain-period", 10*time.Minute, "Period at which metadata we have not seen will remain in memory before being deleted.")

f.DurationVar(&cfg.RateUpdatePeriod, "ingester.rate-update-period", 15*time.Second, "Period with which to update the per-user ingestion rates.")
f.DurationVar(&cfg.ActiveSeriesUpdatePeriod, "ingester.active-series-update-period", 5*time.Minute, "How often to update active series metrics (blocks engine only).")
Contributor:

I would default to 1m, to have more accurate tracking (by default).

@pstibrany (Contributor, Author):

> Thanks for working on it! I have a few concerns:
>
> 1. The new metric adds cognitive load for users trying to understand the differences between the chunks and blocks storage. I don't have an answer, but if it could work for the chunks storage too it would probably be easier to understand.

I agree. Let's do that. (Reason why I kept chunks out of this PR is basically to first see how well it works in practice.)

> 2. I'm a bit scared of the lock contention in ActiveSeries (because it's an exclusive lock). I'm wondering whether, if we switch the entry timestamp to an atomic, a RWMutex could be used (most of the time we would only acquire the read lock, because we expect the series to already be in the map).

I will write a benchmark to see throughput of both approaches.

@pstibrany (Contributor, Author):

Benchmark showing the improvement from using an RW lock and atomics to update timestamps:

name                                  old time/op    new time/op    delta
ActiveSeriesTest_single_series/50-4      154ns ± 2%      76ns ± 1%  -50.52%  (p=0.008 n=5+5)
ActiveSeriesTest_single_series/100-4     156ns ± 1%      76ns ± 1%  -51.25%  (p=0.008 n=5+5)
ActiveSeriesTest_single_series/500-4     158ns ± 1%      76ns ± 2%  -52.01%  (p=0.008 n=5+5)

name                                  old alloc/op   new alloc/op   delta
ActiveSeriesTest_single_series/50-4      0.00B          0.00B          ~     (all equal)
ActiveSeriesTest_single_series/100-4     0.00B          0.00B          ~     (all equal)
ActiveSeriesTest_single_series/500-4     0.00B          0.00B          ~     (all equal)

name                                  old allocs/op  new allocs/op  delta
ActiveSeriesTest_single_series/50-4       0.00           0.00          ~     (all equal)
ActiveSeriesTest_single_series/100-4      0.00           0.00          ~     (all equal)
ActiveSeriesTest_single_series/500-4      0.00           0.00          ~     (all equal)
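For context, a benchmark of this shape could look like the following (a sketch reusing the stripe type from the earlier snippet; I'm assuming the /50, /100, /500 suffixes parameterize how many series are pre-populated). The old/new columns are then typically produced by running each version with go test -bench -count=5 and comparing the outputs with benchstat.

package ingester

import (
	"strconv"
	"testing"
	"time"
)

func BenchmarkActiveSeriesTest_single_series(b *testing.B) {
	for _, n := range []int{50, 100, 500} {
		b.Run(strconv.Itoa(n), func(b *testing.B) {
			s := &seriesStripe{active: make(map[uint64]*seriesEntry, n)}
			now := time.Now()
			for j := 0; j < n; j++ { // pre-populate n tracked series
				s.updateTimestamp(uint64(j), now)
			}
			b.ResetTimer()
			for i := 0; i < b.N; i++ { // hammer a single existing series
				s.updateTimestamp(0, now)
			}
		})
	}
}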

func (s *activeSeriesStripe) updateSeriesTimestamp(now time.Time, series labels.Labels, fp model.Fingerprint, labelsCopy func(labels.Labels) labels.Labels) {
nowNanos := now.UnixNano()

e := s.findEntryForSeries(fp, series)
Contributor:

I think there's a race condition: we get a reference to an activeSeriesEntry and then update it. In the meantime, the purge could remove it, because the read lock is held only during findEntryForSeries(). However, the next remote-write push would add it back, right? If so, LGTM, but I would like your opinion too.

pstibrany (Contributor, Author):

Well spotted. Should be fixed now. (There was another problem: since the atomic integer was not a pointer but part of the entry struct, which was stored in a slice as a plain struct, it could actually move in memory when the slice grows. That's why the atomic integers are now pointers.)
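To illustrate the hazard (a hypothetical, self-contained example, not the PR's code): when append reallocates the slice's backing array, a previously taken pointer to an inline atomic field keeps pointing into the old array, and updates through it are silently lost.

package main

import (
	"fmt"

	"go.uber.org/atomic"
)

type entry struct {
	nanos atomic.Int64 // stored inline: moves when the slice is reallocated
}

func main() {
	entries := make([]entry, 0, 1)
	entries = append(entries, entry{})

	p := &entries[0].nanos // take a pointer to the inline atomic

	// Appending beyond capacity reallocates the backing array and
	// copies the elements; p still points into the old array.
	entries = append(entries, entry{})

	p.Store(42)
	fmt.Println(entries[0].nanos.Load()) // prints 0: the update was lost

	// The fix described above: make the field a *atomic.Int64, so the
	// pointed-to integer never moves when the slice grows.
}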

Contributor:

> There was another problem: since the atomic integer was not a pointer but part of the entry struct, which was stored in a slice as a plain struct, it could actually move in memory when the slice grows. That's why the atomic integers are now pointers.

Well spotted!

@gouthamve (Contributor):

> The new metric adds cognitive load for users trying to understand the differences between the chunks and blocks storage. I don't have an answer, but if it could work for the chunks storage too it would probably be easier to understand.

I'd be okay adding this to chunks store too. How much overhead will this add?

@pstibrany (Contributor, Author) commented Sep 15, 2020

> > The new metric adds cognitive load for users trying to understand the differences between the chunks and blocks storage. I don't have an answer, but if it could work for the chunks storage too it would probably be easier to understand.
>
> I'd be okay adding this to chunks store too. How much overhead will this add?

By my count, each activeSeriesEntry needs 40 bytes on its own, plus a copy of the labels. For 1M series that's at least 40 MB, and if we estimate labels at 200 bytes per series, that's an additional 200 MB.

(In the blocks storage, we try to reuse the labels copy if the series was added to the ref cache at the same time. Doing the same in chunks is not easy.)
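For intuition on the 40-byte figure, an assumed layout on 64-bit (not necessarily the PR's exact struct):

package ingester

import (
	"github.com/prometheus/prometheus/pkg/labels"
	"go.uber.org/atomic"
)

type activeSeriesEntry struct {
	lbs   labels.Labels // slice header: 24 bytes on 64-bit
	nanos *atomic.Int64 // 8-byte pointer + the 8-byte integer it points to
}

// 24 + 8 + 8 = 40 bytes per entry, before counting the label strings
// themselves (estimated above at ~200 bytes per series).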

@pracucci (Contributor) left a comment

Excellent job. I left a couple of nits. As discussed offline, I would add an enabled flag (defaulting to false) so that not every Cortex user pays the cost of this tracking when it isn't required (it is usually required by vendors' billing systems).


Commits (all signed off by Peter Štibraný <peter.stibrany@grafana.com>) include:

- Active series are exported as metric.
- Also use atomic to keep oldest timestamp.
- …s is updated when slice in the stripe map is appended. Avoid race when newly added entry is removed by purger due to timestamp being set to 0.
- …s to be in a loop).
- …nction.
@pstibrany (Contributor, Author):

> Excellent job. I left a couple of nits. As discussed offline, I would add an enabled flag (defaulting to false) so that not every Cortex user pays the cost of this tracking when it isn't required (it is usually required by vendors' billing systems).

Added the -ingester.active-series-enabled flag (defaults to false), and switched fingerprinting to xxHash 64-bit, which is faster than fnv64a (the difference is more visible for bigger labels, so the benchmark now uses bigger labels) but requires an extra small allocation:

name                         old time/op    new time/op    delta
ActiveSeries_UpdateSeries-4    1.03µs ± 6%    0.74µs ±17%  -28.25%  (p=0.008 n=5+5)

name                         old alloc/op   new alloc/op   delta
ActiveSeries_UpdateSeries-4      206B ± 0%      253B ±11%  +23.01%  (p=0.008 n=5+5)

name                         old allocs/op  new allocs/op  delta
ActiveSeries_UpdateSeries-4      2.00 ± 0%      3.00 ± 0%  +50.00%  (p=0.008 n=5+5)
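For illustration, fingerprinting a label set with xxHash64 might look like this (a sketch, assuming the cespare/xxhash package; the separator byte and buffer handling are my assumptions, and building the byte buffer is the "extra small allocation" mentioned above):

package ingester

import (
	"github.com/cespare/xxhash"
	"github.com/prometheus/prometheus/pkg/labels"
)

// fingerprint hashes a label set with xxHash64.
func fingerprint(ls labels.Labels) uint64 {
	b := make([]byte, 0, 256) // the extra allocation
	for _, l := range ls {
		b = append(b, l.Name...)
		b = append(b, 0xff) // separator byte that cannot appear in valid UTF-8
		b = append(b, l.Value...)
		b = append(b, 0xff)
	}
	return xxhash.Sum64(b)
}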

@pstibrany (Contributor, Author):

I've also made sure that if -ingester.active-series-enabled is false, the new metric doesn't appear in the exported metrics at all. The previous version of the chunks code actually exported a 0 value for each user, which is useless. Unfortunately this made the PR bigger :( (the alternative would have been to not register the metric with the metrics registry, but still create all the in-memory structures).
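A sketch of what that gating can look like (names and the "user" label are assumptions; only the metric name is taken from this PR): when the flag is off, neither the in-memory structure nor the metric is created, so nothing is exported.

package ingester

import "github.com/prometheus/client_golang/prometheus"

type activeSeriesTracking struct {
	gauge *prometheus.GaugeVec
}

func newActiveSeriesTracking(enabled bool, reg prometheus.Registerer) *activeSeriesTracking {
	if !enabled {
		// No structures, no metric: the series simply never appears
		// in the /metrics output.
		return nil
	}
	t := &activeSeriesTracking{
		gauge: prometheus.NewGaugeVec(prometheus.GaugeOpts{
			Name: "cortex_ingester_active_series",
			Help: "Number of currently active series per user.",
		}, []string{"user"}),
	}
	reg.MustRegister(t.gauge)
	return t
}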

@pracucci (Contributor) left a comment

Good job, still LGTM! Left final nits.

(Three follow-up commits, signed off by Peter Štibraný <peter.stibrany@grafana.com>.)
@pstibrany merged commit e265199 into cortexproject:master on Sep 16, 2020
@bboreham (Contributor):

> In the current state of the PR, this is only wired into the blocks engine, not the chunks engine.

Is this still true?

@pstibrany (Contributor, Author):

> > In the current state of the PR, this is only wired into the blocks engine, not the chunks engine.
>
> Is this still true?

Not anymore. Also, there is now a flag to enable this, which is disabled by default.

pstibrany pushed a commit that referenced this pull request Jun 8, 2021
* Enable active series metrics in the ingester by default

This change calculates and exports the `cortex_ingester_active_series`
metric by default. Up to this point, the metric was disabled by default
since calculating it consumes some amount of memory.

The original PR (#3153) estimated at least 40MB for 1M active series,
plus up to another 200MB depending on our luck reusing labels from the
ref cache.

We (Grafana) have been running with this setting enabled on our ingesters
for some time and the resource usage doesn't appear to be significant.
This feature appears to add between 1.2% and 1.6% of memory usage when
enabled: ~140MB out of a total of ~10GB of memory used per ingester.

The ingesters I measured this on:

* Have multiple tenants running production workloads
* Have about 1.3M active series each
* Have about a 10GB working set (as measured by `kubectl top` and exported
  k8s metrics)

Based on this and the utility of the metric itself, I'd like to enable it
by default.

Screenshots of the pprof heap output attached.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>