Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metrics: apply selected changes from upstream #28031

Closed
wants to merge 3 commits into from

Conversation

holiman
Copy link
Contributor

@holiman holiman commented Aug 30, 2023

Back in 2018 with PR #15910, we took an external metrics-lib and made it internal.
Since then, we have applied updates, and also some updates have been applied in the source repo. In this PR, I will cherry-pick and adapt some of those changes that make sense.

name            old time/op  new time/op  delta
EWMA-8          58.8ns ±17%  59.0ns ± 1%     ~     (p=0.690 n=5+5)
EWMAParallel-8   217ns ± 7%   151ns ± 2%  -30.53%  (p=0.008 n=5+5)

For meter, the change becomes lock-free, but it slightly changes the semantics too (same way upstream also does it: https://github.com/rcrowley/go-metrics/blob/master/meter.go ) . It is not racy, as in, the golang race detector would not find any flaws. That said, with the locks removed, it is possible that during short intervals, the results are inconsistent. If an update happens while a tick happens, the rate1 ewma may contain a datapoint that has not been added to the rate5 ewma. I don't think it's a problem, and neither does upstream, but still worth mentioning.

@holiman holiman force-pushed the metrics_update branch 2 times, most recently from 49f73f5 to 853ab0c Compare August 30, 2023 14:44
@holiman
Copy link
Contributor Author

holiman commented Aug 31, 2023

The metrics library has some peculiar designs. And after diving into it a bit, I figured out how those peculiar patterns emerged.

Origin

First, start off with a minimal representation of a 'point':

type point struct {
 x, y int
}

And since we want to do updates, and deliver a point readonly to an external caller from time to time,

type updatingPoint struct { 
  p point
  mu sync.Mutex
}

We keep an internal p, and whenever someone requests a Snapshot(), we deliver the internal p, and replace it with a new one that we do updates on.

We already have a mutex for when to deliver/replace a Snapshot(), so we might aswell use the same mu to guard modifications to the internal "readonly" p:

func (u *updatingPoint) setX(int newX){
   mu.Lock()
  defer mu.Unlock()
  u.p.x = newX
}

Change

At some point, someone made the int-types atomic. And the latter mutex-usage could be removed:

type point struct { 
  x atomic.Uint64
  y atomic.Uint64   
}

func (u *updatingPoint) setX(int newX){
  u.p.x.Store(newX)
}

This was not a great change.

  • Now the "readonly" representation contains atomics, just because we use the 'readonly' representation as the interal representation.

A more suitable change, at this point, would have been to redesign it to deliver atomic-free minimal "readonly" snapshot, but instead use an atomic-based internal representation:

// A true readonly point
type point struct {
 x, y int
}

type updatingPoint struct { 
  x atomic.Uint64
  y atomic.Uint64   
}

func (u *updatingPoint) Snapshot(){
    return &point{ x.Load(), y.Load() } 
}

func (u *updatingPoint) setX(int newX){
  p.x.Store(newX)
}

And along the way, when the mu was relaxed to allow updates while Snapshot was executing, the semantics changed ever so slightly so that while it was not "crashy racy", it was "semantically racy". Which I think is ok.

I have made this change to meter in this PR

@holiman
Copy link
Contributor Author

holiman commented Aug 31, 2023

Hm, an even better, clear and logical restructuring is to define it thus:

type MeterSnapshot interface {
	Count() int64
	Rate1() float64
	Rate5() float64
	Rate15() float64
	RateMean() float64
}

// Meters count events to produce exponentially-weighted moving average rates
// at one-, five-, and fifteen-minutes and a mean rate.
type Meter interface {
	Mark(int64)
	Snapshot() MeterSnapshot
	Stop()
}

I have a follow-up PR for this. It removes a lot of code, and a lot of edgecases: in practice, we never read on the metrics, only on the snapshots. But due to the mandate that they must be readable, it makes the synchronization more complex than it needs to be.

@holiman holiman marked this pull request as ready for review August 31, 2023 13:05
@holiman holiman changed the title [WIP] Metrics update metrics: apply selected changes from upstream Aug 31, 2023
ref rcrowley/go-metrics@9073533

Remove locks from the hot-path in meter.go and add parallel benchmark

metrics: simplify meter.go
metrics: refactor meter
@holiman
Copy link
Contributor Author

holiman commented Sep 5, 2023

I would prefer to merge the larger PR, over this one. This is an ok PR, but with the read/write split happening in #28035, it becomes easier to understand the semantics regarding concurrency and locking. So although in total that one is larger, I think piecemeal-wise, it is easier for a reviewer to understand.

@holiman
Copy link
Contributor Author

holiman commented Sep 13, 2023

Closing this in favour of #28035

@holiman holiman closed this Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant