Receive: fix thanos_receive_write_{timeseries,samples} stats (thanos-io#7643)

* Revert "Receive: fix stats (thanos-io#7373)"

This reverts commit 66841fb.

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>

* Receive: fix `thanos_receive_write_{timeseries,samples}` stats

There are two paths by which data can be written to a receiver: through
the HTTP endpoint or through the gRPC endpoint.
`thanos_receive_write_{timeseries,samples}` only count the
timeseries/samples received through the HTTP endpoint.

So there is no risk that a sample will be counted twice, once as a
remote write and once as a local write. On the other hand, we still need
to account for the replication factor, and counting only local writes is
not enough, as there might be no local writes at all (e.g. in RouterOnly
mode).

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>

---------

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
cincinnat authored and jnyi committed Oct 16, 2024
1 parent 51c3df7 commit 3c6c795
Showing 2 changed files with 31 additions and 24 deletions.
2 changes: 1 addition & 1 deletion docs/components/receive.md
@@ -311,7 +311,7 @@ Please see the metric `thanos_receive_forward_delay_seconds` to see if you need
 
 The following formula is used for calculating quorum:
 
-```go mdox-exec="sed -n '990,999p' pkg/receive/handler.go"
+```go mdox-exec="sed -n '999,1008p' pkg/receive/handler.go"
 func (h *Handler) writeQuorum() int {
 	// NOTE(GiedriusS): this is here because otherwise RF=2 doesn't make sense as all writes
 	// would need to succeed all the time. Another way to think about it is when migrating
53 changes: 30 additions & 23 deletions pkg/receive/handler.go
@@ -711,33 +711,40 @@ type remoteWriteParams struct {
 	alreadyReplicated bool
 }
 
-func (h *Handler) gatherWriteStats(writes ...map[endpointReplica]map[string]trackedSeries) tenantRequestStats {
-	var stats tenantRequestStats = make(tenantRequestStats)
-
-	for write := range writes {
-		for er := range write {
-			for tenant, series := range write[er] {
-				samples := 0
-
-				for _, ts := range series.timeSeries {
-					samples += len(ts.Samples)
-				}
-
-				if st, ok := stats[tenant]; ok {
-					st.timeseries += len(series.timeSeries)
-					st.totalSamples += samples
-
-					stats[tenant] = st
-				} else {
-					stats[tenant] = requestStats{
-						timeseries:   len(series.timeSeries),
-						totalSamples: samples,
-					}
+func (h *Handler) gatherWriteStats(rf int, writes ...map[endpointReplica]map[string]trackedSeries) tenantRequestStats {
+	var stats tenantRequestStats = make(tenantRequestStats)
+
+	for _, write := range writes {
+		for er := range write {
+			for tenant, series := range write[er] {
+				samples := 0
+
+				for _, ts := range series.timeSeries {
+					samples += len(ts.Samples)
+				}
+
+				if st, ok := stats[tenant]; ok {
+					st.timeseries += len(series.timeSeries)
+					st.totalSamples += samples
+
+					stats[tenant] = st
+				} else {
+					stats[tenant] = requestStats{
+						timeseries:   len(series.timeSeries),
+						totalSamples: samples,
+					}
 				}
 			}
 		}
 	}
+
+	// adjust counters by the replication factor
+	for tenant, st := range stats {
+		st.timeseries /= rf
+		st.totalSamples /= rf
+		stats[tenant] = st
+	}
 
 	return stats
 }

@@ -768,7 +775,7 @@ func (h *Handler) fanoutForward(ctx context.Context, params remoteWriteParams) (
 		return stats, err
 	}
 
-	stats = h.gatherWriteStats(localWrites)
+	stats = h.gatherWriteStats(len(params.replicas), localWrites, remoteWrites)
 
 	// Prepare a buffered channel to receive the responses from the local and remote writes. Remote writes will all go
 	// asynchronously and with this capacity we will never block on writing to the channel.
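
To illustrate the reasoning in the commit message, here is a minimal, self-contained sketch. It is not the Thanos implementation: `series`, `requestStats`, and `gatherStats` are hypothetical stand-ins, and endpoints are plain strings rather than `endpointReplica` values. It sums series and samples over every local and remote write, then divides by the replication factor, so a router-only node with no local writes still reports the size of the original request.

```go
package main

import "fmt"

// series is a hypothetical stand-in for a tracked time series;
// it only records how many samples the series carries.
type series struct {
	samples int
}

// requestStats mirrors the two counters adjusted in the diff.
type requestStats struct {
	timeseries   int
	totalSamples int
}

// gatherStats sums per-tenant counters over all writes (endpoint -> tenant -> series)
// and then divides them by the replication factor rf.
func gatherStats(rf int, writes ...map[string]map[string][]series) map[string]requestStats {
	stats := make(map[string]requestStats)

	for _, write := range writes {
		for _, tenants := range write {
			for tenant, ss := range tenants {
				st := stats[tenant] // zero value if the tenant is new
				st.timeseries += len(ss)
				for _, s := range ss {
					st.totalSamples += s.samples
				}
				stats[tenant] = st
			}
		}
	}

	// Each series was written rf times, so undo the replication.
	for tenant, st := range stats {
		st.timeseries /= rf
		st.totalSamples /= rf
		stats[tenant] = st
	}

	return stats
}

func main() {
	// Router-only case: nothing is written locally, and the same two series
	// (three samples each) are forwarded to three remote replicas.
	local := map[string]map[string][]series{}
	remote := map[string]map[string][]series{
		"replica-1": {"tenant-a": {{samples: 3}, {samples: 3}}},
		"replica-2": {"tenant-a": {{samples: 3}, {samples: 3}}},
		"replica-3": {"tenant-a": {{samples: 3}, {samples: 3}}},
	}

	st := gatherStats(3, local, remote)["tenant-a"]
	fmt.Printf("timeseries=%d samples=%d\n", st.timeseries, st.totalSamples)
	// prints "timeseries=2 samples=6"
}
```

Counting only `local` here would report zero for `tenant-a`. Note that the integer division is exact only when every series appears in exactly `rf` writes, which this sketch assumes.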