Skip to content

test(red-197205): regression test for zero-rate stats with --reconnect-interval#384

Open
fcostaoliveira wants to merge 3 commits into
redis:masterfrom
filipecosta90:test/red-197205-reconnect-interval-rates
Open

test(red-197205): regression test for zero-rate stats with --reconnect-interval#384
fcostaoliveira wants to merge 3 commits into
redis:masterfrom
filipecosta90:test/red-197205-reconnect-interval-rates

Conversation

@fcostaoliveira
Copy link
Copy Markdown
Collaborator

@fcostaoliveira fcostaoliveira commented May 17, 2026

Summary

  • Pins down the RED-197205 invariant: running with `--reconnect-interval=1` must not zero out `Ops/sec`, `KB/sec`, `Hits/sec` or `Misses/sec` in the final JSON summary.
  • The bug itself was already fixed on master by 0228bac (PR Add client staircase ramp-up for incremental connection growth #350Add client staircase ramp-up) as a side-effect of the `m_started` atomic flag added to skip never-started clients in `merge_run_stats()` / `aggregate_inst_histogram()`. RED-197205 was never called out by name in that PR, so this test makes the invariant explicit.

Why this matters

Before PR #350, `client_group::merge_run_stats()` iterated all clients including those that were `prepare()`-d but never reached `set_start_time()`. Their zeroed `m_start_time` was factorial-averaged into the aggregate, pulling it toward the epoch. With `m_end_time` correct but `m_start_time` near zero, `test_duration_usec` became huge and every `ops / test_duration_usec * 1000000` division collapsed to ~0. The latency fields were correct because they're computed independently of `test_duration_usec` (`accumulated_latency / count`), which is exactly the asymmetry the bug report observed.

Verification

Locally reproduced + confirmed the fix using the test's exact workload (`-t 4 -c 4 --test-time=5 --ratio=1:1 --reconnect-interval=1`):

Binary Totals.Ops/sec Totals.KB/sec Totals.Misses/sec Totals.Latency
v2.2.1 0.00 0.00 0.00 0.249
master 52320 3035 26114 0.249

The test would have caught the bug on v2.2.1; master passes cleanly.

Test-shape notes

  • The bug only fires with `--test-time` (not `-n`). With `-n`, every client runs to completion and reaches `set_start_time`, so the zero-state merge path is never hit. `_build_benchmark` gains an optional `test_time` parameter that's mutually exclusive with `requests` (memtier rejects both).
  • Test runtime: ~5 s + harness overhead.

Test plan

  • Test passes on master.
  • Test would have failed on v2.2.1 (`Ops/sec=0.00`).
  • `make format-check-staged` (pre-commit hook) passes.
  • Full RLTest matrix on CI.

🤖 Generated with Claude Code


Note

Low Risk
Low risk: changes are confined to the test suite, adding a new regression test and slightly relaxing latency tolerance for low-count time-series buckets.

Overview
Adds a new RLTest regression case (test_reconnect_interval_does_not_zero_out_rates) to ensure --reconnect-interval=1 does not produce all-zero throughput rate fields (Ops/sec, KB/sec, Misses/sec) in the final JSON summary, while re-validating existing latency consistency checks.

Updates the JSON time-series latency consistency check to account for 1ms quantization error in Accumulated Latency at low counts, and extends _build_benchmark with an optional test_time mode (mutually exclusive with requests) to support time-bounded runs.

Reviewed by Cursor Bugbot for commit b90142c. Bugbot is set up for automated code reviews on this repo. Configure here.

paulorsousa
paulorsousa previously approved these changes May 17, 2026
fcostaoliveira added a commit to filipecosta90/memtier_benchmark that referenced this pull request May 18, 2026
…ncy quantization

CI on PR redis#384 surfaced this: with --reconnect-interval=1 and --test-time=5,
the trailing per-second bucket can hold very few ops (Count as low as 1).
'Accumulated Latency' is emitted with %lld (1 ms precision), so the
reported value can differ from the true sum by up to 1 ms — which means
the derived avg = accumulated/count can differ from the double-precision
'Average Latency' by up to (1 / count) ms.

The previous tolerance was max(0.01 ms, 2% relative). For Count=1 with a
sub-ms avg that's 0.01 ms; observed gap was 0.058 ms (0.391 vs 0.333),
well within the quantization bound but outside the static tolerance.

Add a third term `1.0 / count` so the quantization error is modeled
explicitly. For large-count buckets (the typical case in every existing
test) the existing tolerance still dominates, so no behavior change for
them.

Verified by replaying the exact failing workload locally:
  -t 4 -c 4 --test-time=5 --ratio=1:1 --reconnect-interval=1
  old tolerance: 3 violations across Sets/Gets/Totals bucket 5
  new tolerance: 0 violations

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fcostaoliveira and others added 3 commits May 18, 2026 12:58
…t-interval

RED-197205 reported that running with --reconnect-interval=1 produced a
final summary where Totals.Ops/sec, KB/sec, Hits/sec and Misses/sec all
rendered as 0.00 even though Count and Latency were correct.

Root cause (on v2.2.1): client_group::merge_run_stats() iterated *all*
clients including those that were prepare()-d but never reached
set_start_time(). Their zeroed m_start_time was factorial-averaged into
the aggregate, dragging it toward the epoch. With m_end_time correct
but m_start_time near zero, test_duration_usec became huge and every
`ops / test_duration_usec * 1000000` division collapsed to ~0.

Fix landed in 0228bac (PR redis#350): a run_stats::m_started atomic flag
plus has_started() guards in merge_run_stats() and
aggregate_inst_histogram() that skip never-started clients. RED-197205
was a side-effect fix, not the headline of that PR, so this test pins
down the invariant explicitly.

Verified end-to-end against the test's exact workload:

  v2.2.1 + 4t/4c --test-time=5 --ratio=1:1 --reconnect-interval=1
    -> Ops/sec=0.00, KB/sec=0.00, Misses/sec=0.00  (bug fires)

  master + same args
    -> Ops/sec=52320, KB/sec=3035, Misses/sec=26114  (fixed)

The bug only fires reliably with --test-time (not -n); with -n every
client runs to completion and reaches set_start_time, so the zero-state
merge path is never hit. _build_benchmark gains an optional test_time
parameter so the helper can express that distinction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ncy quantization

CI on PR redis#384 surfaced this: with --reconnect-interval=1 and --test-time=5,
the trailing per-second bucket can hold very few ops (Count as low as 1).
'Accumulated Latency' is emitted with %lld (1 ms precision), so the
reported value can differ from the true sum by up to 1 ms — which means
the derived avg = accumulated/count can differ from the double-precision
'Average Latency' by up to (1 / count) ms.

The previous tolerance was max(0.01 ms, 2% relative). For Count=1 with a
sub-ms avg that's 0.01 ms; observed gap was 0.058 ms (0.391 vs 0.333),
well within the quantization bound but outside the static tolerance.

Add a third term `1.0 / count` so the quantization error is modeled
explicitly. For large-count buckets (the typical case in every existing
test) the existing tolerance still dominates, so no behavior change for
them.

Verified by replaying the exact failing workload locally:
  -t 4 -c 4 --test-time=5 --ratio=1:1 --reconnect-interval=1
  old tolerance: 3 violations across Sets/Gets/Totals bucket 5
  new tolerance: 0 violations

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
memtier rejects --reconnect-interval together with --cluster-mode
("error: cluster mode dose not support reconnect-interval option"),
so the test exited non-zero on every cluster CI job. The RED-197205
regression is not cluster-specific — guard the test with env.isCluster()
so it only runs in single-endpoint configurations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@fcostaoliveira fcostaoliveira force-pushed the test/red-197205-reconnect-interval-rates branch from 930f554 to b90142c Compare May 18, 2026 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants