A Docker Compose setup that runs a k6 test which ramps virtual users up in steps, holds, then ramps down, while sending all metrics (including errors and timeouts) to InfluxDB. Grafana auto-loads a complete dashboard.
- `index.js`: k6 test. Reads all settings from `.env`.
  - Sends `GET` to `URL` with a cache-buster query to avoid CDN/cache skew.
  - Ramps VUs up in `STEP_VUS` increments from `START_VUS` to `MAX_VUS`, holds each step, then mirrors down and holds again, then to 0.
  - Emits built-in metrics (latency, RPS, error rate, TTFB, VUs, iterations, data in/out).
  - Adds extra counters: `http_5xx`, `http_4xx`, `http_504` (gateway timeouts), `http_status_0` (network errors / request timeouts).
  - Enforces SLO thresholds: p95 `< SLO_P95_MS`, p99 `< SLO_P99_MS`, error rate `< SLO_ERR_RATE`.
- `docker-compose.yml`: starts InfluxDB, Grafana, and runs k6 (streaming to InfluxDB).
- Provisioning: Grafana auto-creates the InfluxDB data source and auto-loads the dashboard:
  - `provisioning/datasources/datasource.yml`
  - `provisioning/dashboards/dashboards.yml`
- Dashboard: `dashboards/k6-final.json` shows everything you need (no manual edits).
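
The stepped ramp described for `index.js` can be sketched as a small function that produces a k6 `stages` array. This is only an illustration, not the project's actual code: `buildStages` and its parameter names are hypothetical, and it assumes the top step is the largest value reachable from `START_VUS` in `STEP_VUS` increments without exceeding `MAX_VUS`.

```javascript
// Hypothetical helper (not taken from index.js): builds a k6 `stages` array
// for a stepped ramp up, a mirrored ramp down, and a final ramp to 0.
function buildStages({ startVus, stepVus, maxVus, rampUp, holdUp, rampDown, holdDown }) {
  if (stepVus <= 0) throw new Error('stepVus must be greater than 0');

  // Held VU levels on the way up, e.g. 25, 50, 75.
  const steps = [];
  for (let v = startVus; v <= maxVus; v += stepVus) steps.push(v);

  const stages = [];
  for (const v of steps) {
    stages.push({ duration: rampUp, target: v }); // ramp to the step
    stages.push({ duration: holdUp, target: v }); // hold at the step
  }
  // Mirror down: every step except the peak, in reverse order.
  for (const v of steps.slice(0, -1).reverse()) {
    stages.push({ duration: rampDown, target: v });
    stages.push({ duration: holdDown, target: v });
  }
  // Finish at 0 VUs unless the profile already ends there.
  if (stages.length === 0 || stages[stages.length - 1].target !== 0) {
    stages.push({ duration: rampDown, target: 0 });
  }
  return stages;
}

// In a k6 script this could feed `options.stages`, for example:
// export const options = { stages: buildStages({ startVus: Number(__ENV.START_VUS), /* ... */ }) };
```

With `startVus: 25`, `stepVus: 25`, `maxVus: 75`, the targets come out as 25, 25, 50, 50, 75, 75, 50, 50, 25, 25, 0.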
To run the stack:

    # from the project root
    docker compose up --abort-on-container-exit

- Grafana: http://localhost:3000 (user: `admin`, pass: `admin`)
- Dashboard loads automatically: “k6 — Final Load Test (Errors, Timeouts, Percentiles)”

To change load or thresholds, edit `.env`, then:

    docker compose down
    docker compose up --abort-on-container-exit

Target

    URL=...            # full URL to hit (your QA/prod-like endpoint)
    LABEL=...          # short label for Grafana legends (e.g., /01/... (QA))

Ramp profile

    STEP_VUS=25        # VU step size (e.g., 25 or 50)
    START_VUS=25       # first held step (0 allowed)
    MAX_VUS=500        # highest step to test
    RAMP_UP=30s        # time to ramp each step up
    HOLD_UP=30s        # hold at each step on the way up
    RAMP_DOWN=30s      # time to ramp each step down
    HOLD_DOWN=30s      # hold at each step on the way down
    SLEEP_SEC=1        # per-VU think time between requests
    TIMEOUT=30s        # per-request timeout

SLO thresholds (define “no impact”)

    SLO_P95_MS=500     # 95% of requests must be < 500 ms
    SLO_P99_MS=900     # 99% of requests must be < 900 ms
    SLO_ERR_RATE=0.01  # error rate must be < 1%

Capacity rule: the highest held step where all SLOs pass = max concurrent users without impact for that environment.
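
As a rough illustration of how the SLO values could become k6 thresholds, the sketch below maps them onto k6's threshold expression syntax. `buildThresholds` is a hypothetical helper, and the choice of the built-in `http_req_duration` and `http_req_failed` metrics is an assumption; the actual `index.js` may attach its thresholds to other metrics.

```javascript
// Hypothetical helper: turns SLO settings into a k6 `thresholds` object.
// Assumes latency is judged on http_req_duration and errors on http_req_failed.
function buildThresholds(env) {
  const p95 = Number(env.SLO_P95_MS);
  const p99 = Number(env.SLO_P99_MS);
  const errRate = Number(env.SLO_ERR_RATE);
  if ([p95, p99, errRate].some(Number.isNaN)) {
    throw new Error('SLO_P95_MS, SLO_P99_MS and SLO_ERR_RATE must be numeric');
  }
  return {
    // Percentile thresholds are in milliseconds, matching k6's duration metrics.
    http_req_duration: [`p(95)<${p95}`, `p(99)<${p99}`],
    // Rate metric: fraction of failed requests, between 0 and 1.
    http_req_failed: [`rate<${errRate}`],
  };
}

// In a k6 script: export const options = { thresholds: buildThresholds(__ENV) };
```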
- Virtual Users (VUs)
- Requests/sec (RPS)
- Error rate %
- 4xx & 5xx per second
- 504 per second (gateway timeouts)
- status=0 per second (network errors / request timeouts)
- Latency p95 / p99 (ms)
- TTFB p95 (ms) (`http_req_waiting`)
- Latency breakdown (blocked, connecting, TLS, sending, waiting, receiving)
- Status code breakdown (stacked)
- Iterations/sec
- Data in/out (bytes/sec)
- Top-line stats: Max VUs, Avg RPS, Total requests, Total 5xx/504/status=0, Checks pass %
Filters at top: Scenario (default main) and Request Name (uses your LABEL).
Time range: set to Last 15 minutes (or your test duration).
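
The 4xx, 5xx, 504, and status=0 panels depend on the extra counters emitted by the test. One plausible way to decide which counters a response should increment is sketched below. `countersFor` is a hypothetical name, and the assumption that a 504 counts toward both `http_5xx` and `http_504` is not confirmed by the project.

```javascript
// Hypothetical helper: maps an HTTP status to the custom counter names to bump.
// k6 reports status 0 when no HTTP response arrived (network error or timeout).
function countersFor(status) {
  const names = [];
  if (status === 0) names.push('http_status_0');
  if (status >= 400 && status < 500) names.push('http_4xx');
  if (status >= 500 && status < 600) names.push('http_5xx');
  if (status === 504) names.push('http_504'); // assumed to count in addition to http_5xx
  return names;
}

// In a k6 script, each name would correspond to a Counter from 'k6/metrics',
// and the default function would call `.add(1)` on each matching counter.
```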
- A step passes if:
  - p95 < `SLO_P95_MS`
  - p99 < `SLO_P99_MS`
  - error rate < `SLO_ERR_RATE`
- When steps start failing (p95/p99 spike; 5xx / 504 / status=0 rise; RPS plateaus), you’ve reached capacity.
- Report the last passing step as max concurrent users without impact.
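
The pass/fail reading above can be expressed as a small check over per-step results read off the dashboard. This is a sketch with hypothetical names (`stepPasses`, `maxVusWithoutImpact`) and a hypothetical result shape. It uses strict `<` comparisons to match the thresholds as configured in `.env`, and it stops at the first failing step rather than crediting a later step that happens to pass.

```javascript
// Hypothetical result shape: { vus, p95Ms, p99Ms, errRate } per held step.
function stepPasses(step, slo) {
  return step.p95Ms < slo.p95Ms && step.p99Ms < slo.p99Ms && step.errRate < slo.errRate;
}

// Returns the VU count of the last step that passes before the first failure,
// or null if even the lowest step fails.
function maxVusWithoutImpact(steps, slo) {
  const ordered = [...steps].sort((a, b) => a.vus - b.vus);
  let best = null;
  for (const step of ordered) {
    if (!stepPasses(step, slo)) break;
    best = step.vus;
  }
  return best;
}

// Example with made-up numbers:
const exampleSlo = { p95Ms: 500, p99Ms: 900, errRate: 0.01 };
const exampleSteps = [
  { vus: 25, p95Ms: 200, p99Ms: 400, errRate: 0 },
  { vus: 50, p95Ms: 450, p99Ms: 850, errRate: 0.005 },
  { vus: 75, p95Ms: 700, p99Ms: 1200, errRate: 0.02 },
];
console.log(maxVusWithoutImpact(exampleSteps, exampleSlo)); // 50
```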
- Dashboard empty: ensure the time range covers the run and the containers are healthy (`docker ps`).
- No metrics: verify the k6 environment points to Influx (`K6_OUT=influxdb=http://influxdb:8086/k6` is set in the compose file).
- InfluxQL errors: intervals must not be quoted; queries use `time($__interval)`; derivatives use a duration (e.g., `1s`).
- Gateway timeouts: you’ll see 504, increased status=0, rising p95/p99, and often RPS flattening.