Generate synthetic PCP v3 archives from declarative YAML workload
profiles — no running pmcd, no root access, no real hardware required.
Designed for testing PCP-based monitoring pipelines, reproducing incidents at arbitrary historical timestamps, and generating load-shaped datasets for analysis tooling.
Prerequisites: PCP (python3-pcp / cpmapi) is a hard dependency — it is
required at import time for metric type and unit constants, not just for archive
writing. All tests (unit, integration, and E2E) need PCP's Python bindings available.
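Because the bindings are needed at import time, it can help to confirm that the interpreter you are about to use can actually see `cpmapi` before running anything else. The following is a minimal sketch using only the standard library; the helper name `pcp_bindings_available` is hypothetical and is not part of pmlogsynth.

```python
import importlib.util
import sys


def pcp_bindings_available() -> bool:
    """Return True if this interpreter can locate the cpmapi module.

    find_spec only searches for the module; it does not import it.
    """
    return importlib.util.find_spec("cpmapi") is not None


if __name__ == "__main__":
    if pcp_bindings_available():
        print(f"cpmapi found for {sys.executable}")
    else:
        # Hint only: the fix depends on the platform, see the install notes.
        print(f"cpmapi NOT found for {sys.executable}")
```

Run it with the same Python you intend to use for pmlogsynth; a different interpreter may give a different answer.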
Linux (Debian/Ubuntu):
sudo apt-get install pcp python3-pcp
pip install pcp-pmlogsynth

Linux (RHEL/Fedora):
sudo dnf install pcp python3-pcp
pip install pcp-pmlogsynth

macOS (Homebrew): PCP compiles its Python bindings against Homebrew's Python.
The bindings are only available to that specific Python — if you run pmlogsynth
or its tests from a different Python (system, pyenv, conda), import cpmapi will fail.
Use the provided setup script, which handles venv creation correctly on all platforms:
./setup-venv.sh
source .venv/bin/activate

From source:
git clone https://github.com/tallpsmith/pmlogsynth
cd pmlogsynth
./setup-venv.sh
source .venv/bin/activate

docs/spike.yml is ready to use, or write your own:
# docs/spike.yml
meta:
  hostname: demo-host
  timezone: UTC
  duration: 10m
  interval: 60
host:
  profile: generic-small
phases:
  - name: baseline
    duration: 5m
    cpu:
      utilization: 0.15
  - name: spike
    duration: 5m
    cpu:
      utilization: 0.90

pmlogsynth --validate docs/spike.yml
# Exit 0 = valid, Exit 1 = error (stderr shows what's wrong)

Note: repeat: daily cannot be combined with other phases; validation will reject it.
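The exit-code contract above (0 for valid, 1 for error, details on stderr) is easy to wrap when validation is part of a script or CI job. This is a rough sketch, not part of pmlogsynth: the function name `validate_profile` and its `cmd` parameter are hypothetical, and the only pmlogsynth behaviour it relies on is the `--validate` flag and exit codes described above.

```python
import subprocess
import sys
from typing import Sequence, Tuple


def validate_profile(
    path: str,
    cmd: Sequence[str] = ("pmlogsynth", "--validate"),
) -> Tuple[bool, str]:
    """Run the validator on `path` and return (is_valid, stderr_text).

    `cmd` is overridable so the wrapper can be exercised without
    pmlogsynth installed (an assumption made for testability).
    """
    result = subprocess.run([*cmd, path], capture_output=True, text=True)
    # Exit 0 means the profile is valid; anything else is treated as an error.
    return result.returncode == 0, result.stderr.strip()
```

With pmlogsynth on the PATH, `validate_profile("docs/spike.yml")` would return a boolean plus whatever the validator wrote to stderr.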
pmlogsynth -o ./generated-archives/spike docs/spike.yml
# Creates: generated-archives/spike.0 spike.index spike.meta

Note: generated-archives/ is gitignored, so it is a safe scratch space for locally generated archives.
pmlogcheck ./generated-archives/spike
pmval -a ./generated-archives/spike kernel.all.cpu.user
pmrep -a ./generated-archives/spike -o csv kernel.all.cpu.user mem.util.used

pmlogsynth --list-profiles   # show hardware profiles
pmlogsynth --list-metrics    # show all producible PCP metrics
pmlogsynth --show-schema     # dump the full profile schema (for AI agents)

If you're using Claude Code with this repo checked out, two built-in skills can generate YAML profiles from plain-English descriptions, validate them, and generate the actual PCP archives in one step:
Single-host workload profiles — just describe the scenario:
> simulate a 24-hour web server with overnight quiet, morning ramp, and daytime peak
> create a 1-hour archive of a memory-constrained host under heavy disk I/O
> take docs/spike.yml and add memory pressure during the spike phase
Fleet profiles (multiple hosts with bad actors) — describe the fleet:
> generate a fleet of 20 web servers where 3 have CPU saturation problems
> I need a 50-host database cluster on memory-optimized hardware with some hosts showing memory pressure and disk thrashing
> create a small 5-host dev cluster with normal web traffic for an hour
The skills bundle the full schema as context, validate the output, and run
pmlogsynth to generate the PCP archives — ready to inspect with pmstat,
pmval, or pmrep. All output goes to generated-archives/.
| Name | CPUs | RAM | Disks | NICs |
|---|---|---|---|---|
| generic-small | 2 | 8 GB | 1× NVMe | 1× 10GbE |
| generic-medium | 4 | 16 GB | 1× NVMe | 1× 10GbE |
| generic-large | 8 | 32 GB | 2× NVMe | 1× 10GbE |
| generic-xlarge | 16 | 64 GB | 2× NVMe | 2× 10GbE |
| compute-optimized | 8 | 16 GB | 1× NVMe | 1× 10GbE |
| memory-optimized | 4 | 64 GB | 1× NVMe | 1× 10GbE |
| storage-optimized | 4 | 16 GB | 4× HDD | 1× 10GbE |
Use host.profile: <name> in your profile, or add your own profiles to
~/.pcp/pmlogsynth/profiles/.
Full YAML schema documentation — all fields, types, defaults, valid ranges,
and constraints — is in docs/profile-format.md.
A complete, ready-to-run example covering all four stressor domains is in
docs/complete-example.yml.
The meta.start field accepts a relative offset in addition to absolute
ISO 8601 timestamps. A relative offset is a PCP interval string prefixed with
-, resolved against the clock at invocation time:
meta:
  # Use one of the following forms:
  start: -90m     # 90 minutes ago
  start: -2h      # 2 hours ago
  start: -1h30m   # 1 hour 30 minutes ago
  start: -3d      # 3 days ago
  start: -2days   # 2 days ago; long-form PCP interval strings are accepted

This is useful for replaying realistic-looking archives anchored to "now",
for example a simulated spike that started an hour ago. Positive offsets
(+30m) and a bare - are rejected with a descriptive error.
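To make the resolution rule concrete, here is a small sketch of how a relative offset could be turned into an absolute start time. It is an illustration only, not pmlogsynth's parser: the function name `resolve_relative_start` and the unit table are assumptions, and the real PCP interval grammar accepts more spellings than the handful listed here.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed subset of PCP interval units, mapped to seconds.
_UNIT_SECONDS = {
    "s": 1, "sec": 1, "secs": 1, "second": 1, "seconds": 1,
    "m": 60, "min": 60, "mins": 60, "minute": 60, "minutes": 60,
    "h": 3600, "hr": 3600, "hour": 3600, "hours": 3600,
    "d": 86400, "day": 86400, "days": 86400,
}
_TERM = re.compile(r"(\d+)([a-z]+)")


def resolve_relative_start(offset: str, now: datetime) -> datetime:
    """Resolve an offset such as '-1h30m' against `now`."""
    if not offset.startswith("-"):
        raise ValueError(f"relative start must begin with '-': {offset!r}")
    body = offset[1:].strip().lower()
    if not body:
        raise ValueError("a bare '-' is not a valid offset")

    total_seconds = 0
    pos = 0
    # Consume consecutive <number><unit> terms, e.g. '1h' then '30m'.
    while pos < len(body):
        match = _TERM.match(body, pos)
        if match is None or match.group(2) not in _UNIT_SECONDS:
            raise ValueError(f"cannot parse offset: {offset!r}")
        total_seconds += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    return now - timedelta(seconds=total_seconds)


if __name__ == "__main__":
    print(resolve_relative_start("-1h30m", datetime.now(timezone.utc)))
```

Passing `now` explicitly keeps the function deterministic, which matters if you want to check how a given offset resolves in a test.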
Generate a fleet of PCP archives — one per host — from a single self-contained fleet profile. All workload definitions are inline — no external files needed. Each host gets per-host stressor jitter for realistic variation across the fleet.
# Preview host assignments without generating archives
pmlogsynth fleet --dry-run fleet-profile.yml
# Generate a 20-host fleet with deterministic assignment
pmlogsynth fleet --seed 42 -o ./generated-archives/my-fleet fleet-profile.yml

The output directory contains one PCP archive per host plus a fleet.manifest
YAML file recording hostnames, roles, jitter factors, and the seed.
See docs/profile-format.md for
the full fleet YAML schema.
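The idea behind `--seed` and per-host jitter can be pictured with a short sketch. This is not pmlogsynth's algorithm: the function `assign_fleet`, the role labels, and the uniform jitter range are all hypothetical, chosen only to show why a fixed seed gives the same host assignments on every run.

```python
import random
from typing import Dict, List


def assign_fleet(
    hostnames: List[str],
    bad_actor_count: int,
    seed: int,
    jitter: float = 0.1,
) -> Dict[str, dict]:
    """Pick bad actors and a per-host jitter factor reproducibly."""
    # A private RNG seeded once: same seed and inputs give the same plan.
    rng = random.Random(seed)
    bad_actors = set(rng.sample(hostnames, bad_actor_count))

    plan = {}
    for name in hostnames:
        plan[name] = {
            # Hypothetical role labels, for illustration only.
            "role": "bad-actor" if name in bad_actors else "normal",
            # Multiplier applied to stressor values, within +/- jitter of 1.0.
            "jitter_factor": round(rng.uniform(1 - jitter, 1 + jitter), 3),
        }
    return plan


if __name__ == "__main__":
    hosts = [f"web-{i:02d}" for i in range(1, 21)]
    for host, info in assign_fleet(hosts, bad_actor_count=3, seed=42).items():
        print(host, info)
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the result independent of any other random calls in the program.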
pmlogsynth can produce 71 PCP metrics; list them with pmlogsynth --list-metrics or see man pmlogsynth.
The full CLI reference is in man pmlogsynth.
See CONTRIBUTING.md for dev setup, test structure, and PR conventions.
