A Rust/pgrx PostgreSQL 17 extension that observes, predicts, and optionally enforces WAL-generation budgets by policy scope.
Hooks (executor + utility) and shared-memory state are installed from `_PG_init`. SQL functions live in the `pwb` schema. MVP status: `observe` / `shadow` / `reject` / `queue` modes, durable policies, scope classification by tenant → role → database → `application_name`, optional composite scopes, and exact backend-local WAL telemetry where PostgreSQL exposes `pgWalUsage`. Unsupported targets mark actual WAL telemetry unavailable. No cross-node coordination or replica-lag throttling yet.
`pg_wal_budget` requires preloading; `CREATE EXTENSION` alone only installs SQL objects.

```conf
# postgresql.conf
shared_preload_libraries = 'pg_wal_budget'
```

Restart PostgreSQL, then:

```sql
create extension pg_wal_budget;
select pwb.preload_status(); -- expect: preloaded
```

For local pgrx development:

```sh
cargo pgrx run pg17 --postgresql-conf shared_preload_libraries=pg_wal_budget
```

Create an observe-mode policy and inspect telemetry:

```sql
select pwb.create_policy(
  scope_kind => 'role',
  scope_value => current_user,
  wal_rate_bytes_per_sec => 1048576,
  wal_burst_bytes => 8388608,
  mode => 'observe',
  priority => 100
);
select * from pwb.counters();
select * from pwb.scope_stats();
select * from pwb.recent_decisions(100) order by timestamp_epoch_ms desc;
```

Promote a policy through the rollout stages once predictions look stable. Use `queue` when a backend should wait briefly for budget instead of returning an immediate error:

```sql
select pwb.set_policy_mode(1, 'shadow');
select pwb.update_policy(1, 2097152, 16777216);
select pwb.set_policy_mode(1, 'queue');
select pwb.set_policy_mode(1, 'reject');
select pwb.disable_policy(1);
```

- Modes: `off`, `observe`, `shadow`, `queue`, `reject`.
- `queue` blocks the backend before execution until enough WAL budget can be charged. If the predicted WAL bytes exceed the policy burst, the statement is rejected because it can never fit in the bucket.
- `queue` is best-effort throttling, not FIFO scheduling. Waiting backends do not reserve future budget; they wake independently and race to charge, so frequent smaller statements can delay larger queued statements.
- Queue waits sleep in bounded chunks and check PostgreSQL interrupts between chunks, so query cancellation is observed before the statement starts, with up to roughly 100 ms of sleep latency.
- For interactive workloads, size `wal_burst_bytes` to cover the largest statement that should be allowed to wait and set `wal_rate_bytes_per_sec` high enough for acceptable waits. Use `reject`, narrower scopes, or separate role/application policies when strict latency or fairness matters.
- Scope kinds: `tenant`, `role`, `database`, `application`, `composite`.
- Matching: highest `priority` wins; ties resolved by lowest `policy_id`.
- Tenant scope is trusted backend-local state; set via `pwb.set_tenant(...)` / `pwb.clear_tenant()`. Restricted to superusers and members of `pwb_tenant_setter`.
- Composite scopes are opt-in with `pwb.composite_scope_enabled`. Their canonical value is ordered as `tenant=...|role=...|database=...|application=...` with unavailable components omitted.

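
The matching and queue/reject rules above can be modelled in a few lines of Rust. This is an illustrative sketch, not the extension's actual code: the `Policy` struct, the `Decision` enum, and the `pick_policy` and `admit` functions are hypothetical names, and the wait estimate assumes the bucket refills linearly at `wal_rate_bytes_per_sec`. The sketch also does not charge the bucket; it only classifies the outcome.

```rust
use std::cmp::Reverse;

/// Hypothetical in-memory view of a policy. Field names mirror the
/// `pwb.create_policy` arguments, but the struct itself is illustrative.
#[derive(Debug, Clone, PartialEq)]
struct Policy {
    policy_id: i64,
    priority: i32,
    wal_rate_bytes_per_sec: u64,
    wal_burst_bytes: u64,
    queue: bool, // true models `queue` mode, false models `reject` mode
}

#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    /// Lower bound on the wait, assuming no other backend takes budget first.
    WaitAtLeastMs(u64),
    Reject,
}

/// Highest priority wins; ties go to the lowest policy_id.
fn pick_policy(matching: &[Policy]) -> Option<&Policy> {
    matching
        .iter()
        .min_by_key(|p| (Reverse(p.priority), p.policy_id))
}

/// Classify a statement predicted to write `predicted` bytes when the
/// bucket currently holds `available` bytes of budget.
fn admit(policy: &Policy, available: u64, predicted: u64) -> Decision {
    if predicted > policy.wal_burst_bytes {
        return Decision::Reject; // can never fit in the bucket
    }
    if predicted <= available {
        return Decision::Allow;
    }
    if !policy.queue || policy.wal_rate_bytes_per_sec == 0 {
        return Decision::Reject;
    }
    // Time for the missing budget to refill: ceil(deficit * 1000 / rate).
    let deficit = (predicted - available) as u128;
    let rate = policy.wal_rate_bytes_per_sec as u128;
    let ms = (deficit * 1000 + rate - 1) / rate;
    Decision::WaitAtLeastMs(ms as u64)
}
```

Because waiting backends race to charge rather than reserving budget, the estimated wait in this model is only a lower bound.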
The extension install creates three operational roles:

| Role | Purpose |
|---|---|
| `pwb_admin` | Manage policies, reset shared-memory stats/profiles, and use trusted tenant setters. |
| `pwb_monitor` | Read policies and operational telemetry. |
| `pwb_tenant_setter` | Set or clear trusted backend-local tenant scope. |

`pwb.version()` and `pwb.preload_status()` remain public. Policy mutation, detailed telemetry, and reset functions are revoked from public; grant the roles above to operational users instead of granting direct table access. If an install finds an existing unmarked `pwb_admin`, `pwb_monitor`, or `pwb_tenant_setter` role, it fails rather than adopting that role implicitly.
Runtime GUCs (SIGHUP):

| Setting | Default | Purpose |
|---|---|---|
| `pwb.enabled` | `on` | Enable admission hooks and accounting. |
| `pwb.fail_open` | `on` | Allow on internal classification/prediction/accounting failure. |
| `pwb.default_write_wal_bytes` | `16kB` | Fallback prediction for writes. |
| `pwb.default_utility_wal_bytes` | `1MB` | Fallback prediction for utility / COPY. |
| `pwb.max_prediction_bytes` | `1GB` | Upper bound on predictions. |
| `pwb.profile_ewma_alpha` | `0.5` | EWMA smoothing factor for learned query WAL profiles. |
| `pwb.composite_scope_enabled` | `off` | Classify statements by canonical composite scope when at least two scope components are available. |
| `pwb.predictor` | `profile_ewma` | Predictor strategy: `profile_ewma` or `statement_class_fallback`. |

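
As a minimal sketch of what `pwb.composite_scope_enabled` classifies on, the function below assembles a canonical composite value in the documented `tenant|role|database|application` order, omitting unavailable components and returning nothing when fewer than two are present. The function name `composite_scope` is hypothetical, and escaping of `|` or `=` inside component values is not addressed; the extension may handle both differently.

```rust
/// Build a canonical composite scope value, or `None` when fewer than two
/// components are available. Illustrative only, not the extension's code.
fn composite_scope(
    tenant: Option<&str>,
    role: Option<&str>,
    database: Option<&str>,
    application: Option<&str>,
) -> Option<String> {
    // Fixed component order: tenant, role, database, application.
    let components = [
        ("tenant", tenant),
        ("role", role),
        ("database", database),
        ("application", application),
    ];
    let parts: Vec<String> = components
        .iter()
        .filter_map(|(key, value)| value.map(|v| format!("{}={}", key, v)))
        .collect();
    if parts.len() >= 2 {
        Some(parts.join("|"))
    } else {
        None
    }
}
```

With a tenant of `acme`, a role of `app_rw`, no database, and an application of `api`, this sketch yields `tenant=acme|role=app_rw|application=api`.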
Lower pwb.profile_ewma_alpha values smooth predictions over more history; higher values react faster to recent WAL observations.
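
To make the effect of the smoothing factor concrete, here is a sketch of the conventional EWMA update, in which alpha weights the newest observation. It assumes the extension uses this standard form; the function name `ewma_update` is hypothetical.

```rust
/// Conventional EWMA update: `alpha` weights the newest observation and
/// `1 - alpha` weights the previous estimate. Assumed form, not confirmed
/// against the extension's source.
fn ewma_update(previous: f64, observed: f64, alpha: f64) -> f64 {
    let alpha = alpha.clamp(0.0, 1.0);
    alpha * observed + (1.0 - alpha) * previous
}
```

Starting from a 1000-byte profile and observing 2000 bytes, an alpha of 0.5 moves the estimate to 1500 bytes, while an alpha of 0.25 moves it only to 1250 bytes.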
Postmaster GUCs (restart required, sized into shared memory):

| Setting | Default | Purpose |
|---|---|---|
| `pwb.shmem_capacity` | `4096` | Legacy default capacity for shared-memory arrays. Changes require restart. |
| `pwb.recent_decision_capacity` | `-1` | Capacity for the recent-decision ring; `-1` inherits `pwb.shmem_capacity`. Changes require restart. |
| `pwb.profile_cache_capacity` | `-1` | Capacity for the shared-memory query profile cache; `-1` inherits `pwb.shmem_capacity`. Changes require restart. |
| `pwb.budget_bucket_capacity` | `-1` | Capacity for enforcement budget buckets; `-1` inherits `pwb.shmem_capacity`. Changes require restart. |

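
The inheritance rule for the three per-structure capacities can be sketched as a single resolution step. The function name `effective_capacity` is hypothetical, and the sketch treats only `-1` as "inherit"; how the extension validates other negative values is not stated here.

```rust
/// Resolve a per-structure capacity setting: -1 inherits the legacy
/// `pwb.shmem_capacity` value, anything else is used as given.
fn effective_capacity(specific: i32, shmem_capacity: i32) -> i32 {
    if specific == -1 {
        shmem_capacity
    } else {
        specific
    }
}
```

With the defaults shown in the table, all three structures resolve to 4096 entries; setting `pwb.profile_cache_capacity` to 16384 would change only the profile cache.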
Emergency disable:

```sql
alter system set pwb.enabled = off;
select pg_reload_conf();
```

To fully unload hooks and shared memory, remove `pg_wal_budget` from `shared_preload_libraries` and restart.

Version 0.2.1 updates the durable `pwb.policy` mode constraint so upgraded 0.2.0 installations can store `queue` policies. Upgrade with:

```sql
alter extension pg_wal_budget update to '0.2.1';
```

Before rolling back to older code or older extension SQL, change all `queue` policies to a legacy mode such as `reject`, `shadow`, `observe`, or `off`, then verify none remain:

```sql
update pwb.policy set mode = 'reject' where mode = 'queue';
select count(*) from pwb.policy where mode = 'queue';
```

Inspect and reset telemetry:

```sql
select * from pwb.counters();
select * from pwb.scope_stats();
select * from pwb.query_profiles();
select * from pwb.recent_decisions(100);
select pwb.reset_stats(); -- superuser or pwb_admin
select pwb.reset_profiles(); -- superuser or pwb_admin
```

Recent decisions expose query hashes, query IDs, and workload classifications; treat them as operational telemetry and restrict access in production.

- Exact per-backend WAL measurements are used for query profile updates and budget refund/debt reconciliation.
- If the target PostgreSQL/pgrx binding does not expose backend-local WAL usage, actual WAL measurement is unavailable and is not used to update profiles, refund, or charge enforcement buckets.
- Query profiles only update when an exact backend WAL measurement is available; fallback predictions stay important on unsupported targets.
- Shared-memory state is disposable and resets on PostgreSQL restart.
- Reject mode can produce false positives until predictions and fallback GUCs are tuned.
- Managed PostgreSQL providers often disallow custom native extensions and `shared_preload_libraries`.
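
The refund/debt reconciliation described in the list above can be pictured as a comparison between the bytes charged up front and the bytes actually measured. The sketch below is an assumption-laden model: the `Reconcile` enum and `reconcile` function are hypothetical names, and an unavailable measurement is represented as `None`.

```rust
/// Outcome of comparing the charged prediction with measured WAL bytes.
#[derive(Debug, PartialEq)]
enum Reconcile {
    /// The statement wrote less than predicted; return the difference.
    Refund(u64),
    /// The statement wrote more than predicted; charge the difference.
    Debt(u64),
    /// Prediction matched the measurement exactly.
    Exact,
    /// No exact backend-local measurement; leave the bucket untouched.
    Unavailable,
}

fn reconcile(predicted: u64, actual: Option<u64>) -> Reconcile {
    match actual {
        None => Reconcile::Unavailable,
        Some(a) if a < predicted => Reconcile::Refund(predicted - a),
        Some(a) if a > predicted => Reconcile::Debt(a - predicted),
        Some(_) => Reconcile::Exact,
    }
}
```

The `Unavailable` branch reflects the rule that unsupported targets neither refund nor charge enforcement buckets based on actual WAL.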

```sh
cargo check --no-default-features --features 'pg17 pg_test'
cargo pgrx regress pg17 --resetdb \
  --postgresql-conf shared_preload_libraries=pg_wal_budget \
  --postgresql-conf compute_query_id=on
cargo fmt --all
```

Run repeatable local workloads before enabling reject mode for real traffic:

```sh
cargo pgrx run pg17 --postgresql-conf shared_preload_libraries=pg_wal_budget
\i tests/workloads/calibration_summary.sql
```

The workload scripts report predicted WAL, actual WAL, absolute error, and error ratio for insert-heavy, HOT update, indexed update, wide-row update, COPY, and CREATE INDEX paths. Use those results to tune `pwb.default_write_wal_bytes` and `pwb.default_utility_wal_bytes`; do not treat first-run defaults as safe reject-mode thresholds.
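
For reading the calibration output, the two error metrics can be sketched as follows. The definitions are assumptions: absolute error is taken as the magnitude of predicted minus actual, and error ratio as that magnitude divided by actual WAL bytes. The workload scripts may define the ratio differently, and the function names here are hypothetical.

```rust
/// Absolute prediction error in bytes.
fn abs_error(predicted: u64, actual: u64) -> u64 {
    predicted.abs_diff(actual)
}

/// Error relative to actual WAL bytes (assumed definition). Returns `None`
/// when no WAL was written, since the ratio is undefined in that case.
fn error_ratio(predicted: u64, actual: u64) -> Option<f64> {
    if actual == 0 {
        None
    } else {
        Some(abs_error(predicted, actual) as f64 / actual as f64)
    }
}
```

Under these definitions, a 16384-byte prediction against 32768 bytes of actual WAL has an absolute error of 16384 bytes and an error ratio of 0.5, which suggests the fallback for that path is set too low.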