Problem Description
The `pgrst_db_pool_available` metric can drift into negative values during network instability or connection pool disruptions. We observed values as low as -233 with `pgrst_db_pool_max: 80`, which is clearly incorrect.
Root Cause
The race condition is in `src/PostgREST/Metrics.hs:35-41`:

```haskell
(HasqlPoolObs (SQL.ConnectionObservation _ status)) -> case status of
  SQL.ReadyForUseConnectionStatus -> do
    incGauge poolAvailable
  SQL.InUseConnectionStatus -> do
    decGauge poolAvailable
  SQL.TerminatedConnectionStatus _ -> do
    decGauge poolAvailable
  SQL.ConnectingConnectionStatus -> pure ()
```

The `incGauge` and `decGauge` operations from prometheus-client are not atomic. During network instability:
- Multiple connections transition states simultaneously.
- `decGauge` operations can occur before their corresponding `incGauge` operations.
- Connections that terminate before ever becoming "ready" decrement without having incremented.
- The gauge drifts negative over time.
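The terminate-before-ready case above can be made harmless by clamping the gauge at zero inside a single atomic update. The following is a minimal sketch using an STM `TVar` as a simplified stand-in for the prometheus-client `Gauge`; the names `AtomicGauge`, `incGauge`, and `decGauge` are illustrative, not PostgREST's or prometheus-client's actual API.

```haskell
import Control.Concurrent.STM

-- Hypothetical gauge backed by a TVar; every update is a single
-- atomic read-modify-write, so concurrent transitions cannot interleave.
newtype AtomicGauge = AtomicGauge (TVar Int)

newAtomicGauge :: IO AtomicGauge
newAtomicGauge = AtomicGauge <$> newTVarIO 0

incGauge :: AtomicGauge -> IO ()
incGauge (AtomicGauge v) = atomically (modifyTVar' v (+ 1))

-- Clamped decrement: a connection that terminates before ever
-- becoming ready can no longer push the gauge below zero.
decGauge :: AtomicGauge -> IO ()
decGauge (AtomicGauge v) = atomically (modifyTVar' v (max 0 . subtract 1))

readGauge :: AtomicGauge -> IO Int
readGauge (AtomicGauge v) = readTVarIO v

main :: IO ()
main = do
  g <- newAtomicGauge
  decGauge g             -- terminate-before-ready: stays at 0, not -1
  incGauge g
  incGauge g
  readGauge g >>= print  -- prints 2
```

Clamping hides the out-of-order event rather than modeling it, but it guarantees the exported value never goes negative between restarts.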
Impact
- Metrics are unreliable: monitoring/alerting based on `pool_available` produces false negatives.
- No actual pool impact: the underlying pool works correctly; only the metric is corrupted.
- Persists until restart: the counter never self-corrects; it requires a PostgREST restart.
Environment
- PostgREST version: v12.2.12
- `prometheus-client` constraint: `>= 1.1.1 && < 1.2.0`
- `hasql-pool` constraint: `>= 1.0.1 && < 1.1`
- Trigger: network instability during a Google GCE incident
Suggested Fix
The gauge updates need to be atomic. Options include:
- Use STM: wrap the gauge in a `TVar` for atomic updates.
- Use atomic operations: if `prometheus-client` supports atomic inc/dec.
- Track an absolute count: calculate available as (total - in_use) instead of incrementing and decrementing.
- Add a mutex/lock: protect gauge updates with a lock (less ideal for performance).
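To illustrate the absolute-count option: instead of mutating a counter on every transition, track which connection ids are currently in use and derive availability as pool max minus the in-use count. This is a hedged sketch only; the types `ConnId`, `Status`, and `PoolState` and the `observe` function are hypothetical names, not PostgREST's or hasql-pool's API.

```haskell
import Data.IORef
import qualified Data.Set as Set

type ConnId = Int

data Status = Ready | InUse | Terminated
  deriving (Eq, Show)

data PoolState = PoolState
  { poolMax :: Int
  , inUse   :: Set.Set ConnId
  }

-- Record a connection's state transition. Deleting an id that was
-- never inserted is a no-op, so out-of-order or terminate-before-ready
-- events cannot corrupt the derived metric.
observe :: IORef PoolState -> ConnId -> Status -> IO ()
observe ref cid status = atomicModifyIORef' ref $ \st ->
  let st' = case status of
        InUse      -> st { inUse = Set.insert cid (inUse st) }
        Ready      -> st { inUse = Set.delete cid (inUse st) }
        Terminated -> st { inUse = Set.delete cid (inUse st) }
  in (st', ())

-- Availability is computed from absolute state, never decremented.
available :: IORef PoolState -> IO Int
available ref = do
  st <- readIORef ref
  pure (poolMax st - Set.size (inUse st))

main :: IO ()
main = do
  ref <- newIORef (PoolState 80 Set.empty)
  observe ref 1 InUse
  observe ref 2 Terminated  -- terminated before ever becoming ready
  observe ref 1 Ready
  available ref >>= print   -- prints 80, never negative
```

Because the metric is a pure function of the tracked state, it self-corrects as connections settle, rather than accumulating drift that only a restart clears.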
Workaround
Restart PostgREST to reset the counter to correct values.