Closed
Description
Consolidation worker leaks ~300 MiB/min when processing a large backlog of pending consolidations, eventually causing OOM on the host.
Environment
- Image: ghcr.io/vectorize-io/hindsight:0.4.13-slim
- VM: 8 GiB RAM (n2-standard-2, GCP)
- Database: Cloud SQL PostgreSQL 18
- LLM: Claude Opus 4.5 via LiteLLM proxy -> AWS Bedrock
- Embeddings: litellm provider with bedrock/amazon.titan-embed-text-v2:0
Steps to Reproduce
- Have a bank with ~7,800 memory_units and ~6,600 pending consolidations
- Start Hindsight with consolidation workers active
- Monitor container memory usage over time
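For the monitoring step, a minimal sketch that polls `docker stats` and prints container memory in MiB. The container name `hindsight` and the 30-second interval are assumptions for illustration and are not taken from the report. The parser expects the binary-unit `MemUsage` format that `docker stats` prints (for example `5.6GiB / 7.77GiB`).

```python
import re
import subprocess
import time

# Conversion factors from docker's binary units to MiB.
_TO_MIB = {"B": 1 / 2**20, "KiB": 1 / 2**10, "MiB": 1.0, "GiB": 2.0**10, "TiB": 2.0**20}


def parse_mem_usage(mem_usage: str) -> float:
    """Return the 'used' half of a docker stats MemUsage field, in MiB."""
    used = mem_usage.split("/")[0].strip()
    match = re.fullmatch(r"([\d.]+)\s*([KMGT]iB|B)", used)
    if match is None:
        raise ValueError(f"unrecognized MemUsage value: {mem_usage!r}")
    value, unit = match.groups()
    return float(value) * _TO_MIB[unit]


def sample(container: str) -> float:
    """Take one memory reading (MiB) for a running container."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.MemUsage}}", container],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_mem_usage(result.stdout.strip())


def monitor(container: str = "hindsight", interval_s: float = 30.0, samples: int = 20) -> list:
    """Print and collect `samples` readings; the container name is hypothetical."""
    readings = []
    for _ in range(samples):
        mib = sample(container)
        readings.append(mib)
        print(f"{time.strftime('%H:%M:%S')} {mib:.0f} MiB", flush=True)
        time.sleep(interval_s)
    return readings
```

Calling `monitor()` on the Docker host while consolidation is running should produce a time series comparable to the table under Observed Behavior.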
Observed Behavior
Memory grows linearly at ~300 MiB/min regardless of concurrency settings:
| Time | Container RAM | System Available |
|---|---|---|
| T+0 | 5.0 GiB | 2.1 GiB |
| T+2 min | 5.6 GiB | 1.4 GiB |
| T+4 min | 7.7 GiB | 33 MiB (OOM) |
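The growth rate can be derived directly from the table. The sketch below computes per-interval rates in MiB/min from the three Container RAM readings; the helper name `growth_rates` is illustrative only.

```python
# (minutes since start, container RAM in GiB), copied from the table above.
SAMPLES = [(0, 5.0), (2, 5.6), (4, 7.7)]


def growth_rates(samples):
    """Return the MiB/min growth rate for each pair of consecutive samples."""
    rates = []
    for (t0, gib0), (t1, gib1) in zip(samples, samples[1:]):
        rates.append((gib1 - gib0) * 1024 / (t1 - t0))
    return rates


for (t0, _), (t1, _), rate in zip(SAMPLES, SAMPLES[1:], growth_rates(SAMPLES)):
    print(f"T+{t0} to T+{t1} min: {rate:.0f} MiB/min")
```

With these three samples, the first interval works out to roughly 300 MiB/min and the second is steeper, so a finer-grained series would be needed to confirm the shape of the curve.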
Worker stats during the leak:
Tuning Attempted (No Effect)
Reduced all concurrency settings - leak rate unchanged:
| Setting | Original | Tuned |
|---|---|---|
| LLM_MAX_CONCURRENT | 16 | 4 |
| DB_POOL_MAX_SIZE | 50 | 10 |
| DB_POOL_MIN_SIZE | 10 | 2 |
Workaround
Set a Docker memory limit, together with a restart policy, so the container is killed and restarted before it takes down the host.
With a 6 GiB limit, the container is killed at 6 GiB, Docker restarts it, and consolidation resumes from where it left off. This allows the backlog to be processed across multiple restart cycles while keeping the host responsive.
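One way to express such a limit is with the `--memory` and `--restart` flags of `docker run`. The sketch below only builds the command line. The container name and the `unless-stopped` policy are assumptions, and a real deployment would also need its environment variables, ports, and volumes, which are omitted here.

```python
import shlex


def docker_run_command(
    image: str,
    memory_limit: str = "6g",                 # hard cap; the kernel OOM-kills the container at this size
    restart_policy: str = "unless-stopped",   # assumed policy; "always" or "on-failure" would also restart it
    name: str = "hindsight",                  # hypothetical container name
) -> list:
    """Build a `docker run` argument list with a memory cap and restart policy."""
    return [
        "docker", "run", "--detach",
        "--name", name,
        "--memory", memory_limit,
        "--restart", restart_policy,
        image,
    ]


cmd = docker_run_command("ghcr.io/vectorize-io/hindsight:0.4.13-slim")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a host with Docker installed.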
Expected Behavior
Memory should plateau during consolidation, not grow linearly. The consolidation batch should release memory after processing each chunk of memories.
Additional Context
- The large backlog was created after a one-time embedding dimension migration (384-dim to 1024-dim)
- Normal operation (small retains/recalls) does not exhibit the leak - only sustained batch consolidation
- The full (non-slim) image has a separate memory issue: it loads PyTorch CUDA (~5-6 GiB) even on CPU-only VMs, but that is a different problem from this leak