refactor: optimize Docker Compose with YAML anchors and aliases#197
refactor: optimize Docker Compose with YAML anchors and aliases#197
Conversation
- Reduce configuration repetition by ~85% using YAML anchors (&) and aliases (*) - Extract common patterns into reusable anchors: * x-common-config: shared dependencies, env files, and networking * x-huggingface-cache: HF cache environment variables (~6 vars per service) * x-auth-config: authentication configuration (~8 vars per service) * x-embedding-config: embedding model settings (~4 vars per service) * x-reranker-config: reranker settings (~7 vars per service) * x-common-volumes & x-indexer-volumes: volume mount patterns - Eliminate ~200+ lines of repetitive environment variable declarations - Improve maintainability with single source of truth for shared configs - Maintain full functionality across all services (validated with deployment test) Files optimized: - docker-compose.yml: 8 services now use shared anchors - docker-compose.openlit.yml: health check dependency pattern - docker-compose-bindmount-checkout.yml: working_dir and common configs
🤖 Augment PR SummarySummary: Refactors the project’s Docker Compose configurations to reduce repetition by introducing reusable YAML anchors/aliases. Changes:
Technical Notes: Uses Compose extension fields ( 🤖 Was this summary useful? React with 👍 or 👎 |
docker-compose.yml
Outdated
| environment: | ||
| - LLAMA_ARG_MODEL=/models/model.gguf | ||
| - LLAMA_ARG_CTX_SIZE=8192 | ||
| - LLAMA_ARG_CTX_SIZE=4096 |
There was a problem hiding this comment.
llamacpp now sets LLAMA_ARG_CTX_SIZE=4096 (was 8192) and the command forces --n-gpu-layers 0, which is a functional behavior change beyond a pure YAML refactor. Can you confirm this is intentional and captured in the PR description since it can affect output quality/perf/memory?
🤖 Was this useful? React with 👍 or 👎
There was a problem hiding this comment.
I changed to default 8192 and allow override via .env file.
| - ./models:/models:ro | ||
| command: ["--model", "/models/model.gguf", "--host", "0.0.0.0", "--port", "8080", "--no-warmup"] | ||
| command: [ "--model", "/models/model.gguf", "--host", "0.0.0.0", "--port", "8080", "--no-warmup", "--n-gpu-layers", "0" ] | ||
| deploy: |
There was a problem hiding this comment.
- Add LLAMA_ARG_CTX_SIZE environment variable support in llamacpp service - Increase default context size from 4096 to 8192 tokens for better performance - Allow overriding via .env file for different deployment scenarios - Maintains backward compatibility with existing setups
Files optimized: