Docker Configuration

Dwi Elfianto edited this page Dec 6, 2025 · 3 revisions

This page describes the Docker daemon and client configuration files that customize CLI output, enable GPU support, and add user namespace isolation for this orchestration framework.

Overview

The /srv/compose/docker directory contains essential configuration files for:

  • Docker Daemon - System-level Docker engine configuration
  • Docker Client - User-facing CLI configuration
  • User Namespaces - Security isolation mappings
  • Network Definitions - Isolated network configurations

Configuration Files

docker/config.json - Client Configuration

Location: ~/.docker/config.json (installed per-user)

Purpose: Customizes Docker CLI output formats for better readability

Configuration:

{
    "psFormat": "table {{.Names}}\t{{.Status}}\t{{.Ports}}",
    "imagesFormat": "table {{.ID}}\t{{.Size}}\t{{.Repository}}:{{.Tag}}"
}

Effects:

  • docker ps shows: Container names, status, and port mappings
  • docker images shows: Image ID, size, repository, and tag

Installation:

cp /srv/compose/docker/config.json ~/.docker/config.json

docker/system/daemon.json - Daemon Configuration

Location: /etc/docker/daemon.json (system-wide, requires sudo)

Purpose: Configures Docker engine with GPU support and security features

Configuration:

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "userns-remap": "mipan",
    "ipv6": false
}

Key Features:

  1. NVIDIA Runtime - Sets nvidia as default runtime for GPU access
  2. User Namespace Remapping - Maps container UIDs/GIDs to the unprivileged ID ranges assigned to the host user mipan
  3. IPv6 Disabled - Simplifies networking (can be enabled if needed)

Installation:

sudo cp /srv/compose/docker/system/daemon.json /etc/docker/daemon.json
sudo systemctl restart docker

Warning: Daemon configuration changes take effect only after a Docker restart, which interrupts all running containers.
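
The copy command above overwrites any existing /etc/docker/daemon.json. A minimal sketch of a guard that keeps a copy first is shown below; backup_if_present is a hypothetical helper name (not part of this repository), and the demo runs on a temporary file so it can be tried without root.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): copy FILE to FILE.backup
# if FILE exists, otherwise report that there is nothing to back up.
backup_if_present() {
    file=$1
    if [ -f "$file" ]; then
        cp -p "$file" "$file.backup" && echo "backed up $file"
    else
        echo "nothing to back up: $file"
    fi
}

# Demo on a scratch file. For the real config, run the same logic as root
# against /etc/docker/daemon.json before copying the new file into place.
demo=$(mktemp)
echo '{"ipv6": false}' > "$demo"
backup_if_present "$demo"
```

On a fresh host with no daemon.json yet, the helper reports that there is nothing to back up instead of failing the way a bare cp would.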

docker/system/subuid & subgid - User Namespace Mappings

Locations:

  • /etc/subuid - User ID mappings
  • /etc/subgid - Group ID mappings

Purpose: Maps container UIDs/GIDs to unprivileged ranges on the host for security isolation

Format:

username:start_id:count

Example subuid:

mipan:100000:65536

This maps:

  • Container UID 0 (root) → Host UID 100000
  • Container UID 1 → Host UID 100001
  • Container UID 1000 → Host UID 101000
  • ... up to 65536 mappings
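
The mapping is plain arithmetic: host ID = start_id + container ID, valid while the container ID is below count. A small sketch of that rule follows; map_uid is a hypothetical helper written for illustration, not a Docker command.

```shell
#!/bin/sh
# Hypothetical helper: translate a container UID to the host UID implied by
# one subuid entry of the form username:start_id:count.
map_uid() {
    entry=$1
    container_uid=$2
    start=$(printf '%s\n' "$entry" | cut -d: -f2)
    count=$(printf '%s\n' "$entry" | cut -d: -f3)
    # IDs outside 0..count-1 have no mapping on the host.
    if [ "$container_uid" -lt 0 ] || [ "$container_uid" -ge "$count" ]; then
        echo "unmapped"
        return 1
    fi
    echo $((start + container_uid))
}

map_uid "mipan:100000:65536" 0              # prints 100000
map_uid "mipan:100000:65536" 1000           # prints 101000
map_uid "mipan:100000:65536" 65536 || true  # prints unmapped (outside the range)
```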

Security Benefits:

  • Container root (UID 0) runs as an unprivileged user on the host
  • A compromised container gains only limited access to the host
  • Reduces the impact of privilege escalation attacks

Installation:

sudo cp /srv/compose/docker/system/subuid /etc/subuid
sudo cp /srv/compose/docker/system/subgid /etc/subgid
sudo systemctl restart docker

Note: Changing the namespace mappings changes which host UIDs/GIDs own container files, so existing volumes must be re-owned or recreated to match.

docker/networks.toml - Network Definitions

Location: /srv/compose/docker/networks.toml

Purpose: Defines isolated Docker networks for service categories

Networks Defined:

Network    Subnet          Gateway      Purpose                              Internal
proxy      172.20.0.0/24   172.20.0.1   Internet-facing services (Traefik)   No
metrics    172.21.0.0/24   172.21.0.1   Monitoring stack                     Yes
database   172.22.0.0/24   172.22.0.1   Data persistence layer               Yes
genai      172.23.0.0/24   172.23.0.1   AI/ML services                       Yes
auth       172.24.0.0/24   172.24.0.1   Authentication services              Yes

Internal Networks: Cannot route to external internet (security isolation)

Bootstrap Networks:

cd /srv/compose/docker
docker nbs -f networks.toml

See: Network Topology, Plugin: Network Bootstrap

Installation

1. Install Client Configuration

For current user:

mkdir -p ~/.docker
cp /srv/compose/docker/config.json ~/.docker/config.json

Verify:

docker ps
docker images

2. Install Daemon Configuration

Prerequisites: NVIDIA Container Toolkit installed (for GPU support)

Installation:

# Backup existing config
sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.backup

# Install new config
sudo cp /srv/compose/docker/system/daemon.json /etc/docker/daemon.json

# Restart Docker daemon
sudo systemctl restart docker

# Verify
docker info | grep -i runtime

Expected output (the exact list of runtimes varies by Docker version):

Runtimes: nvidia runc
Default Runtime: nvidia
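
The same check can be scripted rather than read by eye. The sketch below assumes the "Default Runtime:" label shown in the expected output; default_runtime is a hypothetical helper name, and it reads the text on stdin so it can be tried without a Docker host.

```shell
#!/bin/sh
# Hypothetical helper: read `docker info` text on stdin and print the value
# of its "Default Runtime" line (prints nothing if the line is absent).
default_runtime() {
    sed -n 's/^[[:space:]]*Default Runtime:[[:space:]]*//p'
}

# Intended use on a Docker host (kept as a comment so the sketch runs anywhere):
#   [ "$(docker info 2>/dev/null | default_runtime)" = "nvidia" ] \
#       || echo "nvidia is not the default runtime"

# Demo with sample text shaped like the expected output.
printf ' Runtimes: nvidia runc\n Default Runtime: nvidia\n' | default_runtime   # prints nvidia
```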

3. Install User Namespace Mappings

Warning: This changes how Docker handles user IDs. Existing volumes may need permission updates.

# Install mappings
sudo cp /srv/compose/docker/system/subuid /etc/subuid
sudo cp /srv/compose/docker/system/subgid /etc/subgid

# Restart Docker
sudo systemctl restart docker

# Check remapping is active
docker info | grep "userns"

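Fix Volume Permissions:

# With userns-remap enabled, Docker keeps its data in a separate directory,
# /var/lib/docker/<uid>.<gid>/ (for example 100000.100000), so volumes created
# earlier under /var/lib/docker/volumes/ are not visible to the remapped daemon.
# Re-own bind-mounted host directories so container root (host UID 100000) can use them:
sudo chown -R 100000:100000 /path/to/volume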

4. Bootstrap Docker Networks

cd /srv/compose/docker
docker nbs -f networks.toml

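Verify networks:

docker network ls

The listing should include the names below. docker network ls shows names only, so confirm each subnet with docker network inspect <name>:

proxy     172.20.0.0/24
metrics   172.21.0.0/24
database  172.22.0.0/24
genai     172.23.0.0/24
auth      172.24.0.0/24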
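
Checking for the five networks can also be scripted. This is a minimal sketch: missing_networks is a hypothetical helper, and it reads network names on stdin so it can be exercised without Docker.

```shell
#!/bin/sh
# Hypothetical helper: read network names on stdin (one per line) and print
# each expected network that is absent from the list.
missing_networks() {
    present=$(cat)
    for net in proxy metrics database genai auth; do
        printf '%s\n' "$present" | grep -qx "$net" || echo "$net"
    done
}

# Intended use on a Docker host:
#   docker network ls --format '{{.Name}}' | missing_networks

# Demo with sample names; prints database, genai and auth, one per line.
printf 'bridge\nhost\nproxy\nmetrics\n' | missing_networks
```

Empty output means all five networks exist.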

GPU Configuration

Requirements

  1. NVIDIA GPU with compute capability 6.0+
  2. NVIDIA Drivers installed on host
  3. NVIDIA Container Toolkit installed

Verify GPU Setup

# Check GPU availability
nvidia-smi

# Test Docker GPU access
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi

# Or use the custom plugin
docker smi

See: Plugin: GPU Checker for the docker smi command.

Configure GPU Device ID

Set which GPU to use in the service's .env.local:

GPU_ID=0  # Use first GPU (0, 1, 2, etc.)

Check available GPUs:

nvidia-smi -L
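
A GPU_ID that does not match any listed device is easy to miss. The sketch below checks an index against nvidia-smi -L output, whose lines start with "GPU <index>:"; gpu_index_exists is a hypothetical helper and the sample device names are made up.

```shell
#!/bin/sh
# Hypothetical helper: succeed if index $1 appears in `nvidia-smi -L` output
# supplied on stdin (lines look like "GPU 0: <name> (UUID: ...)").
gpu_index_exists() {
    grep -q "^GPU $1:"
}

# Intended use on a GPU host:
#   nvidia-smi -L | gpu_index_exists "$GPU_ID" || echo "GPU_ID=$GPU_ID not found"

# Demo with two made-up entries.
sample='GPU 0: Example GPU A (UUID: GPU-aaaa)
GPU 1: Example GPU B (UUID: GPU-bbbb)'
if printf '%s\n' "$sample" | gpu_index_exists 1; then echo "GPU 1 found"; fi
if ! printf '%s\n' "$sample" | gpu_index_exists 2; then echo "GPU 2 not found"; fi
```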

Network Isolation Strategy

Network Architecture

┌─────────────────────────────────────────┐
│              Internet                   │
└──────────────┬──────────────────────────┘
               │
         ┌─────▼─────┐
         │   Proxy   │ (172.20.0.0/24) - External
         │  Network  │
         └─────┬─────┘
               │
        ┌──────▼──────┐
        │   Traefik   │
        └──────┬──────┘
               │
    ┌──────────┼──────────┐
    │          │          │
┌───▼────┐ ┌──▼─────┐ ┌──▼───────┐
│Database│ │ GenAI  │ │  Metrics │
│ (int)  │ │  (int) │ │   (int)  │
└────────┘ └────────┘ └──────────┘
172.22/24  172.23/24   172.21/24

Benefits:

  • Service isolation (compromise doesn't spread)
  • Clear security boundaries
  • Easier firewall rules
  • Better network monitoring
  • Reduced attack surface

Connecting Services to Networks

In compose.yml:

services:
    myservice:
        networks:
            - proxy # For Traefik access
            - database # For DB access

networks:
    proxy:
        external: true
    database:
        external: true

Best Practice: Only connect to networks you need.

Troubleshooting

GPU Not Detected

Symptoms: Containers can't access GPU, nvidia-smi fails

Solutions:

# Check NVIDIA drivers
nvidia-smi

# Verify Container Toolkit installed
which nvidia-container-runtime

# Check daemon config
sudo grep nvidia /etc/docker/daemon.json

# Restart Docker
sudo systemctl restart docker

# Test GPU access
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi

User Namespace Issues

Symptoms: Permission denied errors, volume mount failures

Solutions:

# Check if userns-remap is active
docker info | grep "userns"

# Find mapped user
grep mipan /etc/subuid

# Fix volume permissions
sudo chown -R 100000:100000 /path/to/volume

# Or disable userns-remap temporarily:
# remove "userns-remap" from /etc/docker/daemon.json, then restart Docker
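
Rather than hard-coding 100000 in the chown, the start ID can be read from the mapping files. This is a sketch under the assumption that the entries follow the username:start_id:count format described earlier; subid_start is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the start ID of USER's first entry in a
# subuid/subgid-style file (username:start_id:count).
subid_start() {
    awk -F: -v user="$1" '$1 == user { print $2; exit }' "$2"
}

# Intended use (assumes mipan has entries in both files):
#   uid=$(subid_start mipan /etc/subuid)
#   gid=$(subid_start mipan /etc/subgid)
#   sudo chown -R "$uid:$gid" /path/to/volume

# Demo on a scratch file with a made-up second user.
scratch=$(mktemp)
printf 'other:231072:65536\nmipan:100000:65536\n' > "$scratch"
subid_start mipan "$scratch"   # prints 100000
```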

Network Already Exists

Symptoms: docker nbs reports network exists with different config

Solutions:

# List networks
docker network ls

# Inspect network
docker network inspect proxy

# Remove and recreate (ensure no containers using it)
docker network rm proxy
docker nbs -f networks.toml

# Or skip existing networks (nbs handles this)

Docker Daemon Won't Start

Symptoms: Docker service fails after config changes

Solutions:

# Check daemon status
sudo systemctl status docker

# View logs
sudo journalctl -u docker -n 50

# Validate daemon.json syntax
python3 -m json.tool /etc/docker/daemon.json

# Restore backup if needed
sudo cp /etc/docker/daemon.json.backup /etc/docker/daemon.json
sudo systemctl restart docker

Security Considerations

User Namespace Remapping

Pros:

  • Container root has no host privileges
  • Reduces privilege escalation risks
  • Defense in depth

Cons:

  • Volume permission complexity
  • Some images may not work correctly
  • Requires careful permission management

Recommendation: Enable for production deployments.

Internal Networks

Mark networks as internal to prevent internet access:

[networks.database]
internal = true

Benefits:

  • Database can't be exploited to download malware
  • A compromised service can't send stolen credentials to an external host
  • Reduced attack surface

Trade-offs: Services attached only to internal networks can't download updates or call external APIs.
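
To confirm that the internal flag actually took effect, the networks can be inspected and filtered. A minimal sketch follows; not_internal is a hypothetical helper, and it assumes proxy is the only network meant to be external, as in the table on this page.

```shell
#!/bin/sh
# Hypothetical helper: read "name internal" pairs on stdin and print every
# network other than proxy whose internal flag is not "true".
not_internal() {
    awk '$1 != "proxy" && $2 != "true" { print $1 }'
}

# Intended use on a Docker host:
#   docker network inspect --format '{{.Name}} {{.Internal}}' \
#       proxy metrics database genai auth | not_internal

# Demo with sample pairs.
printf 'proxy false\nmetrics true\ndatabase false\n' | not_internal   # prints database
```

Empty output means every network except proxy is internal.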

Best Practices

  1. Backup configs before changes - Always keep backups of working configs
  2. Test GPU access after daemon changes - Run docker smi to verify
  3. Use internal networks for sensitive services (databases, auth)
  4. Monitor network traffic - Watch for unexpected connections
  5. Regularly update Container Toolkit - Security patches for GPU support
  6. Document custom configs - Note why changes were made
  7. Use minimal permissions - Only grant necessary capabilities

Quick Reference

Configuration Files

/srv/compose/docker/
├── config.json                 # Client config (docker ps/images format)
├── system/
│   ├── daemon.json            # Daemon config (GPU, userns)
│   ├── subuid                 # User namespace UID mappings
│   └── subgid                 # User namespace GID mappings
└── networks.toml              # Network definitions

Install locations:
├── ~/.docker/config.json      # Client (per-user)
├── /etc/docker/daemon.json    # Daemon (system-wide)
├── /etc/subuid                # UID mappings (system-wide)
└── /etc/subgid                # GID mappings (system-wide)

Common Commands

# Check Docker configuration
docker info

# View daemon config
sudo cat /etc/docker/daemon.json

# Restart Docker daemon
sudo systemctl restart docker

# Test GPU access
docker smi

# Bootstrap networks
cd /srv/compose/docker && docker nbs -f networks.toml

# List networks
docker network ls

# Inspect network
docker network inspect proxy

Next: GenAI Overview - AI/ML services stack →