Image Generation
SwarmUI provides a powerful web interface for Stable Diffusion image generation with a ComfyUI backend, supporting advanced workflows, ControlNet, LoRA, and batch generation.
Service: genai-swarmui (/srv/compose/genai/swarmui)
SwarmUI is a feature-rich interface for AI image generation using Stable Diffusion models, designed for both beginners and advanced users.
- Multiple Model Support - SD 1.5, SDXL, SD3, and custom models
- ComfyUI Backend - Advanced workflow system with nodes
- ControlNet - Precise control over composition and pose
- LoRA Support - Fine-tuned model adaptations
- Batch Generation - Queue multiple images
- Custom Workflows - Save and share generation pipelines
- Persistent Storage - Models and outputs preserved
- GPU Acceleration - Fast generation with NVIDIA GPUs
Minimum:
- NVIDIA GPU with 6GB+ VRAM
- Models: SD 1.5 (512x512)
- Generation time: 10-30 seconds per image
Recommended:
- NVIDIA GPU with 12GB+ VRAM (RTX 3080, 4080, 4090)
- Models: SDXL (1024x1024)
- Generation time: 5-15 seconds per image
- Batch generation: Multiple images simultaneously
VRAM Usage by Model:
| Model | Resolution | VRAM | Speed (RTX 4090) |
|---|---|---|---|
| SD 1.5 | 512x512 | 4-6 GB | 3-5 sec |
| SD 1.5 | 768x768 | 6-8 GB | 8-12 sec |
| SDXL | 1024x1024 | 8-10 GB | 10-15 sec |
| SD3 | 1024x1024 | 10-12 GB | 12-18 sec |
- Models: 2-6 GB each (SD 1.5: 2GB, SDXL: 6GB)
- LoRAs: 10-200 MB each
- VAEs: 300-800 MB each
- Outputs: Variable (save what you want)
Recommended: 100GB+ free space for model collection
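A quick way to confirm the host meets these requirements before pulling models (a minimal sketch, assuming DATA_DIR=/srv/appdata as configured below):
# Report GPU name and total VRAM
nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
# Check free space on the data volume
df -h /srv/appdata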
File: /srv/compose/genai/swarmui/.env.local
# User/group IDs
UID=1000
GID=1000
# GPU device
GPU_ID=0
# Data directories
DATA_DIR=/srv/appdata
SWARMUI_DATA=/srv/appdata/swarmui
# Domain
TRAEFIK_ACME_DOMAIN=yourdomain.com
# Port (default: 7801)
SWARMUI_PORT=7801
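After editing .env.local, restart the service so the new values are picked up (same composectl commands used later in this page):
sudo composectl restart genai-swarmui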
${DATA_DIR}/
├── models/ # Stable Diffusion models
│ ├── Stable-diffusion/ # Main models (SD 1.5, SDXL)
│ ├── Lora/ # LoRA files
│ ├── VAE/ # VAE models
│ ├── ControlNet/ # ControlNet models
│ └── embeddings/ # Textual inversions
├── output/ # Generated images
│ └── [YYYY-MM-DD]/ # Organized by date
└── swarmui/
├── Home/ # User settings
├── Data/ # App data
├── Back/ # Backend configs
├── Exts/ # Extensions
├── Node/ # ComfyUI custom nodes
└── Work/ # Custom workflows
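If these directories do not exist yet, they can be created up front (a sketch assuming DATA_DIR=/srv/appdata and the UID/GID values from .env.local):
# Create model and output directories
sudo mkdir -p /srv/appdata/models/{Stable-diffusion,Lora,VAE,ControlNet,embeddings} /srv/appdata/output
# Match ownership to UID/GID from .env.local
sudo chown -R 1000:1000 /srv/appdata/models /srv/appdata/output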
- Port: 7801 (internal and host)
- Networks: genai (internal), proxy (Traefik)
- Access: https://swarmui.${TRAEFIK_ACME_DOMAIN}
# Ensure GPU is available
nvidia-smi
# Start SwarmUI
sudo composectl start genai-swarmui
# Check status (may take 1-2 minutes to initialize)
composectl status genai-swarmui
# View logs
composectl logs -f genai-swarmui
# Access: https://swarmui.yourdomain.com
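To confirm Traefik is routing to the UI before opening a browser, a quick reachability check (replace the domain with your TRAEFIK_ACME_DOMAIN value):
# Expect an HTTP status line such as 200 or 302
curl -sSI https://swarmui.yourdomain.com | head -n 1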
- Navigate to URL: https://swarmui.yourdomain.com
- Wait for initialization: Backend downloads on first start (~2-3 minutes)
- Configure backend: Select ComfyUI backend
- Download models: See Model Management section below
Built-in Model Browser:
- Click "Models" tab in SwarmUI
- Browse available models
- Click "Download" on desired model
- Wait for download (2-6 GB)
Manual Download:
# Download to host
cd ${DATA_DIR}/models/Stable-diffusion
# Example: Realistic Vision (SD 1.5)
wget https://civitai.com/api/download/models/[model-id] -O realistic-vision-v5.safetensors
# Example: SDXL base model
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
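Before loading a freshly downloaded model, it is worth a quick sanity check that the file arrived intact (filenames here are the ones saved above; compare the hash with the value published on the model page):
# Size should roughly match the model page (SD 1.5 ~2 GB, SDXL ~6 GB)
ls -lh ${DATA_DIR}/models/Stable-diffusion/
# Optional: verify against the published SHA-256
sha256sum ${DATA_DIR}/models/Stable-diffusion/sd_xl_base_1.0.safetensors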
Popular Starting Models:
SD 1.5 (2GB each):
- Realistic Vision v5.0
- DreamShaper 8
- Deliberate v2
SDXL (6GB each):
- SDXL Base 1.0
- SDXL Refiner 1.0
- Juggernaut XL
- CivitAI: https://civitai.com/ (community models)
- HuggingFace: https://huggingface.co/models (official models)
- Model Database: Built into SwarmUI
LoRAs (style adaptations):
cd ${DATA_DIR}/models/Lora
# Download LoRA files (.safetensors)
# Apply in SwarmUI with weight 0.5-1.0
ControlNet (composition control):
cd ${DATA_DIR}/models/ControlNet
# Download ControlNet models
# Use for pose, depth, edge guidance
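One commonly used source for SD 1.5 ControlNet models is the lllyasviel/ControlNet-v1-1 repository on HuggingFace; the exact filenames may change, so check the repository listing first (hedged example):
# Example: Canny edge ControlNet for SD 1.5
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_canny.pth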
Basic generation:
- Select Model: Choose from dropdown (e.g., "realistic-vision-v5")
- Write Prompt: Describe desired image
  - Positive: "a beautiful sunset over mountains, detailed, 4k"
  - Negative: "blurry, low quality, deformed"
- Set Parameters:
- Resolution: 512x512 (SD 1.5) or 1024x1024 (SDXL)
- Steps: 20-30 (more = better quality, slower)
- CFG Scale: 7-8 (how closely to follow prompt)
- Seed: -1 (random) or specific number (reproducible)
- Click Generate: Wait 5-30 seconds
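The same parameters can also be driven without the browser: SwarmUI exposes an HTTP API on the same port. The endpoint and field names below follow the upstream SwarmUI API documentation but may differ between versions, so treat this as a hedged sketch rather than a guaranteed contract (requires curl and jq):
# Open an API session (assumed endpoint: /API/GetNewSession)
SESSION=$(curl -s -X POST https://swarmui.yourdomain.com/API/GetNewSession \
  -H "Content-Type: application/json" -d '{}' | jq -r .session_id)
# Request one SDXL image (assumed endpoint: /API/GenerateText2Image)
curl -s -X POST https://swarmui.yourdomain.com/API/GenerateText2Image \
  -H "Content-Type: application/json" \
  -d "{\"session_id\": \"$SESSION\", \"images\": 1,
       \"prompt\": \"a beautiful sunset over mountains, detailed, 4k\",
       \"negativeprompt\": \"blurry, low quality, deformed\",
       \"model\": \"sd_xl_base_1.0\", \"width\": 1024, \"height\": 1024,
       \"steps\": 25, \"cfgscale\": 7, \"seed\": -1}"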
Sampling Methods:
- DPM++ 2M Karras - Good quality, fast
- Euler a - Creative, varied results
- DDIM - Stable, predictable
Resolution Guidelines:
- SD 1.5: 512x512, 512x768, 768x512
- SDXL: 1024x1024, 1024x1536, 1536x1024
- Non-standard resolutions may produce artifacts
CFG Scale:
- 1-5: Loose interpretation, creative
- 7-8: Balanced (recommended)
- 10-15: Strict adherence, may over-saturate
Generate multiple images:
- Set "Batch Size": Number of images per generation
- Set "Batch Count": How many batches to run
- Total images = Batch Size × Batch Count
Example: Batch Size 4, Batch Count 3 = 12 images
Access ComfyUI for node-based workflows:
- Click "Advanced" → "ComfyUI"
- Build custom workflows with nodes
- Save workflows for reuse
Common Workflows:
- Text-to-Image with refinement
- Image-to-Image transformation
- Inpainting and outpainting
- ControlNet-guided generation
Control composition with reference images:
Use Cases:
- Pose Control: Match character poses
- Depth Map: Control 3D structure
- Canny Edge: Line art guidance
- Scribble: Rough sketch to image
Setup:
- Upload reference image
- Select ControlNet model (e.g., canny, depth, pose)
- Adjust weight (0.5-1.0)
- Generate with guidance
Apply style modifications:
- Select LoRA from list
- Set weight (0.5-1.0 typical)
- Combine multiple LoRAs (keep total weight < 2.0)
Example:
- Base: SDXL model
- LoRA 1: "Anime style" (weight 0.7)
- LoRA 2: "Detailed faces" (weight 0.5)
Increase resolution of generated images:
- Generate base image (512x512)
- Click "Upscale" or "Hi-Res Fix"
- Select upscale model (ESRGAN, Real-ESRGAN)
- Set target resolution (2x, 4x)
Structure:
[subject], [style], [details], [quality modifiers]
Example Good Prompt:
Positive:
masterpiece, best quality, highly detailed,
photo of a serene lake at sunset,
mountains in background, reflections on water,
golden hour lighting, 8k uhd, professional photography
Negative:
low quality, blurry, deformed, ugly, bad anatomy,
watermark, signature, text
Techniques:
- Weights: (keyword:1.2) increases importance
- Emphasis: (keyword) or ((keyword))
- De-emphasis: [keyword] or (keyword:0.8)
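As an illustration, the lake prompt above could combine these forms (weights are just a starting point to tune):
photo of a serene lake at sunset, (golden hour lighting:1.2), ((sharp focus)), [mountains in background], 8k uhd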
Positive:
- masterpiece, best quality, highly detailed
- 8k, uhd, professional, award winning
- intricate details, sharp focus
- beautiful, stunning, gorgeous
Negative:
- low quality, blurry, deformed
- ugly, bad anatomy, mutation
- watermark, text, signature
- duplicate, cropped, worst quality
Check VRAM usage:
nvidia-smi
watch -n 1 nvidia-smi # Monitor during generation
Optimize VRAM:
- Use SD 1.5 instead of SDXL (saves 4GB)
- Lower resolution (512x512 vs 768x768)
- Reduce batch size
- Close other GPU services (Ollama, etc.)
VRAM-Saving Techniques:
- Model: SD 1.5 instead of SDXL (-4GB)
- Resolution: 512x512 instead of 1024x1024 (-4GB)
- Batch size: 1 instead of 4 (-3GB)
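For a lighter-weight view than the full watch output, nvidia-smi can poll just the memory counters while a batch renders:
# Print used/total VRAM every 2 seconds
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 2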
Factors:
- GPU compute capability (newer = faster)
- Model size (SD 1.5 faster than SDXL)
- Resolution (smaller = faster)
- Steps (fewer = faster, lower quality)
Optimize Speed:
- Use DPM++ 2M Karras sampler (fast)
- Reduce steps to 20-25
- Lower resolution for testing
- Use SD 1.5 for speed, SDXL for quality
Generate many variations efficiently:
- Lock seed for consistency
- Vary prompt slightly
- Use batch generation
- Review and select best results
# Check VRAM
nvidia-smi
# Solutions:
# 1. Use smaller model
# Switch to SD 1.5 (uses less VRAM than SDXL)
# 2. Reduce resolution
# Try 512x512 instead of 768x768
# 3. Lower batch size
# Generate 1 image at a time
# 4. Close other GPU services
sudo composectl stop genai-ollama
# 5. Restart SwarmUI
sudo composectl restart genai-swarmui
If generated images come out black or garbled:
Causes:
- NSFW filter triggered
- Model incompatibility
- Incorrect VAE
Solutions:
- Adjust prompt (remove triggers)
- Try different model
- Change VAE or use "None"
# Verify GPU is being used
nvidia-smi
# Check logs
composectl logs genai-swarmui | grep -i gpu
# Ensure GPU_ID is correct
cat /srv/compose/genai/swarmui/.env.local
# Check logs
composectl logs genai-swarmui
# Verify GPU access
nvidia-smi
# Check disk space
df -h ${DATA_DIR}
# Ensure port 7801 is available
sudo netstat -tlnp | grep 7801
# Check model file exists
ls -lh ${DATA_DIR}/models/Stable-diffusion/
# Verify file integrity (.safetensors format)
file ${DATA_DIR}/models/Stable-diffusion/model.safetensors
# Check logs for error
composectl logs genai-swarmui | grep -i error
# Re-download model if corrupted
- Start with SD 1.5 - Faster, less VRAM, learn basics
- Upgrade to SDXL - Once comfortable, for quality
- Use good prompts - Quality modifiers matter
- Experiment with samplers - Find what works for your style
- Save good seeds - Reproduce successful images
- Organize outputs - SwarmUI auto-organizes by date
- Backup models - Re-downloading is slow (see the rsync sketch after this list)
- Monitor VRAM - Prevent crashes
- Use negative prompts - Avoid common issues
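For the "Backup models" point, a simple sketch using rsync (the destination is just an example; point it at whatever backup volume you use):
# Mirror the model collection to a backup location
rsync -a --progress ${DATA_DIR}/models/ /mnt/backup/swarmui-models/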
Photorealistic photos:
Model: Realistic Vision v5 (SD 1.5) or Juggernaut XL
Prompt: "photo of [subject], professional photography,
8k uhd, studio lighting, detailed"
Settings: CFG 7, Steps 25-30, DPM++ 2M Karras
Artistic / digital painting:
Model: DreamShaper 8
Prompt: "[subject], [style] art style, digital painting,
trending on artstation, highly detailed"
Settings: CFG 8-10, Steps 30-40
Character design:
Model: SDXL with anime LoRA
Prompt: "character design sheet, [description],
multiple views, white background"
ControlNet: Pose guidance
Product photography:
Model: SDXL Base
Prompt: "product photo, [product], white background,
studio lighting, commercial photography"
Settings: High resolution, refined details
- GenAI Overview - Complete AI stack
- LLM Services - Text generation for captions
- Service Management - Managing services
- GPU Configuration - GPU setup
# Start service
sudo composectl start genai-swarmui
# Check status
composectl status genai-swarmui
# View logs
composectl logs -f genai-swarmui
# Restart
sudo composectl restart genai-swarmui
# Web UI
https://swarmui.yourdomain.com
# Model directories
${DATA_DIR}/models/Stable-diffusion/ # Main models
${DATA_DIR}/models/Lora/ # LoRA files
${DATA_DIR}/models/VAE/ # VAE models
${DATA_DIR}/models/ControlNet/ # ControlNet
# Generated images
${DATA_DIR}/output/
Recommended default settings:
Resolution: 512x512 (SD 1.5), 1024x1024 (SDXL)
Steps: 20-30
CFG Scale: 7-8
Sampler: DPM++ 2M Karras
Negative: low quality, blurry, deformed
Next: Plex Media Server - Media streaming setup →