Image Generation

Dwi Elfianto edited this page Dec 6, 2025 · 3 revisions

SwarmUI provides a powerful web interface for Stable Diffusion image generation backed by ComfyUI, supporting advanced workflows, ControlNet, LoRA, and batch generation.

Overview

Service: genai-swarmui (/srv/compose/genai/swarmui)

SwarmUI is a feature-rich interface for AI image generation using Stable Diffusion models, designed for both beginners and advanced users.

Features

  • Multiple Model Support - SD 1.5, SDXL, SD3, and custom models
  • ComfyUI Backend - Advanced workflow system with nodes
  • ControlNet - Precise control over composition and pose
  • LoRA Support - Fine-tuned model adaptations
  • Batch Generation - Queue multiple images
  • Custom Workflows - Save and share generation pipelines
  • Persistent Storage - Models and outputs preserved
  • GPU Acceleration - Fast generation with NVIDIA GPUs

Requirements

GPU Requirements

Minimum:

  • NVIDIA GPU with 6GB+ VRAM
  • Models: SD 1.5 (512x512)
  • Generation time: 10-30 seconds per image

Recommended:

  • NVIDIA GPU with 12GB+ VRAM (RTX 3080, 4080, 4090)
  • Models: SDXL (1024x1024)
  • Generation time: 5-15 seconds per image
  • Batch generation: Multiple images simultaneously

VRAM Usage by Model:

Model    Resolution   VRAM       Speed (RTX 4090)
SD 1.5   512x512      4-6 GB     3-5 sec
SD 1.5   768x768      6-8 GB     8-12 sec
SDXL     1024x1024    8-10 GB    10-15 sec
SD3      1024x1024    10-12 GB   12-18 sec

Storage Requirements

  • Models: 2-6 GB each (SD 1.5: 2GB, SDXL: 6GB)
  • LoRAs: 10-200 MB each
  • VAEs: 300-800 MB each
  • Outputs: Variable (save what you want)

Recommended: 100GB+ free space for model collection
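
The per-file figures above make it easy to size a collection up front. A minimal sketch; the collection counts are illustrative, only the per-model sizes come from this page (SD 1.5 ~2 GB, SDXL ~6 GB, LoRAs up to ~200 MB):

```shell
# Rough storage estimate for a model collection, using the sizes listed above.
sd15_models=5     # SD 1.5 checkpoints at ~2 GB each
sdxl_models=3     # SDXL checkpoints at ~6 GB each
loras=20          # LoRAs at ~200 MB each (worst case)

total_gb=$(( sd15_models * 2 + sdxl_models * 6 + loras * 200 / 1024 ))
echo "Estimated model storage: ${total_gb} GB"
```

With these counts the estimate is about 31 GB, well within the 100 GB recommendation, but a growing collection fills it quickly.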

Configuration

Environment Variables

File: /srv/compose/genai/swarmui/.env.local

# User/group IDs
UID=1000
GID=1000

# GPU device
GPU_ID=0

# Data directories
DATA_DIR=/srv/appdata
SWARMUI_DATA=/srv/appdata/swarmui

# Domain
TRAEFIK_ACME_DOMAIN=yourdomain.com

# Port (default: 7801)
SWARMUI_PORT=7801
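
A missing key in .env.local is a common cause of startup failures, so it can be worth checking the file before deploying. A minimal sketch that validates the keys listed above; it writes a sample file to a temp path for demonstration — on a live host, point ENV_FILE at the real /srv/compose/genai/swarmui/.env.local instead:

```shell
# Verify that .env.local defines every key SwarmUI expects.
ENV_FILE="$(mktemp)"
cat > "$ENV_FILE" <<'EOF'
UID=1000
GID=1000
GPU_ID=0
DATA_DIR=/srv/appdata
SWARMUI_DATA=/srv/appdata/swarmui
TRAEFIK_ACME_DOMAIN=yourdomain.com
SWARMUI_PORT=7801
EOF

missing=""
for key in UID GID GPU_ID DATA_DIR SWARMUI_DATA TRAEFIK_ACME_DOMAIN SWARMUI_PORT; do
    grep -q "^${key}=" "$ENV_FILE" || missing="$missing $key"
done

if [ -z "$missing" ]; then
    echo "All required keys present"
else
    echo "Missing keys:$missing"
fi
rm -f "$ENV_FILE"
```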

Directory Structure

${DATA_DIR}/
├── models/                 # Stable Diffusion models
│   ├── Stable-diffusion/  # Main models (SD 1.5, SDXL)
│   ├── Lora/              # LoRA files
│   ├── VAE/               # VAE models
│   ├── ControlNet/        # ControlNet models
│   └── embeddings/        # Textual inversions
├── output/                # Generated images
│   └── [YYYY-MM-DD]/     # Organized by date
└── swarmui/
    ├── Home/              # User settings
    ├── Data/              # App data
    ├── Back/              # Backend configs
    ├── Exts/              # Extensions
    ├── Node/              # ComfyUI custom nodes
    └── Work/              # Custom workflows
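
Pre-creating the layout above before first start keeps the container's bind mounts from landing on root-owned auto-created directories. A sketch using a temp directory for demonstration; set DATA_DIR=/srv/appdata on the real host (the swarmui/ subtree is created by the app itself on first run):

```shell
# Create the model and output directory tree shown above.
DATA_DIR="${DATA_DIR:-$(mktemp -d)}"

for d in models/Stable-diffusion models/Lora models/VAE models/ControlNet \
         models/embeddings output swarmui; do
    mkdir -p "$DATA_DIR/$d"
done

ls "$DATA_DIR/models"
```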

Network Configuration

  • Port: 7801 (internal and host)
  • Networks: genai (internal), proxy (Traefik)
  • Access: https://swarmui.${TRAEFIK_ACME_DOMAIN}

Deployment

Start Service

# Ensure GPU is available
nvidia-smi

# Start SwarmUI
sudo composectl start genai-swarmui

# Check status (may take 1-2 minutes to initialize)
composectl status genai-swarmui

# View logs
composectl logs -f genai-swarmui

# Access: https://swarmui.yourdomain.com

First-Time Setup

  1. Navigate to URL: https://swarmui.yourdomain.com
  2. Wait for initialization: Backend downloads on first start (~2-3 minutes)
  3. Configure backend: Select ComfyUI backend
  4. Download models: See Model Management section below

Model Management

Download Models

Built-in Model Browser:

  1. Click "Models" tab in SwarmUI
  2. Browse available models
  3. Click "Download" on desired model
  4. Wait for download (2-6 GB)

Manual Download:

# Download to host
cd ${DATA_DIR}/models/Stable-diffusion

# Example: Realistic Vision (SD 1.5)
wget https://civitai.com/api/download/models/[model-id] -O realistic-vision-v5.safetensors

# Example: SDXL base model
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
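
Multi-gigabyte downloads are worth verifying: wget -c resumes an interrupted transfer, and sha256sum -c checks integrity against the hash published on the model's Civitai or Hugging Face page. A sketch using a small local file in place of a real 6 GB download; the published-hash line is the pattern to follow, not a real checksum:

```shell
# On a real download: resume with `wget -c <url>` and then verify with
#   echo "<published-sha256>  sd_xl_base_1.0.safetensors" | sha256sum -c -

# Demonstration of the verification step with a throwaway file:
f=$(mktemp)
printf 'demo model bytes' > "$f"

sum=$(sha256sum "$f" | awk '{print $1}')   # compute the file's SHA-256
echo "$sum  $f" | sha256sum -c -           # prints "<file>: OK" on a match

rm -f "$f"
```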

Popular Starting Models:

SD 1.5 (2GB each):

  • Realistic Vision v5.0
  • DreamShaper 8
  • Deliberate v2

SDXL (6GB each):

  • SDXL Base 1.0
  • SDXL Refiner 1.0
  • Juggernaut XL

Model Sources

  • Civitai (civitai.com) - Community checkpoints, LoRAs, and embeddings
  • Hugging Face (huggingface.co) - Official base models (e.g., Stability AI releases)

LoRA and Extensions

LoRAs (style adaptations):

cd ${DATA_DIR}/models/Lora
# Download LoRA files (.safetensors)
# Apply in SwarmUI with weight 0.5-1.0

ControlNet (composition control):

cd ${DATA_DIR}/models/ControlNet
# Download ControlNet models
# Use for pose, depth, edge guidance

Image Generation Basics

Simple Generation

  1. Select Model: Choose from dropdown (e.g., "realistic-vision-v5")
  2. Write Prompt: Describe desired image
    Positive: "a beautiful sunset over mountains, detailed, 4k"
    Negative: "blurry, low quality, deformed"
    
  3. Set Parameters:
    • Resolution: 512x512 (SD 1.5) or 1024x1024 (SDXL)
    • Steps: 20-30 (more = better quality, slower)
    • CFG Scale: 7-8 (how closely to follow prompt)
    • Seed: -1 (random) or specific number (reproducible)
  4. Click Generate: Wait 5-30 seconds
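
The same parameters can drive generation over SwarmUI's HTTP API instead of the web UI. A hedged sketch: the /API/GetNewSession and /API/GenerateText2Image routes and the JSON field names below are assumptions based on SwarmUI's text-to-image API and may differ by version — check your instance's API documentation before relying on them:

```shell
# Build a text-to-image request matching the UI parameters above.
BASE="https://swarmui.yourdomain.com"   # your instance URL

payload=$(cat <<'EOF'
{
  "prompt": "a beautiful sunset over mountains, detailed, 4k",
  "negativeprompt": "blurry, low quality, deformed",
  "model": "realistic-vision-v5",
  "width": 512,
  "height": 512,
  "steps": 25,
  "cfgscale": 7,
  "seed": -1,
  "images": 1
}
EOF
)
echo "$payload"

# On a live instance (field/route names assumed as noted above):
# session=$(curl -s -X POST "$BASE/API/GetNewSession" -H 'Content-Type: application/json' -d '{}')
# curl -s -X POST "$BASE/API/GenerateText2Image" -H 'Content-Type: application/json' -d "$payload"
```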

Advanced Parameters

Sampling Methods:

  • DPM++ 2M Karras - Good quality, fast
  • Euler a - Creative, varied results
  • DDIM - Stable, predictable

Resolution Guidelines:

  • SD 1.5: 512x512, 512x768, 768x512
  • SDXL: 1024x1024, 1024x1536, 1536x1024
  • Non-standard resolutions may produce artifacts

CFG Scale:

  • 1-5: Loose interpretation, creative
  • 7-8: Balanced (recommended)
  • 10-15: Strict adherence, may over-saturate

Batch Generation

Generate multiple images:

  1. Set "Batch Size": Number of images per generation
  2. Set "Batch Count": How many batches to run
  3. Total images = Batch Size × Batch Count

Example: Batch Size 4, Batch Count 3 = 12 images
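
The arithmetic above as a one-liner, useful when scripting queue sizes:

```shell
# Total images = Batch Size x Batch Count
batch_size=4
batch_count=3
total=$((batch_size * batch_count))
echo "Total images per run: $total"   # 12
```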

Advanced Features

ComfyUI Workflows

Access ComfyUI for node-based workflows:

  1. Click "Advanced" → "ComfyUI"
  2. Build custom workflows with nodes
  3. Save workflows for reuse

Common Workflows:

  • Text-to-Image with refinement
  • Image-to-Image transformation
  • Inpainting and outpainting
  • ControlNet-guided generation

ControlNet

Control composition with reference images:

Use Cases:

  • Pose Control: Match character poses
  • Depth Map: Control 3D structure
  • Canny Edge: Line art guidance
  • Scribble: Rough sketch to image

Setup:

  1. Upload reference image
  2. Select ControlNet model (e.g., canny, depth, pose)
  3. Adjust weight (0.5-1.0)
  4. Generate with guidance

LoRA Application

Apply style modifications:

  1. Select LoRA from list
  2. Set weight (0.5-1.0 typical)
  3. Combine multiple LoRAs (keep total weight < 2.0)

Example:

  • Base: SDXL model
  • LoRA 1: "Anime style" (weight 0.7)
  • LoRA 2: "Detailed faces" (weight 0.5)
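
The "keep total weight < 2.0" rule above is easy to check when stacking LoRAs. A sketch using the example weights (0.7 + 0.5); awk handles the floating-point sum since shell arithmetic is integer-only:

```shell
# Check that combined LoRA weights stay under 2.0.
w1=0.7   # "Anime style"
w2=0.5   # "Detailed faces"

total=$(awk "BEGIN {print $w1 + $w2}")
ok=$(awk "BEGIN {print ($w1 + $w2 < 2.0) ? \"yes\" : \"no\"}")
echo "Combined LoRA weight: $total (under 2.0: $ok)"
```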

Upscaling

Increase resolution of generated images:

  1. Generate base image (512x512)
  2. Click "Upscale" or "Hi-Res Fix"
  3. Select upscale model (ESRGAN, Real-ESRGAN)
  4. Set target resolution (2x, 4x)

Prompt Engineering

Effective Prompts

Structure:

[subject], [style], [details], [quality modifiers]

Example Good Prompt:

Positive:
masterpiece, best quality, highly detailed,
photo of a serene lake at sunset,
mountains in background, reflections on water,
golden hour lighting, 8k uhd, professional photography

Negative:
low quality, blurry, deformed, ugly, bad anatomy,
watermark, signature, text
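
The [subject], [style], [details], [quality] structure above can be assembled programmatically, which is handy when generating prompt variations in a script. A minimal sketch; the component strings are taken from the example prompt:

```shell
# Assemble a prompt from the four structural components.
subject="a serene lake at sunset"
style="professional photography"
details="mountains in background, golden hour lighting"
quality="masterpiece, best quality, highly detailed, 8k uhd"

prompt="$quality, $subject, $style, $details"
negative="low quality, blurry, deformed, watermark, signature, text"

echo "Positive: $prompt"
echo "Negative: $negative"
```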

Techniques:

  • Weights: (keyword:1.2) increases importance
  • Emphasis: (keyword) or ((keyword)) increases attention
  • De-emphasis: [keyword] or (keyword:0.8) reduces attention

Quality Modifiers

Positive:

  • masterpiece, best quality, highly detailed
  • 8k, uhd, professional, award winning
  • intricate details, sharp focus
  • beautiful, stunning, gorgeous

Negative:

  • low quality, blurry, deformed
  • ugly, bad anatomy, mutation
  • watermark, text, signature
  • duplicate, cropped, worst quality

Performance Optimization

GPU Memory Management

Check VRAM usage:

nvidia-smi
watch -n 1 nvidia-smi  # Monitor during generation

Optimize VRAM:

  • Use SD 1.5 instead of SDXL (saves 4GB)
  • Lower resolution (512x512 vs 768x768)
  • Reduce batch size
  • Close other GPU services (Ollama, etc.)

VRAM-Saving Techniques:

- Model: SD 1.5 instead of SDXL (-4GB)
- Resolution: 512x512 instead of 1024x1024 (-4GB)
- Batch size: 1 instead of 4 (-3GB)
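
The savings listed above can be combined into a simple decision rule: pick model, resolution, and batch size from the VRAM you have. A sketch with rough thresholds derived from the VRAM table earlier on this page:

```shell
# Suggest generation settings for a given amount of VRAM (in GB).
pick_settings() {
    vram=$1
    if [ "$vram" -ge 12 ]; then
        echo "SDXL 1024x1024 batch=4"
    elif [ "$vram" -ge 8 ]; then
        echo "SDXL 1024x1024 batch=1"
    else
        echo "SD1.5 512x512 batch=1"
    fi
}

pick_settings 6    # SD1.5 512x512 batch=1
pick_settings 12   # SDXL 1024x1024 batch=4
```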

Generation Speed

Factors:

  • GPU compute capability (newer = faster)
  • Model size (SD 1.5 faster than SDXL)
  • Resolution (smaller = faster)
  • Steps (fewer = faster, lower quality)

Optimize Speed:

  • Use DPM++ 2M Karras sampler (fast)
  • Reduce steps to 20-25
  • Lower resolution for testing
  • Use SD 1.5 for speed, SDXL for quality

Batch Processing

Generate many variations efficiently:

  1. Lock seed for consistency
  2. Vary prompt slightly
  3. Use batch generation
  4. Review and select best results

Troubleshooting

Out of Memory

# Check VRAM
nvidia-smi

# Solutions:
# 1. Use smaller model
# Switch to SD 1.5 (uses less VRAM than SDXL)

# 2. Reduce resolution
# Try 512x512 instead of 768x768

# 3. Lower batch size
# Generate 1 image at a time

# 4. Close other GPU services
sudo composectl stop genai-ollama

# 5. Restart SwarmUI
sudo composectl restart genai-swarmui

Black/Blank Images

Causes:

  • NSFW filter triggered
  • Model incompatibility
  • Incorrect VAE

Solutions:

  • Adjust prompt (remove triggers)
  • Try different model
  • Change VAE or use "None"

Slow Generation

# Verify GPU is being used
nvidia-smi

# Check logs
composectl logs genai-swarmui | grep -i gpu

# Ensure GPU_ID is correct
cat /srv/compose/genai/swarmui/.env.local

Service Won't Start

# Check logs
composectl logs genai-swarmui

# Verify GPU access
nvidia-smi

# Check disk space
df -h ${DATA_DIR}

# Ensure port 7801 is available
sudo netstat -tlnp | grep 7801

Model Won't Load

# Check model file exists
ls -lh ${DATA_DIR}/models/Stable-diffusion/

# Verify file integrity (.safetensors format)
file ${DATA_DIR}/models/Stable-diffusion/model.safetensors

# Check logs for error
composectl logs genai-swarmui | grep -i error

# Re-download model if corrupted

Best Practices

  1. Start with SD 1.5 - Faster, less VRAM, learn basics
  2. Upgrade to SDXL - Once comfortable, for quality
  3. Use good prompts - Quality modifiers matter
  4. Experiment with samplers - Find what works for your style
  5. Save good seeds - Reproduce successful images
  6. Organize outputs - SwarmUI auto-organizes by date
  7. Backup models - Re-downloading is slow
  8. Monitor VRAM - Prevent crashes
  9. Use negative prompts - Avoid common issues

Use Cases

Photo-Realistic Images

Model: Realistic Vision v5 (SD 1.5) or Juggernaut XL
Prompt: "photo of [subject], professional photography,
         8k uhd, studio lighting, detailed"
Settings: CFG 7, Steps 25-30, DPM++ 2M Karras

Artistic Styles

Model: DreamShaper 8
Prompt: "[subject], [style] art style, digital painting,
         trending on artstation, highly detailed"
Settings: CFG 8-10, Steps 30-40

Character Design

Model: SDXL with anime LoRA
Prompt: "character design sheet, [description],
         multiple views, white background"
ControlNet: Pose guidance

Product Visualization

Model: SDXL Base
Prompt: "product photo, [product], white background,
         studio lighting, commercial photography"
Settings: High resolution, refined details

Quick Reference

Service Management

# Start service
sudo composectl start genai-swarmui

# Check status
composectl status genai-swarmui

# View logs
composectl logs -f genai-swarmui

# Restart
sudo composectl restart genai-swarmui

Access and Directories

# Web UI
https://swarmui.yourdomain.com

# Model directories
${DATA_DIR}/models/Stable-diffusion/  # Main models
${DATA_DIR}/models/Lora/              # LoRA files
${DATA_DIR}/models/VAE/               # VAE models
${DATA_DIR}/models/ControlNet/        # ControlNet

# Generated images
${DATA_DIR}/output/

Quick Settings

Resolution: 512x512 (SD 1.5), 1024x1024 (SDXL)
Steps: 20-30
CFG Scale: 7-8
Sampler: DPM++ 2M Karras
Negative: low quality, blurry, deformed

Next: Plex Media Server - Media streaming setup →
