A PyTorch implementation of neural style transfer using VGG19, featuring multi-scale pyramid optimization and L-BFGS refinement for high-quality artistic image generation.
- Multi-scale Pyramid Optimization: Coarse-to-fine approach for better style transfer quality
- Two-stage Optimization: Adam optimizer for initial convergence, L-BFGS for final refinement
- Flexible Style Control: Configurable style layers and weights
- Optional Histogram Matching: Color distribution alignment with content or style image
- GPU Acceleration: CUDA support for faster processing
- Visualization Tools: Built-in utilities for comparing content, style, and generated images
This project uses uv for dependency management. If you don't have uv installed:
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
Then clone and set up the project:
git clone https://github.com/aidendorian/NeuralStlyeTransfer.git
cd NeuralStlyeTransfer
uv sync
Project structure:
NeuralStlyeTransfer/
├── data/
│ ├── content.jpg # Your content images
│ └── style.jpg # Your style reference images
├── output/ # Generated stylized images
├── src/
│ ├── __init__.py
│ ├── image_nst.py # Main script
│ ├── utils.py # Core NST functions
│ └── visualization.py # Visualization utilities
├── pyproject.toml # Project dependencies
├── README.md
└── uv.lock
Place your content and style images in the data/ directory, then run:
uv run src/image_nst.py
Edit the parameters in src/image_nst.py:
# Style Transfer Parameters
STYLE_LAYERS = ['relu1_1', 'relu3_3', 'relu4_3'] # VGG19 layers for style
STYLE_WEIGHTS = [2.0, 1.5, 1.0] # Weight for each style layer
CONTENT_LAYER = 'relu4_2' # VGG19 layer for content
# Loss Weights
ALPHA = 1 # Content loss weight
BETA = 1e6 # Style loss weight (higher = more stylized)
# Optimization Parameters
ADAM_ITERS = 200 # Adam optimizer iterations
ADAM_LR = 0.02 # Adam learning rate
LBFGS_ITERS = 300 # L-BFGS iterations for refinement
LBFGS_LR = 1.0 # L-BFGS learning rate
PYRAMID_LEVELS = 3 # Number of pyramid scales
# Image Settings
MAX_SIDE = 1280 # Maximum image dimension (pixels)
CONTENT_PATH = 'data/content.jpg'
STYLE_PATH = 'data/style.jpg'
DEVICE = 'cuda' # Use 'cpu' if no GPU available
# Optional: Histogram Matching
APPLY_HISTOGRAM_MATCHING = False # Enable color distribution matching
HISTOGRAM_MATCHING_TARGET = 'content' # 'content' or 'style'
- ALPHA (Content Weight): Controls how much the output resembles the original content
  - Higher values (e.g., 10): Preserve more content structure
  - Lower values (e.g., 0.1): Allow more stylistic freedom
- BETA (Style Weight): Controls style transfer intensity
  - Common range: 1e5 to 1e7
  - Higher values: Stronger style application
  - Lower values: Subtler style effects
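For intuition, ALPHA and BETA enter the objective as a simple weighted sum. A minimal sketch (variable and function names here are illustrative, not necessarily those used in src/utils.py):
import torch

def weighted_nst_loss(content_loss: torch.Tensor, style_loss: torch.Tensor,
                      alpha: float = 1.0, beta: float = 1e6) -> torch.Tensor:
    # Higher beta pulls the optimizer toward the style statistics,
    # higher alpha toward the content features.
    return alpha * content_loss + beta * style_loss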
The model uses VGG19 ReLU layers. Common configurations:
- Shallow layers (relu1_1, relu2_1): Capture fine textures and patterns
- Deep layers (relu4_1, relu5_1): Capture abstract style features
- Balanced (default): Mix of shallow and deep for comprehensive style transfer
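As a rough sketch of how these named ReLU layers can be looked up in torchvision's VGG19: the index mapping below follows the standard VGG19 architecture, but the helper itself is illustrative and may differ from the code in src/utils.py.
import torch
import torchvision.models as models

# Name -> index in torchvision's vgg19().features Sequential
# (names follow the convention used in this README, not a torchvision API).
VGG19_LAYER_IDX = {
    'relu1_1': 1, 'relu2_1': 6, 'relu3_3': 15,
    'relu4_1': 20, 'relu4_2': 22, 'relu4_3': 24, 'relu5_1': 29,
}

def extract_features(img: torch.Tensor, vgg: torch.nn.Module,
                     layers: list[str]) -> dict[str, torch.Tensor]:
    """Run img through VGG19's feature stack, collecting activations at the named layers."""
    wanted = {VGG19_LAYER_IDX[name]: name for name in layers}
    feats, x = {}, img
    for idx, module in enumerate(vgg):
        x = module(x)
        if idx in wanted:
            feats[wanted[idx]] = x
        if len(feats) == len(wanted):
            break
    return feats

# Usage (pretrained weights download on first call):
# vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
# feats = extract_features(image_batch, vgg, STYLE_LAYERS + [CONTENT_LAYER])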
- PYRAMID_LEVELS:
  - More levels (3-4): Better quality, longer processing
  - Fewer levels (1-2): Faster results, may lack detail
- ADAM_ITERS & LBFGS_ITERS:
  - Increase for higher quality (at the cost of time)
  - Default values work well for most images
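How the pyramid levels and the two optimizer stages fit together, as a schematic: loss_fn stands in for the combined content/style objective, and this is a sketch under those assumptions, not the exact loop in src/image_nst.py.
import torch
import torch.nn.functional as F

def stylize_pyramid(content: torch.Tensor, style: torch.Tensor, loss_fn,
                    levels: int = 3, adam_iters: int = 200, adam_lr: float = 0.02,
                    lbfgs_iters: int = 300, lbfgs_lr: float = 1.0) -> torch.Tensor:
    """Coarse-to-fine sketch: optimize at the smallest scale first, then upsample."""
    result = None
    for level in reversed(range(levels)):          # coarsest level first
        scale = 1.0 / (2 ** level)
        c = F.interpolate(content, scale_factor=scale, mode='bilinear', align_corners=False)
        s = F.interpolate(style, scale_factor=scale, mode='bilinear', align_corners=False)
        # Initialize from the content image at the coarsest level,
        # otherwise from the upsampled result of the previous level.
        x = c.clone() if result is None else F.interpolate(
            result, size=c.shape[-2:], mode='bilinear', align_corners=False)
        x = x.detach().requires_grad_(True)

        adam = torch.optim.Adam([x], lr=adam_lr)   # stage 1: fast initial convergence
        for _ in range(adam_iters):
            adam.zero_grad()
            loss_fn(x, c, s).backward()
            adam.step()

        lbfgs = torch.optim.LBFGS([x], lr=lbfgs_lr, max_iter=lbfgs_iters)

        def closure():                             # stage 2: second-order refinement
            lbfgs.zero_grad()
            loss = loss_fn(x, c, s)
            loss.backward()
            return loss

        lbfgs.step(closure)
        result = x.detach()
    return result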
Enable this to match color distribution:
- 'content': Output colors match the content image
- 'style': Output colors match the style image
- Useful when colors look washed out or oversaturated
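A self-contained sketch of channel-wise histogram matching on NumPy image arrays; the project's own post-processing step may instead rely on a library routine such as skimage.exposure.match_histograms.
import numpy as np

def match_histograms_channelwise(image: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Rank-based histogram matching per color channel for (H, W, C) float arrays."""
    matched = np.empty_like(image)
    for ch in range(image.shape[-1]):
        src = image[..., ch].ravel()
        ref_sorted = np.sort(reference[..., ch].ravel())
        # Map each source pixel's quantile onto the reference value at that quantile.
        order = np.argsort(src)
        src_quantiles = np.linspace(0.0, 1.0, src.size)
        ref_quantiles = np.linspace(0.0, 1.0, ref_sorted.size)
        out = np.empty_like(src)
        out[order] = np.interp(src_quantiles, ref_quantiles, ref_sorted)
        matched[..., ch] = out.reshape(image.shape[:-1])
    return matched

# e.g. matched = match_histograms_channelwise(stylized_np, content_np)
#      when HISTOGRAM_MATCHING_TARGET == 'content'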
Generated images are saved to output/ with filenames encoding the parameters used:
{ALPHA}_{BETA}_{ADAM_ITERS}_{ADAM_LR}_{LBFGS_ITERS}_{LBFGS_LR}_{HISTOGRAM_MATCHING}_{TARGET}.png
Example: 1_1000000.0_200_0.02_300_1.0_False_content.png
- Start with defaults: The default parameters work well for most images
- Adjust BETA first: This has the most visible impact on style intensity
- Match image sizes: Similar dimensions for content and style often work better
- Experiment with layers: Different style layer combinations produce varied effects
- GPU recommended: Style transfer is computationally intensive; CUDA dramatically speeds up processing
- PyTorch with CUDA 12.8
All dependencies are managed by uv and specified in pyproject.toml.
- Image Pyramid: Process images at multiple scales (coarse to fine)
- Adam Optimization: Fast initial convergence at each pyramid level
- L-BFGS Refinement: High-quality final polish using second-order optimization
- Optional Histogram Matching: Color distribution alignment post-processing
The implementation uses VGG19 feature extraction with Gram matrices for style representation and L1 loss for both content and style matching.
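A minimal sketch of that style representation and the L1 losses (helper names are illustrative, not necessarily those in src/utils.py):
import torch
import torch.nn.functional as F

def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    """Channel correlation matrix of a (B, C, H, W) feature map, normalized by its size."""
    b, c, h, w = features.shape
    flat = features.reshape(b, c, h * w)
    return flat @ flat.transpose(1, 2) / (c * h * w)

def content_loss(gen_feat: torch.Tensor, content_feat: torch.Tensor) -> torch.Tensor:
    # L1 distance between generated and content activations at CONTENT_LAYER
    return F.l1_loss(gen_feat, content_feat)

def style_loss(gen_feats: dict, style_feats: dict, layer_weights: dict) -> torch.Tensor:
    """Weighted sum of L1 distances between Gram matrices at each style layer."""
    return sum(w * F.l1_loss(gram_matrix(gen_feats[name]), gram_matrix(style_feats[name]))
               for name, w in layer_weights.items())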