Neural Style Transfer (NST)

A PyTorch implementation of neural style transfer using VGG19, featuring multi-scale pyramid optimization and L-BFGS refinement for high-quality artistic image generation.

Features

Multi-scale Pyramid Optimization: Coarse-to-fine approach for better style transfer quality
Two-stage Optimization: Adam optimizer for initial convergence, L-BFGS for final refinement
Flexible Style Control: Configurable style layers and weights
Optional Histogram Matching: Color distribution alignment with content or style image
GPU Acceleration: CUDA support for faster processing
Visualization Tools: Built-in utilities for comparing content, style, and generated images

Installation

This project uses uv for dependency management. If you don't have uv installed:

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

Then clone and set up the project:

git clone https://github.com/aidendorian/NeuralStlyeTransfer.git
cd NeuralStlyeTransfer
uv sync

Project Structure

NeuralStlyeTransfer/
├── data/
│   ├── content.jpg          # Your content images
│   └── style.jpg            # Your style reference images
├── output/                  # Generated stylized images
├── src/
│   ├── __init__.py
│   ├── image_nst.py         # Main script
│   ├── utils.py             # Core NST functions
│   └── visualization.py     # Visualization utilities
├── pyproject.toml           # Project dependencies
├── README.md
└── uv.lock

Usage

Basic Usage

Place your content and style images in the data/ directory, then run:

uv run src/image_nst.py

Configuration

Edit the parameters in src/image_nst.py:

# Style Transfer Parameters
STYLE_LAYERS = ['relu1_1', 'relu3_3', 'relu4_3']  # VGG19 layers for style
STYLE_WEIGHTS = [2.0, 1.5, 1.0]                   # Weight for each style layer
CONTENT_LAYER = 'relu4_2'                         # VGG19 layer for content

# Loss Weights
ALPHA = 1          # Content loss weight
BETA = 1e6         # Style loss weight (higher = more stylized)

# Optimization Parameters
ADAM_ITERS = 200        # Adam optimizer iterations
ADAM_LR = 0.02          # Adam learning rate
LBFGS_ITERS = 300       # L-BFGS iterations for refinement
LBFGS_LR = 1.0          # L-BFGS learning rate
PYRAMID_LEVELS = 3      # Number of pyramid scales

# Image Settings
MAX_SIDE = 1280         # Maximum image dimension (pixels)
CONTENT_PATH = 'data/content.jpg'
STYLE_PATH = 'data/style.jpg'
DEVICE = 'cuda'         # Use 'cpu' if no GPU available

# Optional: Histogram Matching
APPLY_HISTOGRAM_MATCHING = False      # Enable color distribution matching
HISTOGRAM_MATCHING_TARGET = 'content' # 'content' or 'style'

Parameters Guide

Content vs Style Balance

ALPHA (Content Weight): Controls how much the output resembles the original content
- Higher values (e.g., 10): Preserves more content structure
- Lower values (e.g., 0.1): Allows more stylistic freedom
BETA (Style Weight): Controls style transfer intensity
- Common range: 1e5 to 1e7
- Higher values: Stronger style application
- Lower values: Subtler style effects

Style Layers

The model uses VGG19 ReLU layers. Common configurations:

Shallow layers (relu1_1, relu2_1): Capture fine textures and patterns
Deep layers (relu4_1, relu5_1): Capture abstract style features
Balanced (default): Mix of shallow and deep for comprehensive style transfer

Optimization

PYRAMID_LEVELS:
- More levels (3-4): Better quality, longer processing
- Fewer levels (1-2): Faster results, may lack detail
ADAM_ITERS & LBFGS_ITERS:
- Increase for higher quality (at cost of time)
- Default values work well for most images

Histogram Matching

Enable this to match color distribution:

'content': Output colors match content image
'style': Output colors match style image
Useful when colors look washed out or oversaturated

Output

Generated images are saved to output/ with filenames encoding the parameters used:

{ALPHA}_{BETA}_{ADAM_ITERS}_{ADAM_LR}_{LBFGS_ITERS}_{LBFGS_LR}_{HISTOGRAM_MATCHING}_{TARGET}.png

Example: 1_1000000.0_200_0.02_300_1.0_False_content.png

Tips for Best Results

Start with defaults: The default parameters work well for most images
Adjust BETA first: This has the most visible impact on style intensity
Match image sizes: Similar dimensions for content and style often work better
Experiment with layers: Different style layer combinations produce varied effects
GPU recommended: Style transfer is computationally intensive; CUDA dramatically speeds up processing

Requirements

PyTorch with CUDA 12.8

All dependencies are managed by uv and specified in pyproject.toml.

Algorithm Overview

Image Pyramid: Process images at multiple scales (coarse to fine)
Adam Optimization: Fast initial convergence at each pyramid level
L-BFGS Refinement: High-quality final polish using second-order optimization
Optional Histogram Matching: Color distribution alignment post-processing

The implementation uses VGG19 feature extraction with Gram matrices for style representation and L1 loss for both content and style matching.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Neural Style Transfer (NST)

Features

Installation

Project Structure

Usage

Basic Usage

Configuration

Parameters Guide

Content vs Style Balance

Style Layers

Optimization

Histogram Matching

Output

Tips for Best Results

Requirements

Algorithm Overview

About

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
data		data
output		output
src		src
.gitignore		.gitignore
.python-version		.python-version
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

aidendorian/NeuralStlyeTransfer

Folders and files

Latest commit

History

Repository files navigation

Neural Style Transfer (NST)

Features

Installation

Project Structure

Usage

Basic Usage

Configuration

Parameters Guide

Content vs Style Balance

Style Layers

Optimization

Histogram Matching

Output

Tips for Best Results

Requirements

Algorithm Overview

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Languages