This repository hosts the source code for training and sampling from the models presented in the paper "A Wavelet Diffusion Framework for Accelerated Generative Modeling with Lightweight Denoisers".

A Wavelet Diffusion Framework for Accelerated Generative Modeling with Lightweight Denoisers

Python 3.11+ · PyTorch 2.0 · FAIEMA 2025 · Models on Hugging Face · CC BY-NC-SA 4.0

This paper has been accepted at FAIEMA 2025.

Abstract

Denoising diffusion models have emerged as a powerful class of deep generative models, yet they remain computationally demanding due to their iterative nature and high-dimensional input space. In this work, we propose a novel framework that integrates wavelet decomposition into diffusion-based generative models to reduce spatial redundancy and improve training and sampling efficiency. By operating in the wavelet domain, our approach enables a compact multiresolution representation of images, facilitating faster convergence and more efficient inference with minimal architectural modifications. We assess this framework using UNets and UKANs as denoising backbones across multiple diffusion models and benchmark datasets. Our experiments show that a 1-level wavelet decomposition achieves a speedup of up to three times in training, with competitive Fréchet Inception Distance (FID) scores. We further demonstrate that KAN-based architectures offer lightweight alternatives to convolutional backbones, enabling parameter-efficient generation. In-depth analysis of sampling dynamics, including the impact of implicit configurations and wavelet depth, reveals trade-offs between speed, quality, and resolution-specific sensitivity. These findings offer practical insights into the design of efficient generative models and highlight the potential of frequency-domain learning for future generative modeling research.

Architecture Overview

Figure 1: Overview of the Wavelet Diffusion Model (WDDM) architecture. The model operates in the wavelet domain, leveraging wavelet decomposition to reduce spatial redundancy and improve training efficiency. The denoising backbone can be a UNet or a KAN-based architecture, allowing for flexible and efficient generative modeling.
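
To make the core idea concrete, here is a minimal sketch of a 1-level 2D discrete wavelet transform using PyWavelets (an illustration only, not the repository's own transform code): each channel of a C×H×W image is split into four half-resolution subbands, so a 3×32×32 image becomes a 12×16×16 stack.

import numpy as np
import pywt

# Toy 3-channel 32x32 "image"
image = np.random.rand(3, 32, 32)

# 1-level 2D DWT per channel with the Haar wavelet:
# each channel yields four subbands (LL, LH, HL, HH) at half resolution
subbands = []
for channel in image:
    ll, (lh, hl, hh) = pywt.dwt2(channel, "haar")
    subbands.extend([ll, lh, hl, hh])

coeffs = np.stack(subbands)
print(coeffs.shape)  # (12, 16, 16): 4x the channels, 1/4 the spatial area

The diffusion model then denoises in this coefficient space, where the spatial dimensions are smaller and most of the signal energy of natural images is concentrated in the low-frequency (LL) band.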

Samples

Figure 2: Uncurated samples from the unet-cifar10-lvl1 model.

Figure 3: Uncurated samples from the ukan-stl10-lvl1 model.

🚀 Key Features

  • Efficient Training: Up to 3x faster training compared to standard diffusion models
  • Wavelet-Based Compression: Operates in the wavelet domain to reduce spatial redundancy
  • Multiple Architectures: Supports multiple denoising backbones, such as UNet and U-KAN
  • Flexible Framework: Compatible with DDPM, DDIM, and other standard diffusion solvers (see the sketch after this list)
  • Multi-Dataset Support: Evaluated on CIFAR-10, CelebA-HQ, and STL-10
  • Parameter Efficiency: Significant reduction in model parameters while maintaining quality
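
The solver flexibility can be illustrated with a generic diffusers snippet (standard library APIs, not this repository's wrapper): swapping DDPM for DDIM keeps the trained model fixed and only changes the scheduler driving the reverse process.

from diffusers import DDIMScheduler, DDPMScheduler

# Both schedulers share the same diffusion formulation; DDIM takes a
# deterministic shortcut through the trajectory of the same trained model.
ddpm = DDPMScheduler(num_train_timesteps=1000)
ddim = DDIMScheduler(num_train_timesteps=1000)

ddpm.set_timesteps(1000)  # full ancestral sampling
ddim.set_timesteps(50)    # 20x fewer denoising steps, same weights
print(len(ddpm.timesteps), len(ddim.timesteps))  # 1000 50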

🔧 Installation

# Clone the repository
git clone https://github.com/markos-aivazoglou/wavelet-diffusion.git
cd wavelet-diffusion

# Create a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

# Install the required packages
pip install -r requirements.txt

# or for CPU-only installations
pip install -r requirements-cpu.txt  
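
After installing, a quick sanity check (plain PyTorch, not a repository script) confirms the expected framework version and whether a GPU build was picked up:

import torch

print(torch.__version__)          # expect 2.0 or newer
print(torch.cuda.is_available())  # True on a working GPU installation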

📊 Datasets

The framework supports three main datasets:

  1. CIFAR-10: 32×32 natural images (60,000 samples)
  2. CelebA-HQ: 256×256 facial images (30,000 samples)
  3. STL-10: 64×64 natural images (100,000 samples)

CIFAR-10 and STL-10 are downloaded automatically on first use. For CelebA-HQ, you need to download the dataset manually and place it in the data/celeba-hq directory; the dataset can be downloaded from the CelebA-HQ page.
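
For the auto-downloaded datasets, first use is roughly equivalent to the following torchvision calls (an illustration under the assumption that torchvision backs the loaders; the training script manages its own transforms and splits):

from torchvision import datasets

# Downloads on first use into ./data, mirroring what training triggers
datasets.CIFAR10(root="./data", train=True, download=True)
datasets.STL10(root="./data", split="unlabeled", download=True)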

🏃 Quick Start

Training a Model

To train a WDDM model, you can either run a plain Python command:

python main.py \
    --model-type UNET \
    --dataset CIFAR10 \
    --num-epochs 700 \
    --train-batch-size 256 \
    --learning-rate 1e-4 \
    --wavelet-level 1 \
    --output-dir ./wddm-cifar10-unet-lvl1

or with accelerate for distributed training:

accelerate launch --config_file config/single-gpu-config.yaml main.py \
    --model-type UNET \
    --dataset CIFAR10 \
    --num-epochs 700 \
    --train-batch-size 256 \
    --learning-rate 1e-4 \
    --wavelet-level 1 \
    --output-dir ./wddm-cifar10-unet-lvl1

Sampling with Our Pretrained Models

# Generate samples using DDPM
python wavelet_sampling.py \
    --model-dir markos-aivazoglou/wddm-ukan-cifar10-lvl1 \
    --output-dir ./generated-images \
    --model-type UKAN \
    --num-samples 2 \
    --scheduler ddpm \
    --sampling-steps 1000 \
    --prediction-type epsilon

# Generate samples using DDIM (faster)
python wavelet_sampling.py \
    --model-dir markos-aivazoglou/wddm-ukan-cifar10-lvl1 \
    --output-dir ./generated-images \
    --model-type UKAN \
    --num-samples 2 \
    --scheduler ddim \
    --sampling-steps 50 \
    --prediction-type epsilon

Using Pretrained Models

You can use the pretrained models available on the Hugging Face Hub or from a local directory, as long as the checkpoints were saved with Hugging Face's model.save_pretrained("my/local/path"). For example, to load a pretrained UKAN model for CIFAR-10 and sample with 1000 DDPM steps:

from models.UKAN import UKANHybrid
from diffusion.ddpm import WaveletDiffusion

# Load a pretrained model either from Huggingface Hub or local directory
model = UKANHybrid.from_pretrained("markos-aivazoglou/wddm-ukan-cifar10-lvl1")
diffusion = WaveletDiffusion(
    model=model,
    wavelet_level=1,
    prediction_type="epsilon",
    sampling_mode="ddpm",
    sampling_steps=1000
)
samples = diffusion.sample(batch_size=1)
# do something with the samples

or to load a UNet model for CelebA-HQ and sample with 15 DDIM sampling steps:

from diffusers import UNet2DModel
from diffusion.ddpm import WaveletDiffusion

# Load a pretrained model either from Huggingface Hub or local directory
model = UNet2DModel.from_pretrained("markos-aivazoglou/wddm-unet-celeba-hq-lvl1")
diffusion = WaveletDiffusion(
    model=model,
    wavelet_level=1,
    prediction_type="sample",
    sampling_mode="ddim",
    sampling_steps=15
)
samples = diffusion.sample(batch_size=1)
# do something with the samples
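
The sample call is assumed to return a batch of image tensors; one way to inspect the output is torchvision's grid utility (a hedged sketch: check the value range WaveletDiffusion.sample actually returns before rescaling).

from torchvision.utils import save_image

# Assumes `samples` is a (B, C, H, W) float tensor; if values lie in
# [-1, 1], rescale to [0, 1] before writing to disk.
samples = (samples.clamp(-1, 1) + 1) / 2
save_image(samples, "samples.png", nrow=4)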

📝 Configuration

The framework runs on Hugging Face Accelerate for distributed training and inference. Training configurations are stored in the config/ directory:

  • single-gpu-config.yaml: Single GPU setup
  • multi-gpu-config.yaml: Multi-GPU distributed training
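
If neither configuration matches your hardware, the standard Accelerate CLI (independent of this repository) can generate a custom one interactively via accelerate config.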

📄 License

This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) license; see the LICENSE file for details.

📚 Citation

TBA

👥 Authors

  • Markos Aivazoglou-Vounatsos - Pioneer Centre for AI, University of Copenhagen
  • Mostafa Mehdipour Ghazi - Pioneer Centre for AI, University of Copenhagen

📞 Contact

For questions, feel free to contact the authors at mav@di.ku.dk or ghazi@di.ku.dk.
