This repository contains code for the paper "Learning normalized image densities via dual score matching".
Our code is designed to run on a GPU with PyTorch. To run our experiments, you will need the following packages: `numpy`, `scipy`, `torch`, `torchvision`, `einops`, and `tqdm`. To reproduce the sparsity figure, you will additionally need `pywt`, but it is not required otherwise.
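These can be installed with pip, for example (a minimal sketch; note that `pywt` is distributed on PyPI under the name `PyWavelets`):

```bash
pip install numpy scipy torch torchvision einops tqdm
pip install PyWavelets  # optional, only needed for the sparsity figure
```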
The ImageNet64 dataset must be downloaded from the official ImageNet website. By default, the code expects the dataset to be in `data/imagenet64`, but this can be changed (see line 104 in `data.py`).
All figures in the paper can be reproduced by running the corresponding Jupyter notebook. We provide pre-trained models on ImageNet64 (both color and grayscale) in the `models` folder, as well as pre-computed log probabilities, dimensionalities, etc. for the ImageNet64 validation set in the `outputs` folder. These files can be regenerated by uncommenting the corresponding lines in the notebooks, and we provide the training commands to retrain the models (see below). We do not release training checkpoints for the generalization experiment due to space constraints, but they are available upon request.
Available notebooks, in the order of the figures in the paper:

- `simple_example.ipynb`: compare single and dual score matching on a scale mixture of two high-dimensional Gaussians (Figure 1).
- `denoiser_vs_energy.ipynb`: compare the denoising performance of score and energy models (Table 1).
- `cross_entropy.ipynb`: calculate cross-entropy (negative log likelihood) on both train and validation sets (Table 2).
- `generalization.ipynb`: compare log probabilities calculated by models trained on non-overlapping subsets (Figure 2).
- `energy_histogram.ipynb`: calculate the distribution of log probabilities and show representative images (Figure 3).
- `affine_and_sparsity.ipynb`: explore the influence of intensity range and sparsity on log probabilities (Figure 4).
- `dimensionality.ipynb`: log probabilities and effective dimensionalities as a function of noise level (Figure 5).
The file `run_exps.py` contains the commands used to train all models used in the paper. Running `python run_exps.py --print` will print the commands and exit, while `python run_exps.py` will execute them sequentially. To select a subset of these commands, simply comment out the unneeded lines in the file.
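For example:

```bash
# Print the training commands without executing them
python run_exps.py --print

# Run all training commands sequentially
python run_exps.py
```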
All models were trained on a single H100 GPU for 1M steps (which takes about 4 days on ImageNet64). Models will be saved in the `models` folder (note that these will overwrite existing models unless the experiment name is changed).
Trained models can be loaded using the `load_exp` function (e.g., as in `energy_histogram.ipynb`). It returns a `TrainingContext` object `ctx`; the model can then be accessed via `ctx.model`.
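For instance, a minimal loading sketch (the experiment name below is hypothetical, and `load_exp` is assumed to take an experiment name; see `energy_histogram.ipynb` for the actual call):

```python
# "imagenet64_color" is a hypothetical experiment name; match the import
# of load_exp to this repository (see energy_histogram.ipynb for usage).
ctx = load_exp("imagenet64_color")
model = ctx.model  # the trained energy model
```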
Log probabilities can be computed with

```python
input: ModelInput = model_input(x, noise_level=t)  # x is (B..., C, H, W) with values in [0,1]
output: ModelOutput = ctx.model(input, compute_scores=False, create_graph=False)
energy: torch.Tensor = output.energy  # shape (B...,), NLL in nats
```
where `x` is a tensor of one or several clean images (whose last three dimensions are of shape `(C, H, W)`, with `H=W=64`, and `C=1` for grayscale or `C=3` for color images, and pixel intensity values in [0,1]), and `t` is the per-pixel noise variance to add to the images (typically 0). The returned energy (negative log likelihood) values are in nats; they can be converted to log probabilities in dBs/dimension with `LogTensor(energy, d).to(base="dBs", sign="logp", per_dimension=True)`.
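Putting this together, a minimal end-to-end sketch (the random input and the definition of `d` as the image dimensionality are assumptions for illustration; `model_input`, `ctx.model`, and `LogTensor` are the repository functions described above):

```python
import torch

# Random stand-in for a batch of 8 color images with values in [0, 1].
x = torch.rand(8, 3, 64, 64)

inp = ModelInput = model_input(x, noise_level=0.0)  # t = 0: evaluate clean images
out = ctx.model(inp, compute_scores=False, create_graph=False)
energy = out.energy  # shape (8,), negative log likelihood in nats

# Convert to log probabilities in dBs/dimension;
# d is assumed here to be the image dimensionality C * H * W.
d = x.shape[-3] * x.shape[-2] * x.shape[-1]
logp = LogTensor(energy, d).to(base="dBs", sign="logp", per_dimension=True)
```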
Both space (data) and time (noise) scores can be computed by passing `compute_scores=True` to the model, and accessing the `data_score` and `noise_score` attributes of the returned `ModelOutput` object.
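For example, continuing the sketch above (`create_graph=False` is assumed to suffice when the scores are not differentiated further):

```python
out = ctx.model(inp, compute_scores=True, create_graph=False)
data_score = out.data_score    # space (data) score
noise_score = out.noise_score  # time (noise) score
```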
If you use this code in your work, please cite our paper:
```bibtex
@article{guth2025learning,
  title={Learning normalized image densities via dual score matching},
  author={Guth, Florentin and Kadkhodaie, Zahra and Simoncelli, Eero P},
  journal={arXiv preprint arXiv:2506.05310},
  year={2025}
}
```