This repository provides the official PyTorch implementation of the paper:
Conditional Latent Coding with Learnable Synthesized Reference for Deep Image Compression, AAAI25 (oral)
Siqi Wu†, Yinda Chen†, Dong Liu, Zhihai He*
† Equal contribution
- 2025.02.14 🤗 All pre-trained checkpoints released on HuggingFace Hub!
- 2025.01.18 🏆 CLC selected as AAAI 2025 Oral Presentation (Top 3% of submissions)!
- 2024.12.11 🎉 Paper accepted by AAAI 2025!
- 2024.11.18 💻 Core codebase officially open-sourced!
Conditional Latent Coding (CLC) is a deep image compression framework that leverages conditional coding with learnable synthesized references to achieve efficient compression. This method is built upon CompressAI and the TCM framework.
In this repository, we provide the code for:
- The CLC model (`CLC_run.py`)
- Data loader that supports reference clustering (`dataloader_ref_cluster.py`)
- Training script for CLC (`train_CLC.py`)
- Python 3.6+
- PyTorch 1.7+
- torchvision
- numpy
- scikit-learn
- h5py
- matplotlib
- tqdm
- pillow
- tensorboard
- compressai
- einops
- timm
- Clone the repository:

  ```bash
  git clone https://github.com/ydchen0806/CLC.git
  cd CLC
  ```

- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```

  Alternatively, you can set up a conda environment:

  ```bash
  conda create -n clc_env python=3.8
  conda activate clc_env
  # Install PyTorch (modify according to your CUDA version)
  conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
  # Install other dependencies
  pip install numpy scikit-learn h5py matplotlib tqdm pillow tensorboard compressai einops timm
  ```

- Docker: A Docker image is available for this project. You can pull it using:

  ```bash
  docker pull registry.cn-hangzhou.aliyuncs.com/dockerhub1913/mamba0224_ydchen:latest
  ```

The CLC model requires datasets for training and reference images.
Prepare your main dataset in HDF5 format. The dataset should contain images stored in HDF5 datasets.
Example structure:
/path/to/your_dataset.h5
├── image_00001
├── image_00002
├── ...
Prepare a reference dataset, either in HDF5 format or as a directory of images. The reference images will be used for conditional coding.
Example structure for HDF5:
/path/to/reference_dataset.h5
├── ref_image_0001
├── ref_image_0002
├── ...
Or for a directory:
/path/to/reference_images/
├── ref_image_0001.jpg
├── ref_image_0002.png
├── ...
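If your images are stored as individual files, they can be packed into the HDF5 layout shown above with a few lines of h5py. Below is a minimal sketch; the key naming (e.g. `image_00001`) and per-image storage format are assumptions inferred from the structure above, so check `dataloader_ref_cluster.py` for the exact layout it expects.

```python
import glob
import os

import h5py
import numpy as np
from PIL import Image

def pack_images_to_h5(image_dir, h5_path, key_prefix="image"):
    """Pack a directory of images into a single HDF5 file, one dataset per image."""
    image_paths = sorted(glob.glob(os.path.join(image_dir, "*")))
    with h5py.File(h5_path, "w") as f:
        for idx, path in enumerate(image_paths, start=1):
            img = np.array(Image.open(path).convert("RGB"), dtype=np.uint8)
            # One HDF5 dataset per image, e.g. "image_00001" (key format is an assumption).
            f.create_dataset(f"{key_prefix}_{idx:05d}", data=img, compression="gzip")

# Example usage (paths are placeholders):
# pack_images_to_h5("/path/to/train_images", "/path/to/your_dataset.h5", key_prefix="image")
# pack_images_to_h5("/path/to/reference_images", "/path/to/reference_dataset.h5", key_prefix="ref_image")
```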
For efficient reference selection, precompute features for the reference dataset using the provided script dataloader_ref_cluster.py.
Example:
```bash
python dataloader_ref_cluster.py \
    --data_path /path/to/your_dataset.h5 \
    --ref_path /path/to/reference_dataset.h5 \
    --feature_cache_path /path/to/save/feature_cache.pkl \
    --output_base_dir /path/to/save/comparison_results \
    --n_clusters 1000 \
    --n_refs 3 \
    --num_comparisons 10
```

Explanation of the arguments:
- `--data_path`: Path to your main dataset.
- `--ref_path`: Path to your reference dataset.
- `--feature_cache_path`: Path to save the computed features.
- `--output_base_dir`: Directory to save visualization results.
- `--n_clusters`: Number of clusters to create for the reference images.
- `--n_refs`: Number of reference images to use during training.
- `--num_comparisons`: Number of sample comparisons to visualize.
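Conceptually, this step extracts a feature vector for every reference image and groups the references into `--n_clusters` clusters, so that each training input can quickly be matched to its `--n_refs` nearest references. The sketch below illustrates that general recipe with a pretrained ResNet-18 backbone and scikit-learn's KMeans; the actual feature extractor, cache format, and matching logic are defined in `dataloader_ref_cluster.py` and may differ.

```python
import pickle

import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.cluster import KMeans

# Feature extractor: a pretrained ResNet-18 without its classification head (an illustrative choice).
backbone = models.resnet18(pretrained=True)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([T.ToPILImage(), T.Resize((224, 224)), T.ToTensor()])

@torch.no_grad()
def extract_feature(image_uint8):
    """Map an HxWx3 uint8 image to a 512-d feature vector."""
    x = preprocess(image_uint8).unsqueeze(0)
    return backbone(x).squeeze(0).numpy()

def cluster_references(ref_images, n_clusters=1000, cache_path="feature_cache.pkl"):
    """Extract features for all reference images, cluster them, and cache the result."""
    feats = np.stack([extract_feature(img) for img in ref_images])
    kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(feats)
    with open(cache_path, "wb") as f:
        pickle.dump({"features": feats, "labels": kmeans.labels_,
                     "centers": kmeans.cluster_centers_}, f)
    return kmeans
```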
To train the CLC model, use the train_CLC.py script.
Example:
```bash
python train_CLC.py \
    -d /path/to/your_dataset.h5 \
    --ref_path /path/to/reference_dataset.h5 \
    --feature_cache_path /path/to/feature_cache.pkl \
    --save_path /path/to/save/checkpoints/ \
    --lambda 0.01 \
    --epochs 50 \
    --batch-size 8 \
    --learning-rate 1e-4 \
    --n_refs 3 \
    --n_clusters 1000 \
    --type mse \
    --patch-size 256 256 \
    --cuda \
    --num-workers 4
```

Explanation of the arguments:
- `-d`, `--dataset`: Path to the main dataset.
- `--ref_path`: Path to the reference dataset.
- `--feature_cache_path`: Path to the precomputed feature cache.
- `--save_path`: Directory to save model checkpoints and logs.
- `--lambda`: Rate-distortion tradeoff parameter.
- `--epochs`: Number of training epochs.
- `--batch-size`: Batch size for training.
- `--learning-rate`: Learning rate.
- `--n_refs`: Number of reference images to use.
- `--n_clusters`: Number of clusters for reference images.
- `--type`: Loss type (`mse` or `ms-ssim`).
- `--patch-size`: Size of image patches for training.
- `--cuda`: Use CUDA for training.
- `--num-workers`: Number of data loading workers.
Notes:
- The code supports both MSE and MS-SSIM loss functions.
- The `--save_path` directory will contain checkpoints and TensorBoard logs.
- Adjust `--lambda` to trade off between bit-rate and distortion (see the loss sketch below).
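For reference, the rate-distortion objective commonly used in CompressAI-based training is `loss = λ · 255² · MSE + bpp`; the sketch below shows that standard MSE form so it is clear what `--lambda` controls. The exact weighting in `train_CLC.py` (and its MS-SSIM variant) may differ, so treat this only as a guide.

```python
import math

import torch
import torch.nn as nn

class RateDistortionLoss(nn.Module):
    """CompressAI-style rate-distortion loss: lambda * 255^2 * MSE + bpp."""

    def __init__(self, lmbda=0.01):
        super().__init__()
        self.mse = nn.MSELoss()
        self.lmbda = lmbda

    def forward(self, output, target):
        N, _, H, W = target.size()
        num_pixels = N * H * W
        # Bit-rate estimate: sum of -log2(likelihoods) over all latents, per pixel.
        bpp = sum(
            torch.log(likelihoods).sum() / (-math.log(2) * num_pixels)
            for likelihoods in output["likelihoods"].values()
        )
        mse = self.mse(output["x_hat"], target)
        return self.lmbda * 255 ** 2 * mse + bpp
```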
To evaluate the trained model, modify the train_CLC.py script or create a new evaluation script. An evaluation script will be provided in future updates.
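Until then, a minimal set of evaluation helpers in the CompressAI style is sketched below. It assumes the model's `forward()` returns an `x_hat` reconstruction and a `likelihoods` dict (the CompressAI convention); the call signature for CLC, which additionally conditions on reference images, is an assumption and should be checked against `CLC_run.py`.

```python
import math

import torch
import torch.nn.functional as F

def compute_psnr(x_hat, x, max_val=1.0):
    """PSNR between a reconstruction and the original, both in [0, 1]."""
    mse = F.mse_loss(x_hat, x).item()
    return 10 * math.log10(max_val ** 2 / mse)

def compute_bpp(output, x):
    """Estimated bits per pixel from the likelihoods returned by a CompressAI-style model."""
    num_pixels = x.size(0) * x.size(2) * x.size(3)
    return sum(
        (torch.log(likelihoods).sum() / (-math.log(2) * num_pixels)).item()
        for likelihoods in output["likelihoods"].values()
    )

# Hypothetical usage, assuming a CompressAI-style forward() (the reference inputs
# and their selection come from the data loader; the call signature is an assumption):
# with torch.no_grad():
#     output = model(x, refs)
#     print(compute_psnr(output["x_hat"].clamp(0, 1), x), compute_bpp(output, x))
```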
You can download the pretrained weights from the HuggingFace Hub!
If you find this code useful in your research, please consider citing:
```bibtex
@article{wu2025conditional,
  title={Conditional Latent Coding with Learnable Synthesized Reference for Deep Image Compression},
  author={Wu, Siqi and Chen, Yinda and Liu, Dong and He, Zhihai},
  journal={arXiv preprint arXiv:2502.09971},
  year={2025}
}
```

This code is built upon CompressAI and the TCM framework. We thank the authors for their contributions to the community.
This project is licensed under the terms of the MIT license.
For questions or comments, please open an issue on this repository or contact us at cyd0806@mail.ustc.edu.cn.

