Hao Wei, Yanhui Zhou, Yiwen Jia, Chenyang Ge, Saeed Anwar, Ajmal Mian.
- 2025-11-06: We have uploaded the file pyiqa_evaluation_metric.py to help with the evaluation.
- 2025-11-03: The paper has been published online; please refer to the journal version.
- 2025-10-29: The paper has been accepted by Neural Networks.
- 2025-10-09: This repo is released.
Abstract: Perceptual image compression has shown strong potential for producing visually appealing results at low bitrates, surpassing classical standards and pixel-wise distortion-oriented neural methods. However, existing methods typically improve compression performance by incorporating explicit semantic priors, such as segmentation maps and textual features, into the encoder or decoder, which increases model complexity by adding parameters and floating-point operations. This limits the model's practicality, as image compression often occurs on resource-limited mobile devices. To alleviate this problem, we propose a lightweight perceptual Image Compression method using Implicit Semantic Priors (ICISP). We first develop an enhanced visual state space block that exploits local and global spatial dependencies to reduce redundancy. Since different frequency information contributes unequally to compression, we develop a frequency decomposition modulation block to adaptively preserve or reduce the low-frequency and high-frequency information. We establish the above blocks as the main modules of the encoder-decoder, and to further improve the perceptual quality of the reconstructed images, we develop a semantic-informed discriminator that uses implicit semantic priors from a pretrained DINOv2 encoder. Experiments on popular benchmarks show that our method achieves competitive compression performance and has significantly fewer network parameters and floating point operations than the existing state-of-the-art. We will release the code and trained models.
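To make the frequency decomposition idea concrete, below is a minimal, illustrative PyTorch sketch, not the paper's actual block: features are split into low- and high-frequency bands with a radial FFT mask, and each band is re-weighted by a learned per-channel gate.

```python
import torch
import torch.nn as nn

class FreqDecompGate(nn.Module):
    """Toy stand-in for a frequency decomposition modulation block (illustration
    only, not the ICISP module): split features into low/high-frequency bands
    with a radial FFT mask and re-weight each band with a learned gate."""
    def __init__(self, channels: int, cutoff: float = 0.25):
        super().__init__()
        self.cutoff = cutoff  # fraction of the normalized spectrum radius kept as "low"
        self.low_gate = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.high_gate = nn.Parameter(torch.ones(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        # Centered 2D spectrum over the spatial dims.
        spec = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))
        yy = torch.linspace(-1.0, 1.0, h, device=x.device).view(h, 1)
        xx = torch.linspace(-1.0, 1.0, w, device=x.device).view(1, w)
        mask = ((yy ** 2 + xx ** 2).sqrt() <= self.cutoff).float()  # low-pass mask
        low = torch.fft.ifft2(torch.fft.ifftshift(spec * mask, dim=(-2, -1)), norm="ortho").real
        high = x - low  # the residual carries the high-frequency content
        return self.low_gate * low + self.high_gate * high

block = FreqDecompGate(channels=64)
print(block(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```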
Architecture: We propose a lightweight model for image compression based on implicit semantic priors without adding extra parameters to the encoder or the decoder.
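As a hedged sketch of how implicit semantic priors might be extracted from a pretrained DINOv2 encoder, e.g. to condition the discriminator: the model variant and feature keys below are assumptions for illustration; the actual discriminator details live in the code.

```python
import torch

# Illustration only: pull patch-level DINOv2 features as implicit semantic
# priors. The ViT-S/14 variant is an assumption, not necessarily the one
# used in the paper.
dinov2 = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
dinov2.eval()

img = torch.randn(1, 3, 224, 224)  # placeholder input; H and W must be multiples of 14
with torch.no_grad():
    feats = dinov2.forward_features(img)["x_norm_patchtokens"]
print(feats.shape)  # (1, 256, 384) for ViT-S/14 at 224x224
```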
- Release code
We train the ICISP on the LSDIR dataset and evaluate it on the Kodak and CLIC_2020 datasets.
- Create and activate your environment
conda create -n your_env_name python=3.10.13
conda activate your_env_name
- Install torch 2.1.1 (CUDA 11.8)
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
- Install compressai
pip install compressai
- Requirements
pip install tensorboard scipy opencv-python timm numpy
- Install causal_conv1d and mamba
pip install https://github.com/Dao-AILab/causal-conv1d/releases/download/v1.1.3.post1/causal_conv1d-1.1.3.post1+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
pip install https://github.com/state-spaces/mamba/releases/download/v1.1.1/mamba_ssm-1.1.1+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
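A quick sanity check (assuming a CUDA GPU is available) that the CUDA build of torch and the Mamba kernels import and run:

```python
import torch
print(torch.__version__, torch.cuda.is_available())

import causal_conv1d  # noqa: F401  (just verify the wheel imports)
from mamba_ssm import Mamba  # core selective state space module

# Mamba's fast path requires a CUDA device; input layout is (batch, length, dim).
m = Mamba(d_model=64).cuda()
x = torch.randn(1, 16, 64, device="cuda")
print(m(x).shape)  # expected: torch.Size([1, 16, 64])
```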
Pretrained models:
| Rate weight (λ) | Link |
|---|---|
| 1 | model_1.pth |
| 1.5 | model_1.5.pth |
| 2.5 | model_2.5.pth |
| 5 | model_5.pth |
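The checkpoint format below is an assumption for illustration; the exact model class and loading logic are defined in this repo (see eval.py):

```python
import torch

# Illustration only: inspect a downloaded checkpoint before evaluation.
ckpt = torch.load("model_2.5.pth", map_location="cpu")
# Some checkpoints wrap the weights in a dict; fall back to the raw object.
state_dict = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
print(len(state_dict), "tensors in the checkpoint")
```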
⚡ Before training, please set the correct paths and names of the training/validation datasets in train.py (lines 235-242) and train_gan.py (lines 244-251).
Tip: Training for more epochs can further improve the compression performance of the ICISP model!
- Stage 1: Train the distortion-oriented model
python train.py --cuda --N 64 --lambda 0.0067 --epochs 120 --lr_epoch 110 115 --save_path ./lambda_0.0067 --save
- Stage 2: Train the perception-oriented model
python train_gan.py --cuda --N 64 --epochs 50 --lr_epoch 40 45 --lr_epochDis 30 40 --rate_weight 2.5 --save_path ./rw_2.5 --save --checkpoint ./lambda_0.0067/0.0067checkpoint_best.pth.tar
⚡ Before testing, please set the correct paths and names of the testing datasets in eval.py (lines 75-77).
- Test the model
python eval.py --cuda --lmbda 2.5 --checkpoint ./rw_2.5/2.5checkpoint_best.pth.tar
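For perceptual metrics, the repo provides pyiqa_evaluation_metric.py (see the news above). As a rough sketch of what metric computation with the pyiqa package looks like (file names are placeholders):

```python
import torch
import pyiqa

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
lpips = pyiqa.create_metric("lpips", device=device)  # lower is better
dists = pyiqa.create_metric("dists", device=device)  # lower is better
psnr = pyiqa.create_metric("psnr", device=device)    # higher is better

# Full-reference metrics take the distorted image first, the reference second.
score = lpips("recon.png", "original.png")
print(float(score))
```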
If you find this code helpful in your research or work, please cite our paper.
@article{WEI2025108279,
title = {A Lightweight Model for Perceptual Image Compression via Implicit Priors},
author = {Hao Wei and Yanhui Zhou and Yiwen Jia and Chenyang Ge and Saeed Anwar and Ajmal Mian},
journal = {Neural Networks},
pages = {108279},
year = {2025},
issn = {0893-6080},
doi = {10.1016/j.neunet.2025.108279},
url = {https://www.sciencedirect.com/science/article/pii/S0893608025011608},
}
Parts of the code are borrowed from TCM, VMamba, and HiFiC. Thanks for their excellent work.

