1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
*.ckpt filter=lfs diff=lfs merge=lfs -text
65 changes: 65 additions & 0 deletions Finetuned-VQGAN/README.md
@@ -0,0 +1,65 @@
# Extreme Image Compression using Fine-tuned VQGAN Models (DCC 2024)
This repository is the official implementation of VQGAN-Compression.

[Qi Mao](https://sites.google.com/view/qi-mao/), [Tinghan Yang](), [Yinuo Zhang](), [Zijian Wang](), [Meng Wang](https://scholar.google.com/citations?user=6vnhEIgAAAAJ&hl=zh-TW&oi=sra), [Shiqi Wang](), [Libiao Jin](), [Siwei Ma](https://scholar.google.com/citations?user=y3YqlaUAAAAJ&hl=zh-TW&oi=sra)

<p align="center">
<img src="assets/Framework.png" width="1060px"/>
<br>
<em> Figure: Overview of the proposed VQGAN-based image coding framework.</em>
</p>

## Acknowledgement
The framework is based on [VQGAN](https://github.com/CompVis/taming-transformers). We modify `taming.modules.vqgan` and add `train.py` and `reconstruction.py`.

## Introduction
In this work, we propose a simple yet effective coding framework by introducing vector quantization (VQ)-based generative models into the image compression domain.

[[Paper](https://arxiv.org/abs/2307.08265)]

<p align="center">
<img src="assets/sub.jpg" width="1060px"/>
<br>
<em> Figure: Our Results. </em> </p>

## Installation
Our method was tested with CUDA 11.3 on a single A100 GPU. Setup mainly consists of creating the conda environment:
```bash
conda env create -f environment.yaml
conda activate vqgan
```
### Reconstruction
To reconstruct images with a pretrained model, download the checkpoints from [Google Drive](https://drive.google.com/drive/folders/14I_RnQ3cA6etdKGPVMFdmmVgMtBTB5rn?usp=sharing) or [Baidu cloud](https://pan.baidu.com/s/1zBeWKh6vgof13iTBwtA65A?pwd=kfl7) (code: kfl7) and put them in `logs/`.

Evaluation datasets such as [Kodak](http://r0k.us/graphics/kodak/) and [CLIC](http://challenge.compression.cc/tasks/) can be downloaded and placed in `data/`.
```bash
python reconstruction.py --logs_path $model_dir --dataset $dataset_name
```
For example, to evaluate on the Kodak dataset a model fine-tuned from the pretrained [vqgan_imagenet_f16_16384](https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/) to a codebook size of 1024:
```bash
python reconstruction.py --logs_path logs/kmeans_tune/16384_kmeans_1024_epoch/epoch1/ --dataset Kodak/
```
The results are saved in `rec/Kodak/`.
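
As a rough worked example of the bitrates involved: `reconstruction.py` entropy-codes the indices against a uniform distribution (`get_uniform_pmf`), so each index should cost about log2(codebook size) bits. The sketch below assumes that `f16` means 16x spatial downsampling in each dimension and ignores coder and file overhead; `uniform_bpp` is a hypothetical helper, not part of the repository.

```python
import math


def uniform_bpp(width, height, codebook_size, downsample=16):
    """Approximate bits per pixel when every index costs log2(codebook_size) bits.

    Assumes one codebook index per `downsample` x `downsample` pixel block.
    """
    latent_w = math.ceil(width / downsample)
    latent_h = math.ceil(height / downsample)
    num_indices = latent_w * latent_h
    total_bits = num_indices * math.log2(codebook_size)
    return total_bits / (width * height)


# Kodak images are 768x512 pixels; compare the original and reduced codebooks.
for codebook_size in (16384, 1024):
    print(codebook_size, round(uniform_bpp(768, 512, codebook_size), 4))
```

For a 768x512 image this gives 14/256 ≈ 0.0547 bpp with 16384 entries and 10/256 ≈ 0.0391 bpp with 1024 entries. Values measured by the script may differ slightly because of coding overhead.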

### Train
Prepare the dataset following the data-preparation instructions of the original [VQGAN](https://github.com/CompVis/taming-transformers?tab=readme-ov-file#data-preparation) project. Unlike the original training, our fine-tuning freezes the codec and updates only the codebook. Use the following command:
```bash
python train.py --base configs/custom_vqgan.yaml -t True --gpus 0, --is_frozen
```
The fine-tuned model is saved in `logs/`.
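
The idea of freezing the codec and training only the codebook can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the repository's `train.py`: `ToyVQModel`, `freeze_all_but_codebook`, the layer shapes, and the learning rate are all hypothetical, and how `--is_frozen` is wired internally is not shown here.

```python
import torch
from torch import nn


class ToyVQModel(nn.Module):
    """Hypothetical stand-in for a VQGAN: encoder, codebook, decoder."""

    def __init__(self, codebook_size=1024, embed_dim=8):
        super().__init__()
        self.encoder = nn.Conv2d(3, embed_dim, kernel_size=4, stride=4)
        self.codebook = nn.Embedding(codebook_size, embed_dim)
        self.decoder = nn.ConvTranspose2d(embed_dim, 3, kernel_size=4, stride=4)


def freeze_all_but_codebook(model):
    """Disable gradients everywhere, then re-enable them for the codebook only."""
    for param in model.parameters():
        param.requires_grad = False
    for param in model.codebook.parameters():
        param.requires_grad = True
    # Return only the parameters the optimizer should update.
    return [p for p in model.parameters() if p.requires_grad]


model = ToyVQModel()
trainable = freeze_all_but_codebook(model)
optimizer = torch.optim.Adam(trainable, lr=1e-4)
print(sum(p.numel() for p in trainable))  # number of trainable codebook values
```

In this repository the codebook is reached as `model.quantize.embedding` (see `reconstruction.py`), so the same pattern would presumably target that module instead of the toy `codebook` attribute.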

## Citation
```
@inproceedings{mao2024extreme,
title={Extreme image compression using fine-tuned vqgans},
author={Mao, Qi and Yang, Tinghan and Zhang, Yinuo and Wang, Zijian and Wang, Meng and Wang, Shiqi and Jin, Libiao and Ma, Siwei},
booktitle={2024 Data Compression Conference (DCC)},
pages={203--212},
year={2024},
organization={IEEE}
}
```

## Contact
Feel free to contact us with any questions (Qi Mao, qimao@cuc.edu.cn; Tinghan Yang, yangtinghan@cuc.edu.cn).
File renamed without changes
67 changes: 55 additions & 12 deletions reconstruction.py → Finetuned-VQGAN/reconstruction.py
@@ -161,40 +161,55 @@ def compute_bpp_zip(file_path, model, z):
def parse_args():
parser = argparse.ArgumentParser('', add_help=False)
parser.add_argument('--logs_path', default='logs/imagenet_f16_16384/', type=str)
parser.add_argument('--dataset', default='Kodak/', type=str)
parser.add_argument('--dataset', default='/home/CodingG/yth/dataset/Kodak/', type=str)
return parser.parse_args()

if __name__=='__main__':
args = parse_args()
torch.set_grad_enabled(False)

# Load Model
config_path = args.logs_path+'configs/model.yaml'
ckpt_path = args.logs_path+'checkpoints/last.ckpt'
config_path = args.logs_path+'configs/' + os.listdir(args.logs_path+'configs/') [0]
ckpt_path = args.logs_path+'checkpoints/'+ os.listdir(args.logs_path+'checkpoints/')[0]
model = load_model(config_path=config_path, ckpt_path=ckpt_path)

# set path
name = args.dataset.replace('/','')
rec_path = 'rec/' + args.dataset
name = args.dataset.split('/')[-2]
if args.logs_path.find('epoch') != -1:
model_name = args.logs_path.split('/')[-3]
else:
model_name = args.logs_path.split('/')[-2]
rec_path = 'rec/' + name + '/'
if not os.path.exists(rec_path):
os.makedirs(rec_path)
rec_path = rec_path + model_name + '/'
if not os.path.exists(rec_path):
os.makedirs(rec_path)
index_path = 'index/' + args.dataset

index_path = 'index/' + name + '/'
if not os.path.exists(index_path):
os.makedirs(index_path)
tmp_path = 'tmp/' + args.dataset
index_path = index_path + model_name + '/'
if not os.path.exists(index_path ):
os.makedirs(index_path)

tmp_path = 'tmp/' + name + '/'
if not os.path.exists(tmp_path):
os.makedirs(tmp_path)
tmp_path = tmp_path + model_name + "/"
if not os.path.exists(tmp_path):
os.makedirs(tmp_path)

# Load Model
config = load_config(config_path, display=False)


# read image
img_path = 'data/' + args.dataset
img_path = args.dataset
filenames = os.listdir(img_path)
file_list = []
bpp_list = []
psnr_list = []
lpips_list = []
for img in os.listdir(img_path):
file_list.append(img[:-4])
image = PIL.Image.open(img_path + img)
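
The config and checkpoint lookup earlier in this hunk takes the first entry of `os.listdir(...)`, and Python documents that list as being in arbitrary order. A minimal sketch of a deterministic alternative follows, assuming the wanted file can be identified by its suffix; `pick_first_file` and the demo filenames are hypothetical and not part of this PR.

```python
import tempfile
from pathlib import Path


def pick_first_file(directory, suffixes):
    """Return the alphabetically first file in `directory` with one of `suffixes`."""
    candidates = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix in suffixes
    )
    if not candidates:
        raise FileNotFoundError(f"no {sorted(suffixes)} file found in {directory}")
    return candidates[0]


# Demo on a throwaway directory standing in for <logs_path>/configs/.
with tempfile.TemporaryDirectory() as tmp:
    for filename in ("b-project.yaml", "a-model.yaml", "notes.txt"):
        (Path(tmp) / filename).touch()
    chosen = pick_first_file(tmp, {".yaml"})
    print(chosen.name)
```

Sorting makes the choice reproducible across machines, and filtering by suffix avoids picking up stray files that happen to sit in the same directory.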
@@ -215,7 +230,31 @@ def parse_args():
img2 = cv2.imread(save_img)
psnr_list.append(psnr(img1, img2))

# Calculate LPIPS
import lpips
use_gpu = True # Whether to use GPU
spatial = True # Return a spatial map of perceptual distance.
# Linearly calibrated models (LPIPS)
loss_fn = lpips.LPIPS(net='alex', spatial=spatial) # Can also set net = 'squeeze' or 'vgg'
# loss_fn = lpips.LPIPS(net='alex', spatial=spatial, lpips=False) # Can also set net = 'squeeze' or 'vgg'
if (use_gpu):
loss_fn.cuda()
total = 0
LP = []
try:
dummy_im0 = lpips.im2tensor(lpips.load_image(img_path+img))
dummy_im1 = lpips.im2tensor(lpips.load_image(save_img))
if (use_gpu):
dummy_im0 = dummy_im0.cuda()
dummy_im1 = dummy_im1.cuda()
dist = loss_fn.forward(dummy_im0, dummy_im1)
d = dist.mean().item()
lpips_list.append(d)
except:
print(f'the image path: {img_path+img},{save_img} is wrong!')
exit()
# Arithmetic encoding
print('model.quantize.embedding.weight.size():',model.quantize.embedding.weight.size())
idx_cdf_uniform = pmf_to_cdf(get_uniform_pmf(model.quantize.embedding.weight.size(), index))
byte_stream = torchac.encode_float_cdf(cdf_float=idx_cdf_uniform, sym=index.to(dtype=torch.int16).cpu(),
check_input_bounds=True)
@@ -243,22 +282,26 @@ def parse_args():
num_bits = os.path.getsize(save_tmp) * 8
bpp = num_bits / num_pixel
bpp_list.append(bpp)

average_bpp = sum(bpp_list) / len(bpp_list)
average_psnr= sum(psnr_list) / len(psnr_list)
average_lpips = sum(lpips_list) / len(lpips_list)
bpp_list.append(average_bpp)
psnr_list.append(average_psnr)
lpips_list.append(average_lpips)
file_list.append('Average')
data = {
'Image Name': file_list,
'Bits Per Pixel (BPP)': bpp_list,
'PSNR Value': psnr_list
'PSNR Value': psnr_list,
'LPIPS Value': lpips_list,
}

df = pd.DataFrame(data)

# Write the DataFrame to an Excel file
output_file = 'bpp/' + name + '.xlsx'
output_file = f'bpp/{name}_{model_name}_bpp.xlsx'
df.to_excel(output_file, index=False, engine='xlsxwriter')

print(f'Finish Model:{args.logs_path} test! Avg bpp = {average_bpp} psnr = {average_psnr}')
print(f'Finish Model:{args.logs_path} test! Avg bpp = {average_bpp} psnr = {average_psnr} lpips = {average_lpips}')
print(f'Save bpp.csv to {output_file}')
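
The path handling in this file derives `name` and `model_name` with `split('/')[-2]`, which only works when the arguments end in a slash (`'data/Kodak'.split('/')[-2]` yields `'data'`), and it repeats an exists-then-create check for every output folder. A hedged sketch of the same idea with `pathlib` follows. `derive_names` and `make_output_dirs` are hypothetical helpers, and the epoch rule here (last path component starts with `epoch`) is an assumption that is narrower than the `find('epoch')` test in the diff.

```python
import tempfile
from pathlib import Path


def derive_names(dataset, logs_path):
    """Return (dataset_name, model_name), with or without trailing slashes."""
    name = Path(dataset).name
    logs = Path(logs_path)
    # Assumption: a final component starting with "epoch" is a per-epoch
    # subfolder, so the model name is taken from its parent directory.
    model_name = logs.parent.name if logs.name.startswith("epoch") else logs.name
    return name, model_name


def make_output_dirs(root, name, model_name):
    """Create <root>/{rec,index,tmp}/<name>/<model_name>/ and return the paths."""
    dirs = {}
    for kind in ("rec", "index", "tmp"):
        path = Path(root) / kind / name / model_name
        path.mkdir(parents=True, exist_ok=True)  # no prior existence check needed
        dirs[kind] = path
    return dirs


name, model_name = derive_names(
    "data/Kodak", "logs/kmeans_tune/16384_kmeans_1024_epoch/epoch1/"
)
with tempfile.TemporaryDirectory() as root:
    dirs = make_output_dirs(root, name, model_name)
    created = {kind: path.is_dir() for kind, path in dirs.items()}
print(name, model_name, created)
```

`Path.mkdir(parents=True, exist_ok=True)` creates intermediate directories and is a no-op when the folder already exists, so the three nested blocks collapse into one loop.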
File renamed without changes.
19 changes: 0 additions & 19 deletions License.txt

This file was deleted.

81 changes: 23 additions & 58 deletions README.md
@@ -1,65 +1,30 @@
# Extreme Image Compression using Fine-tuned VQGAN Models
This repository is the official implementation of VQGAN-Compression.
# Introduction
Official PyTorch implementations of image compression methods based on the VQGAN model, including:
* Finetuned-VQGAN: [Extreme Image Compression using Fine-tuned VQGAN Models](https://ieeexplore.ieee.org/document/10533792), DCC 2024, in [this folder](./Finetuned-VQGAN)
* UIGC: [Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer](https://ieeexplore.ieee.org/abstract/document/10687549), ICME 2024, in [this folder](./UIGC)

[Qi Mao](https://sites.google.com/view/qi-mao/), [Tinghan Yang](), [Yinuo Zhang](), [Zijian Wang](), [Meng Wang](https://scholar.google.com/citations?user=6vnhEIgAAAAJ&hl=zh-TW&oi=sra), [Shiqi Wang](), [Libiao Jin](), [Siwei Ma](https://scholar.google.com/citations?user=y3YqlaUAAAAJ&hl=zh-TW&oi=sra)
# :heart: Acknowledgement
The implementation is based on [VQGAN](https://github.com/CompVis/taming-transformers).

<p align="center">
<img src="assets/Framework.png" width="1060px"/>
<br>
<em> Figure: Overview of the proposed VQGAN-based image coding framework.</em>
</p>
# :clipboard: Citation
If you find this work useful for your research, please cite:

## Acknowledgement
The framework is based on [VQGAN](https://github.com/CompVis/taming-transformers). We modify taming.modules.vqgan and add train.py and reconstruction.py for usage.

## Introduction
In this work, we propose a simple yet effective coding framework by introducing vector quantization (VQ)-based generative models into the image compression domain.

[[Paper](https://arxiv.org/abs/2108.03690)]

<p align="center">
<img src="assets/sub.jpg" width="1060px"/>
<br>
<em> Figure: Our Results. </em> </p>

## Installation
Our method was tested with CUDA 11.3 on a single A100 GPU. Setup mainly consists of creating the conda environment:
```bash
conda env create -f environment.yaml
conda activate vqgan
```
### Reconstruction
To reconstruct images with a pretrained model, download the checkpoints from [Google Drive](https://drive.google.com/drive/folders/14I_RnQ3cA6etdKGPVMFdmmVgMtBTB5rn?usp=sharing) or [Baidu cloud](https://pan.baidu.com/s/1zBeWKh6vgof13iTBwtA65A?pwd=kfl7) (code: kfl7) and put them in `logs/`.

Evaluation datasets such as [Kodak](http://r0k.us/graphics/kodak/) and [CLIC](http://challenge.compression.cc/tasks/) can be downloaded and placed in `data/`.
```bash
python reconstruction.py --logs_path $model_dir --dataset $dataset_name
```
For example, to evaluate on the Kodak dataset a model fine-tuned from the pretrained [vqgan_imagenet_f16_16384](https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/) to a codebook size of 1024:
```bash
python reconstruction.py --logs_path logs/kmeans_tune/16384_kmeans_1024_epoch/epoch1/ --dataset Kodak/
```
The results are saved in `rec/Kodak/`.

### Train
Prepare the dataset following the data-preparation instructions of the original [VQGAN](https://github.com/CompVis/taming-transformers?tab=readme-ov-file#data-preparation) project. Unlike the original training, our fine-tuning freezes the codec and updates only the codebook. Use the following command:
```bash
python train.py --base configs/custom_vqgan.yaml -t True --gpus 0, --is_frozen
```
The fine-tuned model is saved in `logs/`.

## Citation
```
@inproceedings{wang2023extreme,
title={Extreme Generative Human-Oriented Video Coding via Motion Representation Compression},
author={Wang, Ruofan and Mao, Qi and Jia, Chuanmin and Wang, Ronggang and Ma, Siwei},
booktitle={2023 IEEE International Symposium on Circuits and Systems (ISCAS)},
pages={1--5},
year={2023},
@inproceedings{mao2024extreme,
title={Extreme image compression using fine-tuned vqgans},
author={Mao, Qi and Yang, Tinghan and Zhang, Yinuo and Wang, Zijian and Wang, Meng and Wang, Shiqi and Jin, Libiao and Ma, Siwei},
booktitle={2024 Data Compression Conference (DCC)},
pages={203--212},
year={2024},
organization={IEEE}
}
```

## Contact
Feel free to contact us with any questions (Qi Mao, qimao@cuc.edu.cn; Tinghan Yang, yangtinghan@cuc.edu.cn).
@inproceedings{xue2024unifying,
title={Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer},
author={Xue, Naifu and Mao, Qi and Wang, Zijian and Zhang, Yuan and Ma, Siwei},
booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
pages={1-6},
year={2024},
organization={IEEE}
}
```
1 change: 0 additions & 1 deletion Read.md

This file was deleted.

3 changes: 3 additions & 0 deletions UIGC/.idea/.gitignore


12 changes: 12 additions & 0 deletions UIGC/.idea/UIGC.iml


6 changes: 6 additions & 0 deletions UIGC/.idea/inspectionProfiles/profiles_settings.xml


4 changes: 4 additions & 0 deletions UIGC/.idea/misc.xml


8 changes: 8 additions & 0 deletions UIGC/.idea/modules.xml


6 changes: 6 additions & 0 deletions UIGC/.idea/vcs.xml
