If our open-source code is helpful for your research, please cite our paper:
@inproceedings{lin2023deepsvc,
title={DeepSVC: Deep Scalable Video Coding for Both Machine and Human Vision},
author={Lin, Hongbin and Chen, Bolin and Zhang, Zhichen and Lin, Jielian and Wang, Xu and Zhao, Tiesong},
booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
pages={9205--9214},
year={2023}
}
- See env.txt for the required environment.
- The detection code runs on top of mmtracking. Copy the code in temporal_roi_align.py into selsa.py, then run test.py with the config file selsa_troialign_faster_rcnn_r50_dc5_7e_imagenetvid.py (a command sketch is given after this list). Please see the instructions in the docs.
- Run test_video.py; change the data path in the file first (see the sketch after this list).
- Training also runs on top of mmtracking. Copy the code in temporal_roi_align.py into selsa.py, then run train.py with the config file selsa_troialign_faster_rcnn_r50_dc5_7e_imagenetvid.py (see the training sketch after this list). Please see the instructions in the docs.
- Download the training data. We train the models on the Vimeo90k dataset (Download link); see the dataset layout sketch after this list.
- Run main.py to train the PSNR/MS-SSIM models. We first pretrain the model with the key frame coded with BPG and lambda=2048, then load the pretrained weights and train with the key frame coded with the AI codecs (in image_model.py); see the two-stage sketch after this list. See main.py for more details.
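For reference, a minimal sketch of the detection test step above, assuming a standard mmtracking checkout; the config location, checkpoint path, and the `--checkpoint`/`--eval` flags follow typical mmtracking usage and may differ in your setup:

```python
# Minimal sketch of the detection test step (paths and flags are assumptions
# based on typical mmtracking usage; adjust them to your checkout).
import subprocess

subprocess.run(
    [
        "python", "tools/test.py",
        "configs/vid/temporal_roi_align/selsa_troialign_faster_rcnn_r50_dc5_7e_imagenetvid.py",
        "--checkpoint", "checkpoints/selsa_troialign_vid.pth",  # placeholder checkpoint
        "--eval", "bbox",
    ],
    check=True,  # raise if the test script exits with an error
)
```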
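A minimal sketch of the test_video.py step, assuming the script reads a hard-coded dataset root; the path below is a placeholder and must match the one edited inside the script:

```python
# Minimal sketch: verify the dataset path, then launch test_video.py.
# The path is a placeholder and must match the data path set inside test_video.py.
import os
import subprocess

data_root = "/path/to/test_sequences"  # placeholder test-set location
assert os.path.isdir(data_root), f"data path not found: {data_root}"

subprocess.run(["python", "test_video.py"], check=True)
```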
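The detection training step mirrors the test command; again, the tools/train.py entry point and the work directory are assumptions based on typical mmtracking usage:

```python
# Minimal sketch of the detection training step (paths are assumptions based on
# typical mmtracking usage; adjust them to your checkout).
import subprocess

subprocess.run(
    [
        "python", "tools/train.py",
        "configs/vid/temporal_roi_align/selsa_troialign_faster_rcnn_r50_dc5_7e_imagenetvid.py",
        "--work-dir", "work_dirs/selsa_troialign_vid",  # placeholder output directory
    ],
    check=True,
)
```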
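After downloading Vimeo90k, a quick sanity check of the septuplet layout can save a failed run. This assumes the standard vimeo_septuplet release with a sequences/ folder and sep_trainlist.txt; the root path is a placeholder:

```python
# Minimal sketch: check the Vimeo90k septuplet layout before training.
# Assumes the standard release: <root>/sequences/xxxxx/xxxx/im1.png..im7.png
# plus <root>/sep_trainlist.txt. The root path is a placeholder.
from pathlib import Path

root = Path("/path/to/vimeo_septuplet")  # placeholder dataset root
clips = (root / "sep_trainlist.txt").read_text().split()

first = root / "sequences" / clips[0]
frames = sorted(first.glob("im*.png"))
print(f"{len(clips)} training clips; first clip {clips[0]} has {len(frames)} frames")
```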
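Finally, a minimal sketch of the two-stage schedule for main.py. All flag names below are placeholders, not main.py's real interface; check main.py's argument parser for the actual options:

```python
# Minimal sketch of the two-stage training schedule described above. All flag
# names are placeholders; check main.py's argument parser for the real ones.
import subprocess

# Stage 1: pretrain with key frames coded by BPG at lambda = 2048.
subprocess.run(
    ["python", "main.py", "--key-frame-codec", "bpg", "--lmbda", "2048"],
    check=True,
)

# Stage 2: load the stage-1 weights and switch the key-frame codec to the
# learned image codec defined in image_model.py.
subprocess.run(
    ["python", "main.py", "--key-frame-codec", "ai",
     "--pretrained", "checkpoints/stage1.pth"],
    check=True,
)
```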