[CVPR 2025] Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency

🔥 For more results, visit our project page 🔥
⭐ If you find this project helpful, please consider starring this repo. Thanks! 🤗

Overview

Network architecture of Stage 1 (Codebook learning).
Network architecture of Stage 2 (Lookup transformer learning).

TL;DR: STC is a novel video face enhancement framework that efficiently solves blind face video restoration (BFVR) and de-flickering tasks.

Gallery

Blind face video restoration

Paired GIFs comparing degraded inputs with enhanced results.

Brightness de-flickering

Paired GIFs comparing degraded inputs with enhanced results.

Pixel de-flickering

Paired GIFs comparing degraded inputs with enhanced results.

Pixel de-flickering (for synthesized talking head videos)

Paired GIFs comparing degraded inputs with enhanced results.

Getting Started

Dependencies and Installation

Install the required packages listed in environment.yaml:

# clone this repository
git clone https://github.com/Dixin-Lab/BFVR-STC
cd BFVR-STC

# create a new conda environment
conda env create -f environment.yaml
conda activate bfvr

# install additional dependencies (dlib and ffmpeg)
conda install -c conda-forge dlib
conda install -c conda-forge ffmpeg
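
Optionally, you can run a minimal sanity check before inference; this assumes PyTorch is provided by environment.yaml and simply reports whether a CUDA GPU is visible:

# optional sanity check (assumes PyTorch is installed by environment.yaml)
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"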

Quick Inference

Download Pre-trained Models

All pretrained models are downloaded automatically during the first inference. Alternatively, you can download them manually from Releases and place them in the weights folder.
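
A manual download might look like the sketch below; the release tag and checkpoint file name are placeholders, so check the Releases page for the actual asset names:

# <tag> and <checkpoint>.pth are placeholders for the real release tag and asset name
mkdir -p weights
wget -P weights https://github.com/Dixin-Lab/BFVR-STC/releases/download/<tag>/<checkpoint>.pth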

Training and Testing Data

The VFHQ and VFHQ-Test datasets can be downloaded from the VFHQ webpage. The data processing functions can be found in the utils directory.
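
As a rough illustration of the preprocessing step (the actual entry points live in utils and may differ), a downloaded clip can be split into frames with ffmpeg; the paths below are placeholders:

# placeholder paths; adjust to wherever the VFHQ clips are stored
mkdir -p data/VFHQ-Test/frames/clip_0001
ffmpeg -i data/VFHQ-Test/clip_0001.mp4 -qscale:v 1 data/VFHQ-Test/frames/clip_0001/%08d.png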

Inference

🧑🏻 Blind Face Video Restoration

python scripts/infer_bfvr.py --input_path [video path] --output_base [output directory]

🧑🏻 Face Video Brightness De-flickering

python scripts/infer_deflicker.py --input_path [video path] --output_base [output directory]

🧑🏻 Face Video Pixel De-flickering

python scripts/infer_deflickersd.py --input_path [video path] --output_base [output directory]
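
Each script follows the same calling pattern; for example, a BFVR run might look like this (the input and output paths are only illustrative):

# restore a degraded clip; paths are placeholders
python scripts/infer_bfvr.py --input_path ./assets/degraded_clip.mp4 --output_base ./results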

Evaluation

Implementations of commonly used metrics, such as PSNR, SSIM, LPIPS, FVD, IDS, and AKD, can be found in the evaluation directory. Face-Consistency and Flow-Score can be calculated with the video evaluation benchmark EvalCrafter.
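
As a quick sanity check that does not rely on the repo's evaluation code, ffmpeg's built-in psnr and ssim filters can compare a restored clip against its ground truth; the file names below are placeholders:

# frame-wise PSNR / SSIM via ffmpeg filters; paths are placeholders
ffmpeg -i results/restored.mp4 -i data/gt.mp4 -lavfi psnr -f null -
ffmpeg -i results/restored.mp4 -i data/gt.mp4 -lavfi ssim -f null -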

Citation

If you find our repo useful for your research, please consider citing our paper:

@article{wang2024efficient,
  title={Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency},
  author={Yutong Wang and Jiajie Teng and Jiajiong Cao and Yuming Li and Chenguang Ma and Hongteng Xu and Dixin Luo},
  journal={arXiv preprint arXiv:2411.16468},
  year={2024}
}

Acknowledgement

The code framework is mainly modified from CodeFormer. Please refer to the original repo for more usage details and documentation.

Contact

If you have any questions, please feel free to contact us at yutongwang1012@gmail.com.