Face Forgery Video Detection via Temporal Forgery Cue Unraveling (CVPR 2025)

Introduction | Preparation | Get Started | Paper

Introduction

Face Forgery Video Detection (FFVD) is a critical yet challenging task: determining whether a digital facial video is authentic or forged. Existing FFVD methods typically focus on isolated spatial information or coarsely fused spatiotemporal information; they fail to leverage temporal forgery cues, which results in unsatisfactory performance. We strive to unravel these cues across three progressive levels: momentary anomaly, gradual inconsistency, and cumulative distortion. Accordingly, we design a consecutive correlate module to capture momentary anomaly cues by correlating interactions among consecutive frames. Then, we devise a future guide module to unravel inconsistency cues by iteratively aggregating historical anomaly cues and gradually propagating them into future frames. Finally, we introduce a historical review module that unravels distortion cues via momentum accumulation from future to historical frames. These three modules form our Temporal Forgery Cue Unraveling (TFCU) framework, which sequentially highlights spatial discriminative features by unraveling temporal forgery cues bidirectionally between historical and future frames. Extensive experiments and ablation studies demonstrate the effectiveness of our TFCU method, which achieves state-of-the-art performance across diverse unseen datasets and manipulation methods.

Preparation

1. Environment and Dependencies:

This project is implemented with Python version >= 3.10 and CUDA version >= 11.3.

It is recommended to follow the steps below to configure the environment:

conda create -n tfcu python=3.10
conda activate tfcu
pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
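
After installation, a quick sanity check can confirm that the interpreter and the installed PyTorch build match the requirements above. The following is a minimal sketch, not part of the repository: the helper name `check_environment` is hypothetical, and it only reports what it finds rather than enforcing any version.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10)):
    """Report interpreter and PyTorch details (hypothetical helper, not in the repo)."""
    report = {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_version": None,
        "torch_cuda_build": None,
        "cuda_available": False,
    }
    # Import torch only if it is installed, so the check never crashes.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_version"] = torch.__version__      # e.g. "1.13.0+cu116"
        report["torch_cuda_build"] = torch.version.cuda  # CUDA version torch was built against
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False on a GPU machine, the installed wheel and the local driver are the first things to compare.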

2. Data Preparation:

Before training, follow the steps below to prepare the data:

  1. Download datasets.

  2. Frame and Landmarks Extraction: Extract frames and landmarks from the video files.

  3. Face Alignment and Cropping: Following FTCN, use RetinaFace for face detection, then crop and align the detected faces. When multiple faces appear in a video, track them and keep only the face with the longest appearance time.
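
The "keep the face with the longest appearance time" rule in step 3 can be sketched as below. This is an illustration under assumptions, not the repository's preprocessing code: it assumes a tracker has already assigned an integer track ID to every detected face in every frame, and the function names `select_longest_track` and `frames_with_track` are hypothetical.

```python
from collections import Counter


def select_longest_track(frame_track_ids):
    """Return the track ID present in the most frames, or None if there are none.

    frame_track_ids: one list of track IDs per frame (assumed tracker output).
    """
    counts = Counter()
    for ids in frame_track_ids:
        counts.update(set(ids))  # count each track at most once per frame
    if not counts:
        return None
    # Break ties by the smaller ID so the result is deterministic.
    return min(counts, key=lambda tid: (-counts[tid], tid))


def frames_with_track(frame_track_ids, track_id):
    """Indices of the frames in which the chosen track is present."""
    return [i for i, ids in enumerate(frame_track_ids) if track_id in ids]


# Toy example: track 0 appears in 4 frames, track 1 in 3 frames.
tracks = [[0, 1], [0], [0, 1], [1], [0]]
best = select_longest_track(tracks)
print(best, frames_with_track(tracks, best))  # 0 [0, 1, 2, 4]
```

Only the frames returned for the selected track would then be cropped and aligned.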

Quick Inference

Download the weights from Baidu Cloud (code: ffvd) and put them into 'checkpoints/Final_TFCU_Model/ckpt'.

To run inference on a single video, run: python Inference_demo.py
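
Before launching the demo, it can help to confirm that the downloaded weights are where the step above expects them. The check below is a small sketch under assumptions: the source does not say whether 'ckpt' is a file or a directory, so both cases are handled, and the helper name `checkpoint_ready` is hypothetical.

```python
from pathlib import Path

# Location given in the download step above.
CKPT_PATH = Path("checkpoints/Final_TFCU_Model/ckpt")


def checkpoint_ready(path=CKPT_PATH):
    """True if the path is a non-empty file or a non-empty directory."""
    path = Path(path)
    if path.is_file():
        return path.stat().st_size > 0
    if path.is_dir():
        return any(path.iterdir())
    return False


if __name__ == "__main__":
    if checkpoint_ready():
        print(f"Found weights at {CKPT_PATH}; ready to run Inference_demo.py")
    else:
        print(f"No weights at {CKPT_PATH}; download them first")
```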

Evaluation

Download the weights from Baidu Cloud (code: ffvd) and put them into 'checkpoints/Final_TFCU_Model/ckpt'. Then run:

bash test.sh 0 1 12345 checkpoints/Final_TFCU_Model/video_level_c_lm.yaml

|      | Celeb-DF | DFDC   | FFIW   | Checkpoints        |
|------|----------|--------|--------|--------------------|
| Ours | 93.18%   | 86.05% | 91.27% | Baidu (code: ffvd) |

Citation

@InProceedings{Guo_2025_CVPR,
    author    = {Guo, Zonghui and Liu, Yingjie and Zhang, Jie and Zheng, Haiyong and Shan, Shiguang},
    title     = {Face Forgery Video Detection via Temporal Forgery Cue Unraveling},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {7396-7405}
}
