This repository contains the code for the paper "Towards General Visual-Linguistic Face Forgery Detection". VLFFD is a face forgery detection framework that combines visual and linguistic modalities. It uses the Face Forgery Text Generator (FFTG) to produce high-quality text annotations that improve generalization and interpretability.

VLFFD/
├── data/
│   ├── FFTG_Deepfakes.json       # FFTG annotations for Deepfakes dataset
│   ├── FFTG_Face2Face.json       # FFTG annotations for Face2Face dataset
│   ├── FFTG_FaceSwap.json        # FFTG annotations for FaceSwap dataset
│   └── FFTG_NeuralTextures.json  # FFTG annotations for NeuralTextures dataset
├── fftg_apt.py                   # Main script for FFTG text generation
└── raw_annotation.py             # Script for raw annotation generation

FFTG is a novel annotation pipeline consisting of two main parts:

- Raw Annotation Generation (RAG)
  - Generates forgery masks by comparing real and forged images
  - Assesses the forgery degree of each facial component (eyes, nose, mouth, whole face)
  - Uses handcrafted features to estimate forgery types
  - Combines these elements into a raw annotation
- Annotation Refinement
  - Further refines annotations using multimodal large language models (e.g., GPT-4o)
  - Employs four types of prompting strategies: visual prompts, guide prompts, task description prompts, and pre-defined prompts

- raw_annotation.py: Implements the processing pipeline for generating raw text annotations.
- fftg_apt.py: Handles the FFTG pipeline and stores results in JSON format.
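
The first two raw annotation steps (building a forgery mask from a real/forged pair, then scoring each facial component) can be illustrated with a minimal sketch. This is not the code in raw_annotation.py: the function names, the 0.1 threshold, and the fixed component boxes are assumptions made for illustration, and a real pipeline would more likely locate the components from facial landmarks.

```python
import numpy as np


def forgery_mask(real, fake, threshold=0.1):
    """Return a boolean mask of pixels where the real and forged images differ.

    `threshold` is a hypothetical cut-off on the normalized absolute difference.
    """
    real = real.astype(np.float32) / 255.0
    fake = fake.astype(np.float32) / 255.0
    diff = np.abs(real - fake)
    if diff.ndim == 3:
        # Average over color channels to get one difference value per pixel.
        diff = diff.mean(axis=2)
    return diff > threshold


def component_degrees(mask, boxes):
    """Return the fraction of modified pixels inside each component box.

    `boxes` maps a component name to (top, bottom, left, right) pixel bounds;
    these boxes are illustrative stand-ins for landmark-based regions.
    """
    degrees = {}
    for name, (top, bottom, left, right) in boxes.items():
        region = mask[top:bottom, left:right]
        degrees[name] = float(region.mean()) if region.size else 0.0
    return degrees


def demo():
    """Score a synthetic 8x8 image pair where only the mouth area was edited."""
    real = np.zeros((8, 8, 3), dtype=np.uint8)
    fake = real.copy()
    fake[5:7, 2:6] = 200  # simulate a manipulated mouth region
    boxes = {
        "eyes": (1, 3, 1, 7),
        "nose": (3, 5, 3, 5),
        "mouth": (5, 7, 2, 6),
        "whole face": (0, 8, 0, 8),
    }
    return component_degrees(forgery_mask(real, fake), boxes)
```

Calling demo() scores the mouth box as fully modified and the eyes and nose boxes as untouched. Per-component scores of this kind are the sort of quantity a raw annotation could turn into text.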

pip install opencv-python numpy dlib scikit-image tqdm

python raw_annotation.py

python fftg_gpt.py

The JSON data is stored in the data/ directory, with each file corresponding to a forgery method (Deepfakes, Face2Face, etc.). Each JSON entry has the following format:
{
  "/path/to/image.jpg": {
    "fake_path": "/path/to/fake/image.jpg",
    "real_path": "/path/to/real/image.jpg",
    "description": "This image appears to be manipulated. The eyes region shows...",
    "gpt_result": "..."
  }
}

If you use VLFFD in your research, please cite our paper:
@inproceedings{vlffd2025,
  title={Towards General Visual-Linguistic Face Forgery Detection},
  author={},
  booktitle={CVPR},
  year={2025}
}
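
For working with the annotation files in data/, a small loader along the following lines may be convenient. It is a sketch that assumes only the entry format shown earlier (image path keys mapping to fake_path, real_path, description, and gpt_result); the function name load_annotations is hypothetical and not part of this repository.

```python
import json


def load_annotations(json_path):
    """Read one FFTG annotation file and return a list of flat records."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)

    records = []
    for image_path, entry in data.items():
        records.append({
            "image": image_path,
            "fake_path": entry.get("fake_path"),
            "real_path": entry.get("real_path"),
            # Missing text fields fall back to an empty string.
            "description": entry.get("description", ""),
            "gpt_result": entry.get("gpt_result", ""),
        })
    return records
```

For example, load_annotations("data/FFTG_Deepfakes.json") would return one record per annotated image, each carrying both image paths and both text fields.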