# Deepfake Facial Video Detection

Deepfake detection using CNN + LSTM models, served through Flask.

Repository: https://github.com/Shivam-26102003/Deepfake_facial-video_detection
## Table of Contents

- About
- Features
- Technologies
- Installation
- Usage
- Project Structure
- Models
- Web Application
- Results
- Contributing
- License
## About

This repository implements a facial deepfake detection pipeline combining:
- ResNeXt-50 CNN to extract spatial features from video frames.
- LSTM RNN to model temporal inconsistencies across frame sequences.
- A balanced multi-source dataset (FaceForensics++, DFDC, Celeb-DF).
- A Flask-based web interface for real-time video uploads and predictions.
This project uses a combination of publicly available deepfake datasets:
- 📦 FaceForensics++ – A popular dataset for facial manipulation detection.
- 📦 Deepfake Detection Challenge Dataset (DFDC) – Released by Facebook via Kaggle.
- 📦 Celeb-DF (v2) – High-quality deepfake dataset for research.
## Features

- Spatial Feature Extraction: ResNeXt-50 pretrained on ImageNet for robust embeddings.
- Temporal Analysis: LSTM captures frame-to-frame artifacts indicative of deepfakes.
- Web UI: Upload any video via browser and receive confidence scores instantly.
- Modular Design: Separate scripts for preprocessing, training, inference, and serving.
## Technologies

- Language: Python 3.8+
- Deep Learning: PyTorch >=1.7
- Web Framework: Flask
- Computer Vision: OpenCV, face_recognition
- Data Handling: NumPy, Pandas
- Visualization: Matplotlib
## Installation

1. Clone the repo

   ```bash
   git clone https://github.com/Shivam-26102003/Deepfake_facial-video_detection.git
   cd Deepfake_facial-video_detection
   ```

2. Set up a virtual environment

   ```bash
   python3 -m venv venv
   source venv/bin/activate   # macOS/Linux
   venv\Scripts\activate      # Windows
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. Prepare models

   Download the pretrained ResNeXt-50 via PyTorch Hub and place the trained LSTM checkpoint in the `models/` directory.
## Usage

- Preprocess videos

  ```bash
  python scripts/preprocess.py --input_dir data/raw --output_dir data/processed
  ```

- Train the model

  ```bash
  python train.py --config configs/train.yaml
  ```

- Inference on a video

  ```bash
  python predict.py --video path/to/video.mp4
  ```

- Start the web server

  ```bash
  python server.py
  ```
## Project Structure

```text
Deepfake_facial-video_detection/
├── data/              # raw and processed video datasets
├── models/            # model checkpoints
├── scripts/           # preprocessing & training scripts
│   ├── preprocess.py
│   └── train.py
├── static/            # CSS, JS, images for Flask app
├── templates/         # HTML templates for Flask app
├── server.py          # Flask server entry point
├── requirements.txt
├── LICENSE
└── README.md
```
## Models

- ResNeXt-50 (32x4d): extracts a 2048-dimensional feature vector per frame.
- LSTM: single layer, hidden size 2048, dropout 0.4, followed by a classifier.
- Classifier: FC → LeakyReLU → Softmax for binary real/fake prediction.
## Web Application

- Flask backend (`server.py`) accepts video uploads and returns predictions.
- Templates: simple HTML form for file upload and result display.
- Static: CSS for styling and JS for frontend interactions.
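A minimal sketch of what the upload-and-predict endpoint might look like. The route name, the JSON response shape, and the stubbed `run_model` are assumptions for illustration, not the repository's actual `server.py`:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def run_model(video_path):
    # Stub: the real server would sample frames, crop faces,
    # and run the CNN + LSTM model on the uploaded clip.
    return {"label": "REAL", "confidence": 0.97}


@app.route("/predict", methods=["POST"])
def predict():
    f = request.files.get("video")
    if f is None:
        return jsonify({"error": "no video uploaded"}), 400
    # The real server would save the upload to disk before inference.
    return jsonify(run_model(f.filename))


if __name__ == "__main__":
    app.run(debug=True)
```

The browser form in `templates/` would POST the selected file to this endpoint and render the returned label and confidence.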
## Results

Performance on a mixed dataset (6,000 videos):

| Metric    | Value |
|-----------|-------|
| Accuracy  | 87.8% |
| Precision | 89.3% |
| Recall    | 86.5% |
| F1-score  | 87.9% |
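As a sanity check, the reported F1-score is consistent with the precision and recall above:

```python
precision, recall = 0.893, 0.865

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1 * 100, 1))  # → 87.9
```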
## Contributing

We welcome contributions! Feel free to open Issues or Pull Requests.

Special thanks to my teammate:
## License

Distributed under the MIT License. See `LICENSE` for more information.