GIF Captcha Recognition

A PyTorch-based system for recognizing Hei.Captcha GIF captchas, developed for security research purposes.

Overview

This project implements an efficient CNN model to recognize alphanumeric captchas (A-Z, 0-9) from GIF images. The system leverages the multi-frame nature of GIFs to improve accuracy through a voting mechanism.

Features

GIF Frame Extraction: Decomposes GIF captchas into individual frames
Voting Mechanism: Aggregates predictions across multiple frames for higher accuracy
Data Augmentation: Utilizes multiple GIF frames to expand training data
Efficient CNN: Lightweight convolutional neural network architecture
Complete Pipeline: Data generation, training, and evaluation scripts
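
As an illustration of the frame extraction feature, the sketch below splits a GIF into RGB frames with Pillow. The function name extract_frames is hypothetical and is not taken from this repository's utils.py; the real implementation may resize, normalize, or convert frames differently.

```python
from PIL import Image, ImageSequence


def extract_frames(source):
    """Return every frame of a GIF as an independent RGB image.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper: the project's own loader may differ.
    """
    with Image.open(source) as gif:
        # convert() returns a new image, so each frame stays valid
        # after the GIF file is closed.
        return [frame.convert("RGB") for frame in ImageSequence.Iterator(gif)]
```

Converting each frame to RGB also resolves the GIF palette, so every frame has the same mode regardless of how it was stored.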

Project Structure

├── generator/              # C# captcha generator
│   ├── GenerateCaptcha.csproj
│   └── Program.cs
├── src/                    # Python source code
│   ├── model.py           # CNN model definition
│   ├── dataset.py         # Dataset loaders
│   ├── train.py           # Training script
│   ├── predict.py         # Prediction script
│   ├── evaluate.py        # Evaluation script
│   └── utils.py           # Utility functions
├── data/                   # Dataset directory
│   ├── train/             # Training set
│   └── test/              # Test set
├── models/                 # Saved models
├── logs/                   # TensorBoard logs
└── requirements.txt        # Python dependencies

Installation

Python Environment

Install the PyTorch build recommended for your system from PyTorch.org, then install the other dependencies:

pip install Pillow numpy tqdm matplotlib

.NET Environment

Ensure you have .NET 6.0 or later installed.

Usage

Step 1: Generate Dataset

Generate training and test datasets using the C# generator:

cd generator
dotnet run -- --count 10000 --output ../data/train
dotnet run -- --count 2000 --output ../data/test

Step 2: Train Model

Train the captcha recognition model:

cd src
python train.py --epochs 50 --batch-size 64

Optional arguments:

--frame-mode: Use each frame as a separate sample (data augmentation)
--lr: Learning rate (default: 0.001)
--num-workers: Number of data loading workers (default: 4)
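
To illustrate what --frame-mode means, the sketch below expands a list of GIFs into one training sample per frame. It is an assumption about how such a mode could be indexed, not the code in dataset.py; the function name and the tuple layouts are hypothetical.

```python
from typing import List, Tuple


def expand_to_frame_samples(
    gifs: List[Tuple[str, str, int]],
) -> List[Tuple[str, int, str]]:
    """Turn (path, label, frame_count) entries into per-frame samples.

    Each output item is (path, frame_index, label), so a GIF with
    N frames contributes N samples that all share the same label.
    """
    samples = []
    for path, label, frame_count in gifs:
        for frame_index in range(frame_count):
            samples.append((path, frame_index, label))
    return samples
```

With this layout, a dataset of 10000 GIFs with several frames each yields several times as many training samples without generating new captchas.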

Monitor training with TensorBoard:

tensorboard --logdir ../logs

Step 3: Evaluate Model

Evaluate the trained model on the test set:

python evaluate.py --model ../models/best_model.pth --data-dir ../data/test

Step 4: Make Predictions

Predict a single captcha:

python predict.py --model ../models/best_model.pth --image path/to/captcha.gif

Evaluate on a directory:

python predict.py --model ../models/best_model.pth --dir ../data/test

Use --no-voting to disable the voting mechanism and only use the first frame.

Technical Details

Character Set

The model recognizes 36 characters:

Digits: 0-9
Letters: A-Z (case-insensitive)
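
A minimal sketch of how the 36 characters could be mapped to class indices and back. The ordering (digits first, then letters) and the names CHARSET, encode_label, and decode_indices are assumptions for illustration; the model in this repository may use a different ordering.

```python
import string

# Assumed ordering: "0".."9" map to 0..9, "A".."Z" map to 10..35.
CHARSET = string.digits + string.ascii_uppercase
CHAR_TO_INDEX = {char: index for index, char in enumerate(CHARSET)}


def encode_label(label: str) -> list:
    """Convert a label such as "A3K9" to class indices (case-insensitive)."""
    return [CHAR_TO_INDEX[char] for char in label.upper()]


def decode_indices(indices) -> str:
    """Convert class indices back to the label string."""
    return "".join(CHARSET[index] for index in indices)
```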

Voting Mechanism

For each GIF:

Extract all frames
Predict each frame independently
For each character position, vote on the most common prediction
Combine voted characters into final result

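Because every frame of a GIF captcha shows the same text, a misread character in one frame can be outvoted by the other frames. This temporal redundancy is what makes voting more accurate than predicting from a single frame.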
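
The voting steps above can be sketched as a per-position majority vote over the frame predictions. This is an illustrative version that assumes each frame prediction is already decoded to a string of equal length; the function name vote_on_frames is hypothetical, and predict.py may vote on logits or probabilities instead.

```python
from collections import Counter
from typing import List


def vote_on_frames(frame_predictions: List[str]) -> str:
    """Combine per-frame predictions by majority vote at each position."""
    if not frame_predictions:
        raise ValueError("need at least one frame prediction")
    length = len(frame_predictions[0])
    if any(len(pred) != length for pred in frame_predictions):
        raise ValueError("all frame predictions must have the same length")

    voted = []
    # zip(*...) groups the characters predicted for the same position.
    for position_chars in zip(*frame_predictions):
        # On a tie, Counter keeps the character seen first.
        voted.append(Counter(position_chars).most_common(1)[0][0])
    return "".join(voted)
```

For example, the predictions "A3K9", "A8K9", and "A3X9" each contain at most one error, and the vote recovers "A3K9".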

Data Format

Captcha files should be named: LABEL_UUID.gif

Example: A3K9_123456789.gif
Label: First 4 characters before underscore
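
A short sketch of reading the label from a file name in this format. The helper name label_from_filename is hypothetical, and upper-casing the label is an assumption based on the case-insensitive character set; the loader in dataset.py may behave differently.

```python
from pathlib import Path


def label_from_filename(path: str) -> str:
    """Extract the 4-character label from a LABEL_UUID.gif file name."""
    stem = Path(path).stem  # "A3K9_123456789" for the example above
    label, separator, _ = stem.partition("_")
    if not separator or len(label) != 4:
        raise ValueError(f"unexpected captcha file name: {path}")
    return label.upper()
```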

License

This project is for educational and security research purposes only.

Acknowledgments

This project utilizes the Hei.Captcha library for captcha generation and PyTorch as the deep learning framework, and was developed with the assistance of LLMs including GitHub Copilot, Claude, and DeepSeek.
