captcha-solver

A simple solver for 6obcy-style captchas, written in Python with TensorFlow.

Example

WARNING: This script only works with captchas in this format:

Requirements

Python 3
Required pip libraries (requirements.txt)
Compiled tensorflow model (captcha-solver.model by default)
If using break-captcha-selenium.py:
- Selenium (already in requirements.txt)
Otherwise:
- A downloaded captcha image file

To install the required packages, run:

$ pip install -r requirements.txt

Usage

There are two ways to use the trained model:

You can use Selenium to open an automated browser, download a captcha and solve it automatically
- $ python break-captcha-selenium.py
You can solve a captcha image that you have already downloaded
- $ python break-captcha.py [-h] [-q] captcha-image
- Use -h for more information
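
To solve several downloaded captchas in one go, the command-line form above can be wrapped in a short Python script. This is only a minimal sketch and not part of the repository: the helper names, the `*.png` pattern and the `captchas` directory are assumptions made for illustration. It relies only on the documented invocation `python break-captcha.py captcha-image`.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(image_dir, script="break-captcha.py", pattern="*.png"):
    """Return one command list per image found in image_dir, sorted by name."""
    images = sorted(Path(image_dir).glob(pattern))
    # sys.executable reuses the current interpreter and its installed packages.
    return [[sys.executable, script, str(image)] for image in images]


def solve_all(image_dir):
    """Run break-captcha.py once per image and print whatever it writes to stdout."""
    for command in build_commands(image_dir):
        result = subprocess.run(command, capture_output=True, text=True)
        print(command[-1], "->", result.stdout.strip())
```

Calling `solve_all("captchas")` from the repository root would then process every PNG in the hypothetical `captchas` directory, one image per run of the script.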

Training your own model

If you want to create your own model you will need a dataset.

Creating your own dataset

First, check how my dataset is structured in the dataset directory. Just unzip it, look around and copy the structure.

Follow these steps to generate your own dataset:

Download full captcha images to the dataset/img/ directory. Make sure they are in the supported format (see the example above).
- You can use my tool for automatic captcha gathering: get-set.py. It uses Selenium to download captcha images automatically and starts a new session (closes and reopens the Selenium Firefox browser) when captchas start repeating, as there are only about 5-7 unique images per session.
Run the make-set.py script. It first crops and separates all the letters in the captchas from the dataset/img/ directory. After that you need to label the images manually: for each displayed captcha part, press the corresponding key on your keyboard. For larger datasets this can be very time-consuming. WARNING: For now, this script does not support saving progress, so if you quit during labeling and start over, it will overwrite the labels.txt file and all previous labels.
Copy the out-labels.txt file from my dataset and, if needed, add more labels to it. Do not leave gaps in the numbering (e.g. 25, 26, 29, 30).
Now you are ready to train the model!
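
The "no gaps" rule for label numbers can be checked with a few lines of Python before training. This is a rough sketch under assumptions: the layout of out-labels.txt is not described here, so the hypothetical helper below takes a plain list of integers and leaves reading them from the file to you.

```python
def find_label_gaps(indices):
    """Return the label numbers missing between the smallest and largest index."""
    present = set(indices)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]


# The sequence 25, 26, 29, 30 skips 27 and 28.
print(find_label_gaps([25, 26, 29, 30]))  # prints [27, 28]
```

An empty result means the numbering is contiguous.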

Training process

Simply run the model-training.py script.

Warning

This model is not 100% accurate and has a lot of trouble with the characters r and z. A bigger dataset should improve this.

MattTheCoder-W/captcha-solver
