Breaking Amazon CAPTCHA with Deep Learning

What's it about?

This project uses Keras/TensorFlow to build an end-to-end deep learning solution that performs object localization and classification for each letter of a CAPTCHA image.

Many guides and tutorials for breaking letter CAPTCHAs rely on OpenCV contour finding to localize the letters so that a DL model can classify them. That approach often proved unreliable: no matter how good the classification model is, it still depends entirely on OpenCV's ability to find the correct contours. Instead, I decided to replace that part of the process with a custom model that localizes the letters on the image by itself.

Data used:

Data comes in two forms:

.jpg images
.xml files

The images folder contains 150 images that I gathered, split into 140 for training and 10 for testing object detection. Each image is named after the text displayed on it.
Each image has a corresponding XML file that contains the bounding box coordinates for each letter. I labeled the CAPTCHA images myself using the LabelImg image labeling tool.
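
To make the data layout concrete, here is a minimal sketch of reading one annotation. It assumes the XML files use the Pascal VOC layout that LabelImg writes by default (`object` elements with a `name` and a `bndbox`); the function names and the example file path are hypothetical and not taken from the notebooks.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def boxes_from_xml(xml_text):
    """Return [(letter, xmin, ymin, xmax, ymax), ...] sorted left to right.

    Assumes Pascal VOC style annotations (LabelImg's default output).
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    # Sort by xmin so the boxes follow the reading order of the CAPTCHA text.
    return sorted(boxes, key=lambda box: box[1])


def text_from_filename(image_path):
    """The CAPTCHA text is the file name without its extension."""
    return Path(image_path).stem


# Hypothetical usage; the actual file names depend on the dataset:
# boxes = boxes_from_xml(Path("images/ABCDEF.xml").read_text())
# label = text_from_filename("images/ABCDEF.jpg")
```

Sorting by `xmin` keeps the order of the boxes aligned with the order of the characters in the file name, which is convenient when pairing each box with its letter.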

Deep Learning models and Object detection:

Object detection here consists of two separate models, one for letter localization and one for letter classification:

Since the number of letters on an image is always 6, localization uses convolutional layers to extract features and a 24-output Dense layer (6 letters × 4 bounding box coordinates) with no activation function to regress the bounding box coordinates.
After that, the region inside each predicted bounding box is resized and passed to the classification model to determine which letter it is.
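
The localization step described above could be sketched roughly as follows. This is not the architecture from the notebooks: the layer sizes, the `(70, 200, 3)` input shape, the MSE loss, and the function names are placeholder assumptions, and the output is assumed to hold pixel coordinates ordered as `xmin, ymin, xmax, ymax` per letter.

```python
def build_localizer(input_shape=(70, 200, 3), n_letters=6):
    """Small CNN regressor: conv features -> Dense(4 * n_letters), linear output.

    All sizes here are illustrative assumptions, not the repository's values.
    """
    # Imported here so split_boxes below can be used without TensorFlow installed.
    import tensorflow as tf

    layers = tf.keras.layers
    model = tf.keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(4 * n_letters),  # no activation: raw coordinate regression
    ])
    model.compile(optimizer="adam", loss="mse")
    return model


def split_boxes(flat, width, height):
    """Turn the flat prediction into (xmin, ymin, xmax, ymax) boxes, left to right.

    Coordinates are rounded and clamped to the image so they can be used
    directly for cropping.
    """
    boxes = []
    for i in range(0, len(flat), 4):
        x1, y1, x2, y2 = (int(round(v)) for v in flat[i:i + 4])
        x1, x2 = sorted((max(0, min(width, x1)), max(0, min(width, x2))))
        y1, y2 = sorted((max(0, min(height, y1)), max(0, min(height, y2))))
        boxes.append((x1, y1, x2, y2))
    return sorted(boxes)
```

Each box returned by `split_boxes` can then be cropped from the image, resized to the classifier's input size, and passed to the classification model. Clamping matters because an unbounded linear output can land slightly outside the image.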

Note: This knowledge should under no circumstances be used for nefarious or unethical purposes. This case study exists to demonstrate how to label a custom dataset and train deep learning models on top of it. I am not responsible for how this code is used. Use this as an opportunity to learn!

Repository contents:

images
models
readme_imgs
.gitignore
README.md
captcha_localizer.ipynb
object_detection.ipynb

lsubotic/captcha-letter-recognition
