This is a simple application to test if we can beat captchas for fun

bharshal/captcha_spoofer

captcha_spoofer

Captchas are used to verify whether the entity accessing a particular webpage is a human or a computer.

Usually they look like this:

alt text

I made this application just for fun, to see if captchas can be beaten with computer vision (CV) and deep learning (DL). The verdict? I think so :P

How to run it:

  1. Run create_train_data.py. This parses all the captcha images in the data folder, uses OpenCV to detect and crop the individual symbols, and labels each crop from the name of the image file, which is the captcha's ground-truth text. That gives us our dataset.
  2. Run train.py. This contains the model definition (a simple 4-layer CNN) and the training script. Since the dataset is limited and the model is small, training shouldn't take much time.
  3. Run test.py. This runs inference on some random images from the dataset and compares the predictions with the ground truth to report accuracy.
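Step 3 mentions accuracy without saying how it is counted. As a rough sketch (not the repo's actual test.py), the snippet below shows two common ways to score captcha predictions against ground truth: per whole captcha and per symbol. The function name `accuracy` and the sample strings are made up for illustration.

```python
def accuracy(predictions, truths):
    """Return (whole_captcha_accuracy, per_symbol_accuracy) as fractions."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not truths:
        return 0.0, 0.0

    # A captcha counts as solved only if every symbol matches.
    whole_hits = sum(p == t for p, t in zip(predictions, truths))

    # Per-symbol score: compare characters position by position.
    symbol_total = sum(len(t) for t in truths)
    symbol_hits = sum(
        pc == tc
        for p, t in zip(predictions, truths)
        for pc, tc in zip(p, t)
    )
    symbol_acc = symbol_hits / symbol_total if symbol_total else 0.0
    return whole_hits / len(truths), symbol_acc


if __name__ == "__main__":
    # Hypothetical predictions and ground-truth strings.
    print(accuracy(["ab3de", "xy9zz"], ["ab3de", "xy9zq"]))
```

The per-symbol number is usually the more forgiving one, since a single wrong character fails the whole captcha.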

How it works:

When creating the dataset, the script reads every image, using the name of the image file as the ground truth. It then separates the 5 symbols in each image using CV techniques like:

Original captcha:

alt text

Erosion and other morphological operations to smooth the image:

alt text

Contour detection to locate the symbols:

alt text alt text

The symbols are cropped and supplied as the training set.

alt text
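Since the label comes from the file name, each crop has to be matched to the right character. A minimal sketch of one way to do that, assuming the file name (without extension) is exactly the captcha text and the boxes are `(x, y, w, h)` tuples: sort the boxes by x and zip them with the characters. The helper `label_boxes` and the file name `2b8x7.png` are hypothetical.

```python
from pathlib import Path


def label_boxes(image_path, boxes, expected=5):
    """Pair each character of the file name with a box, left to right.

    Returns an empty list when the counts do not line up, so a badly
    segmented captcha is skipped instead of producing wrong labels.
    """
    label = Path(image_path).stem  # "data/2b8x7.png" -> "2b8x7"
    if len(label) != expected or len(boxes) != expected:
        return []
    ordered = sorted(boxes, key=lambda b: b[0])  # sort by left edge
    return list(zip(label, ordered))


if __name__ == "__main__":
    # Boxes given out of order on purpose (hypothetical coordinates).
    boxes = [(90, 5, 20, 30), (10, 5, 20, 30), (130, 5, 20, 30),
             (50, 5, 20, 30), (170, 5, 20, 30)]
    for char, box in label_boxes("data/2b8x7.png", boxes):
        print(char, box)
```

Skipping captchas where the symbol count is off throws away some data, but it keeps mislabelled crops out of the training set.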

After the model has been trained, it is used for inference on symbols extracted with the same process as above.

P.S. If someone has more labelled captcha data, please email me.
