An overlapped handwritten signature extractor that extracts signatures from scanned documents using OpenCV and scikit-image in Python

fahad-sayyed707/SIGN-EXTRACTOR-

SIGN_DETECTOR

The Sign Extractor: overlapped handwritten signature extraction from scanned documents using OpenCV and scikit-image in Python.


Sample Output

Explanation: In this case, the algorithm has extracted the signature successfully. Only a very small portion of the signature is lost: that part is not connected to the main signature stroke, so the algorithm treats it as a non-signature part.

- The pre-version:


Steps involved:

  • Improving the "Outlier Removal" module to boost the signature extraction algorithm.
  • Developing a CNN-based "Signature Recognition" module.
  • Developing a "Signature Spoofing Detection" algorithm.
  • Developing a "Signature Detector (bounding box) & Counter" module.
  • Calculating and sharing the "Accuracy of detection on SigSA: On-line Handwritten Signature Database".

Theory

Main pipeline

The logic

The algorithm extracts signatures from scanned documents using "connected component analysis". In image processing, a connected components algorithm finds regions of adjacent pixels that share the same value.

The connected components can be found and labelled with functionality provided by the scikit-image library. In a scanned document, the signatures tend to form the biggest connected components, so selecting the biggest components recovers the signatures from the whole document. However, undesired lines can also form big connected components, so a threshold value is needed to get rid of them.
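The labelling and size filtering described above can be sketched as follows. This is a minimal illustration on a synthetic binary image, not the repository's code: the image contents and the `min_size` value are assumptions made up for the example, and only `skimage.measure.label` and NumPy are used.

```python
import numpy as np
from skimage.measure import label

# Synthetic binary "scan" (assumption for illustration): 1 = ink, 0 = paper.
binary = np.zeros((12, 20), dtype=np.uint8)
binary[1:3, 1:3] = 1      # small blob, 2 x 2 = 4 pixels
binary[5:10, 4:16] = 1    # large blob, 5 x 12 = 60 pixels
binary[1, 18] = 1         # isolated single-pixel speck

# Label connected regions; connectivity=2 also joins diagonal neighbours.
labels = label(binary, background=0, connectivity=2)
num_components = int(labels.max())

# sizes[k] is the pixel count of label k; index 0 is the background.
sizes = np.bincount(labels.ravel())

# Hypothetical threshold: keep only components larger than min_size pixels.
min_size = 10
keep = sizes > min_size
keep[0] = False           # never keep the background
filtered = keep[labels]   # boolean mask containing only the large components

print("components found:", num_components)
print("pixels kept:", int(filtered.sum()))
```

In a real document the threshold would not be a fixed number like this; the sections below describe how it is derived from the component sizes.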

Calculating the threshold value to get rid of the outliers:

The threshold value used to detect the outliers (any lines, shapes and text that are not part of the signatures) has been determined by performing many experiments. The result is an equation that works well for most A4-sized scanned documents.

Detect and remove the outliers:

The relevant code starts at line 55 of rsignextractor.py:

# experimentally derived ratio; modify it for your own cases
# a4_small_size_outliar_constant is the threshold for A4-sized scanned documents:
# connected pixel groups smaller than it are removed as outliers
a4_small_size_outliar_constant = ((average/constant_parameter_1)*constant_parameter_2)+constant_parameter_3
print("a4_small_size_outliar_constant: " + str(a4_small_size_outliar_constant))

In the equation below, x stands for the scanned document size, such as A4 or A0:

  • ax_small_size_outliar_constant = ((average/constant_parameter_1) * constant_parameter_2) + constant_parameter_3

The equation can be adapted to other cases, including different scanned document sizes such as A0, A2 and so on. To adapt it, configure these constants:

  • constant_parameter_1
  • constant_parameter_2
  • constant_parameter_3
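As a worked example of the equation, here is a small sketch. The function name, the component areas and the three constant values are all hypothetical placeholders, not the repository's tuned values. The source does not define `average`, so the sketch assumes it is the mean area of the labelled components.

```python
def small_size_outlier_threshold(average, c1, c2, c3):
    """Evaluate ((average / c1) * c2) + c3, the threshold equation above.

    c1, c2 and c3 stand in for constant_parameter_1..3.
    """
    return (average / c1) * c2 + c3


# Hypothetical component areas in pixels (made up for illustration).
component_areas = [10, 30, 60, 900, 1500]

# Assumption: `average` is the mean component area.
average = sum(component_areas) / len(component_areas)

# Placeholder constants chosen for easy arithmetic, not tuned for any paper size.
threshold = small_size_outlier_threshold(average, c1=100.0, c2=100.0, c3=50.0)

# Components smaller than the threshold are treated as outliers and dropped.
kept = [area for area in component_areas if area > threshold]

print("average:", average)
print("threshold:", threshold)
print("kept areas:", kept)
```

Raising `c2` or `c3` makes the filter stricter, while raising `c1` makes it more permissive, which is one way to reason about retuning the constants for another document size.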

Author

fahad sayyed
