erpankajpatel/VisionTextRecogniser

VisionTextRecogniser

iOS Vision Text Recogniser

With iOS 13, the Vision API has improved considerably. In addition, the new VisionKit framework lets us scan documents using the system document camera.

Vision and VisionKit

The Vision API was introduced with iOS 11. Until iOS 13, it could only detect where text was located and could not return the actual content, so we had to bring in Core ML for the recognition part. With the iOS 13 upgrade, each VNRecognizedTextObservation gives us the recognized text and its confidence level (through its top candidates), as well as the bounding box coordinates. Furthermore, VisionKit gives us access to the system's document camera to scan pages. VNDocumentCameraViewController is the view controller, and VNDocumentCameraViewControllerDelegate handles its callbacks.
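Vision reports bounding boxes in normalized coordinates (0 to 1) with the origin at the lower-left corner, so they usually need converting before drawing over an image. The sketch below shows one way to map such a box into top-left-origin pixel coordinates. The helper name imageRect(forNormalized:imageSize:) and the sample values are our own assumptions, not part of this project.

```swift
import Foundation

// Hypothetical helper: converts a Vision-style normalized rectangle
// (origin at the lower-left, values in 0...1) into pixel coordinates
// with the origin at the top-left, as UIKit expects.
func imageRect(forNormalized box: CGRect, imageSize: CGSize) -> CGRect {
    let width = box.size.width * imageSize.width
    let height = box.size.height * imageSize.height
    let x = box.origin.x * imageSize.width
    // Flip the vertical axis: measure from the top edge instead of the bottom.
    let y = (1 - box.origin.y - box.size.height) * imageSize.height
    return CGRect(x: x, y: y, width: width, height: height)
}

// Sample values chosen purely for illustration.
let rect = imageRect(
    forNormalized: CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.5),
    imageSize: CGSize(width: 1000, height: 800)
)
print(rect)
```

In a real app the normalized box would come from the boundingBox property of a VNRecognizedTextObservation.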

Launching a Document Camera

The following code presents the document camera on the screen:

    let scannerViewController = VNDocumentCameraViewController()
    scannerViewController.delegate = self
    present(scannerViewController, animated: true)

Once the scan is done and you tap Save, the following delegate method is triggered:

    func documentCameraViewController(_ controller: VNDocumentCameraViewController, didFinishWith scan: VNDocumentCameraScan)
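Putting the presentation code and the delegate callback together, a host view controller might look roughly like the sketch below. The class name ScanViewController and the method startScanning() are hypothetical. The isSupported check and the cancel and failure callbacks are additions that this README does not cover. The app also needs an NSCameraUsageDescription entry in Info.plist for the camera to open.

```swift
import UIKit
import VisionKit

// Hypothetical host view controller; the class and method names are illustrative.
final class ScanViewController: UIViewController, VNDocumentCameraViewControllerDelegate {

    // Presents the system document camera if the device supports it.
    func startScanning() {
        guard VNDocumentCameraViewController.isSupported else {
            print("Document scanning is not supported on this device")
            return
        }
        let scannerViewController = VNDocumentCameraViewController()
        scannerViewController.delegate = self
        present(scannerViewController, animated: true)
    }

    // Called after the user taps Save.
    func documentCameraViewController(_ controller: VNDocumentCameraViewController,
                                      didFinishWith scan: VNDocumentCameraScan) {
        controller.dismiss(animated: true)
        print("Scanned \(scan.pageCount) page(s)")
    }

    // Called when the user cancels without saving.
    func documentCameraViewControllerDidCancel(_ controller: VNDocumentCameraViewController) {
        controller.dismiss(animated: true)
    }

    // Called when scanning fails, for example if the camera is unavailable.
    func documentCameraViewController(_ controller: VNDocumentCameraViewController,
                                      didFailWithError error: Error) {
        controller.dismiss(animated: true)
        print("Scanning failed: \(error.localizedDescription)")
    }
}
```

The document camera does not dismiss itself, which is why each callback in this sketch calls dismiss(animated:).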

To get a particular image from a multi-page scan, pass the page index to scan.imageOfPage(at: index).

We can then process that image and detect the text in it using the Vision API.

To process multiple images, we can loop through the scans in the delegate method like this:

    for i in 0 ..< scan.pageCount {
        let img = scan.imageOfPage(at: i)
        processImage(img)
    }
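The loop above calls processImage, which is not shown in this README. One possible shape for it, as a sketch only, is given below: it wraps the page's CGImage in a VNImageRequestHandler and runs a VNRecognizeTextRequest off the main thread. Printing the results and ignoring image orientation are simplifying assumptions on our part.

```swift
import UIKit
import Vision

// Hypothetical implementation of the processImage(_:) helper used in the loop.
func processImage(_ image: UIImage) {
    guard let cgImage = image.cgImage else {
        print("The image has no CGImage backing")
        return
    }

    // The completion handler runs once Vision has finished with the image.
    let request = VNRecognizeTextRequest { request, error in
        if let error = error {
            print("Text recognition failed: \(error.localizedDescription)")
            return
        }
        guard let observations = request.results as? [VNRecognizedTextObservation] else {
            return
        }
        for observation in observations {
            // topCandidates(1) returns at most one candidate: the most likely reading.
            guard let candidate = observation.topCandidates(1).first else { continue }
            print("\(candidate.string) (confidence: \(candidate.confidence))")
        }
    }
    request.recognitionLevel = .accurate

    // perform(_:) is synchronous, so keep it off the main thread.
    DispatchQueue.global(qos: .userInitiated).async {
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try handler.perform([request])
        } catch {
            print("Could not perform the request: \(error.localizedDescription)")
        }
    }
}
```

Any UI update based on the recognized text would need to hop back to the main queue, since the completion handler runs on the queue that performed the request.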

Creating a VNRecognizeTextRequest

    let request = VNRecognizeTextRequest(completionHandler: nil)
    request.recognitionLevel = .accurate
    request.recognitionLanguages = ["en-US"]

The recognitionLevel could also be set to .fast, but then we'd have to accept lower accuracy.

recognitionLanguages is an array of languages in priority order, from left to right. We can also pass custom words that are not part of the dictionary for Vision to recognize.

request.customWords = ["IOC", "COS"]
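The configuration steps can be gathered into one place. The sketch below, under our own assumptions, builds a request, keeps only the languages Vision reports as supported, and sets the custom words. The function name makeTextRequest(preferredLanguages:customWords:) is hypothetical, and the class-level supportedRecognitionLanguages(for:revision:) call is the iOS 13 form, which later SDKs mark as deprecated in favour of an instance method.

```swift
import Vision

// Hypothetical factory; the name and parameters are illustrative only.
func makeTextRequest(preferredLanguages: [String],
                     customWords: [String]) -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest(completionHandler: nil)
    request.recognitionLevel = .accurate

    // Ask Vision which languages it supports for this level and revision.
    let supported = (try? VNRecognizeTextRequest.supportedRecognitionLanguages(
        for: .accurate,
        revision: request.revision)) ?? []

    // Keep the caller's priority order, dropping anything unsupported.
    let usable = preferredLanguages.filter { supported.contains($0) }
    if !usable.isEmpty {
        request.recognitionLanguages = usable
    }

    // Custom words are only consulted when language correction is enabled.
    request.usesLanguageCorrection = true
    request.customWords = customWords
    return request
}

let request = makeTextRequest(preferredLanguages: ["en-US"],
                              customWords: ["IOC", "COS"])
```

Filtering against the supported list avoids silently requesting a language the current OS version cannot recognize. If none of the preferred languages is supported, the request keeps its default language setting.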
