
jschlight/node-native-ocr

 
 


node-native-ocr

Native Node.js bindings to the Tesseract OCR project, built with N-API and node-addon-api.

Benefits:

  • Avoids spawning the tesseract command-line process.
  • Asynchronous I/O: image reading and processing run in an isolated event loop backed by libuv.
  • Supports reading image data from JavaScript buffers.

Contributions are welcome.

Install

First, a g++ 4.9 compiler is required.

Before installing node-native-ocr, install the following dependencies:

$ brew install pkg-config tesseract # macOS

Then install the package with npm:

$ npm install node-native-ocr

To Use with Electron

I have not yet found a way to get Tesseract working in a bundled Electron installer. This is a work in progress.

If you want to use node-native-ocr with Electron, use electron-rebuild, which compiles node-native-ocr against the Node.js version of your Electron installation.

Usage

Recognize an Image Buffer

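import path from 'path'
import fs from 'fs-extra'
import {
  recognize
} from 'node-native-ocr'

// `__dirname` exists under CommonJS or a transpiler such as Babel;
// in native ES modules, derive the directory from `import.meta.url` instead.
const filepath = path.join(__dirname, 'test', 'fixtures', 'node-native-ocr.jpg')

// Read the image into a Buffer, run OCR on it, and print the result.
fs.readFile(filepath).then(recognize).then(console.log) // 'node-native-ocr'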

recognize(image [, options])

  • image Buffer the content buffer of the image file.
  • options node-native-ocrOptions= optional

Returns Promise.<String> that resolves to the recognized text on success.

node-native-ocrOptions Object

{
  // @type `(String|Array.<String>)=eng`,
  //
  // Specifies language(s) used for OCR.
  //   Run `tesseract --list-langs` in command line for all supported languages.
  //   Defaults to `'eng'`.
  //
  // To specify multiple languages, use an array.
  //   English and Simplified Chinese, for example:
  // ```
  // lang: ['eng', 'chi_sim']
  // ```
  lang: 'eng'
}
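As a minimal sketch of how the `lang` option might be assembled from user input, the helper below accepts either a single language or a list and returns an options object in the shape described above. `buildOptions` is a hypothetical name introduced for illustration; it is not part of node-native-ocr.

```javascript
// Hypothetical helper, not part of node-native-ocr: normalizes a language
// choice into the options object described above.
function buildOptions (lang = 'eng') {
  const langs = Array.isArray(lang) ? lang : [lang]
  if (langs.length === 0 || !langs.every(l => typeof l === 'string' && l.length > 0)) {
    throw new TypeError('lang must be a non-empty string or an array of non-empty strings')
  }
  // A single language is passed as a plain string, several as an array.
  return { lang: langs.length === 1 ? langs[0] : langs }
}

// Usage, assuming `recognize` is imported from 'node-native-ocr' and
// `imageBuffer` is a Buffer holding the image file:
//   recognize(imageBuffer, buildOptions(['eng', 'chi_sim'])).then(console.log)
```

Validating the option before calling `recognize` keeps malformed input from reaching the native layer, where a failure may be harder to diagnose.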

Promise.reject(error)

  • error Error The JavaScript Error instance
    • code String Error code.
    • message String Error message.
    • other properties of Error.

code: ERR_READ_IMAGE

Rejects if it fails to read image data from file or buffer.

code: ERR_INIT_TESSER

Rejects if Tesseract fails to initialize.
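The two error codes above can be handled by inspecting `error.code` on the rejection. The following is a sketch only: `describeOcrError` and `recognizeOrNull` are hypothetical helpers, the hint texts are guesses at likely causes, and `recognizeFn` is assumed to be the `recognize` export, passed in so the wrapper does not depend on how the module is loaded.

```javascript
// Hints keyed by the documented error codes. The wording of each hint is an
// assumption about likely causes, not text produced by the library.
const OCR_ERROR_HINTS = {
  ERR_READ_IMAGE: 'the image data could not be read; check that the buffer holds a valid image file',
  ERR_INIT_TESSER: 'Tesseract failed to initialize; check that the requested language data is installed'
}

// Hypothetical helper: turns a rejection into a readable message.
function describeOcrError (error) {
  const hint = OCR_ERROR_HINTS[error.code]
  return hint ? `${error.code}: ${hint}` : `Unexpected error: ${error.message}`
}

// Hypothetical wrapper: resolves to the recognized text, or to null after
// logging a description of the failure.
function recognizeOrNull (recognizeFn, image, options) {
  return recognizeFn(image, options).catch(error => {
    console.error(describeOcrError(error))
    return null
  })
}

// Usage, assuming `recognize` is imported from 'node-native-ocr':
//   recognizeOrNull(recognize, imageBuffer, { lang: 'eng' }).then(text => { /* ... */ })
```

Resolving to `null` is one possible design; callers that need to distinguish the failure modes can branch on `error.code` directly instead.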

Example of Using with Electron

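// For details of `mainWindow: BrowserWindow`, see
// https://github.com/electron/electron/blob/master/docs/api/browser-window.md
//
// Note: this is the callback form of `capturePage`; newer Electron releases
// return a Promise that resolves to the captured image instead.
mainWindow.capturePage({
  x: 10,
  y: 10,
  width: 100,
  height: 10
}, (data) => {
  // `data` is a NativeImage; convert it to a PNG buffer before recognition.
  recognize(data.toPNG()).then(console.log)
})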

Compiling Troubles

macOS users who run into trouble when compiling can resolve most problems by running the following command:

$ xcode-select --install

If you see the following error:

xcode-select: error: tool 'xcodebuild' requires Xcode, but active developer directory '/Library/Developer/CommandLineTools' is a command line tools instance

point xcode-select at the full Xcode installation:

$ sudo xcode-select -s /Applications/Xcode.app/Contents/Developer

License

MIT
