VOICE GENERATION MODULE
The voice generation module plays a crucial role in the proposed system. Once an object has been detected, the person must be told that it lies in his or her path. The voice generation module does this by producing audio messages that blind users can easily understand. Its input is the class of the detected object and its relative distance from the person; its output is a spoken message. If the object is too close, the module says “Warning: The object (class of object) is very close to you. Stay alert!”. Otherwise, if the object is at a safer distance, it says “The object is at a safer distance”.

This is achieved with the help of the PyTorch, pyttsx3 and pytesseract libraries, of which pyttsx3 is the most important for voice generation. pyttsx3 is a text-to-speech conversion library for Python. It works offline and is compatible with both Python 2 and 3. On Windows, the “sapi5” driver provides two voices, one female and one male.

An application invokes the pyttsx3.init() factory function to get a reference to a pyttsx3.Engine instance. During construction, the engine initializes a pyttsx3.driver.DriverProxy object, which is responsible for loading a speech engine driver implementation from the pyttsx3.drivers module. After construction, the application uses the engine object to register and unregister event callbacks; produce and stop speech; get and set speech engine properties; and start and stop event loops.
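The warning logic and the pyttsx3 calls described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names build_message and speak, the 1.0 threshold, and the use of metres as the distance unit are all assumptions, since the text does not state them. Only pyttsx3.init(), setProperty(), say() and runAndWait() are real pyttsx3 calls.

```python
from typing import Optional

# Hypothetical threshold in metres; the source does not give a value or a unit.
CLOSE_DISTANCE_M = 1.0


def build_message(object_class: str, distance_m: float,
                  close_threshold_m: float = CLOSE_DISTANCE_M) -> str:
    """Return the sentence to be spoken for one detected object."""
    if distance_m <= close_threshold_m:
        # Object is too close: use the warning wording from the text.
        return (f"Warning: The {object_class} is very close to you. "
                "Stay alert!")
    # Object is far enough away: use the reassuring wording.
    return f"The {object_class} is at a safer distance."


def speak(message: str, rate: Optional[int] = None) -> None:
    """Speak `message` aloud with pyttsx3; blocks until speech finishes."""
    # Imported here so build_message can be used without pyttsx3 installed.
    import pyttsx3

    engine = pyttsx3.init()  # picks the platform driver, e.g. sapi5 on Windows
    if rate is not None:
        engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.say(message)    # queue the utterance
    engine.runAndWait()    # process the queue and wait for playback to end
```

With these helpers, a detection of a chair at an assumed 0.6 m would be announced by calling speak(build_message("chair", 0.6)). Keeping the message construction separate from the speech call makes the wording easy to check without audio hardware.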
Sometimes text embedded in an image must also be identified. Python-tesseract (pytesseract) is used for this purpose. Python-tesseract is an optical character recognition (OCR) tool for Python. OCR scans and analyzes an image, detects its text content, and encodes that text in a form the computer can process. The text embedded in images is thus recognized and “read” using Python-tesseract. For example, if there is a danger board on the road, the text and symbols printed on that board are identified from its image, and the voice generation module issues a warning to the person.

PyTorch is primarily a machine learning library. In this module it is applied to the audio domain: it helps in loading the voice file in standard mp3 format and in regulating the audio sample rate. It is thus used to manipulate properties of the sound such as frequency, wavelength and waveform. The many options available for audio synthesis can be seen by looking at the audio functions available for PyTorch.
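The sign-reading step described above (OCR on a danger board, followed by a spoken warning) might look something like the sketch below. The names read_sign_text and sign_warning, the keyword list, and the wording of the warning are hypothetical; the text does not say how a board is judged to be dangerous. pytesseract.image_to_string() is the real Python-tesseract call, and it needs Pillow and the Tesseract binary to be installed.

```python
import re
from typing import Optional

# Hypothetical keyword list; the source does not specify which words matter.
DANGER_WORDS = ("danger", "warning", "caution", "stop")


def read_sign_text(image_path: str) -> str:
    """Run Tesseract OCR on an image file and return the recognized text."""
    # Imported here so sign_warning can be used without the OCR stack.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def sign_warning(ocr_text: str) -> Optional[str]:
    """Return a warning sentence if the OCR text contains a danger word."""
    # Collapse line breaks and repeated spaces left by the OCR output.
    cleaned = " ".join(ocr_text.split())
    # Compare whole lowercase words so "stop" does not match "stopwatch".
    words = set(re.findall(r"[a-z]+", cleaned.lower()))
    if any(word in words for word in DANGER_WORDS):
        return f"Warning: A sign ahead reads: {cleaned}"
    return None
```

The string returned by sign_warning(read_sign_text(path)) can then be passed to the text-to-speech engine whenever it is not None.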