Speech is the most natural communication mode for human beings. Speech recognition is the task of converting speech into a sequence of words by a computer program. Speech recognition applications let people use speech as an additional input mode and interact with software easily and effectively. Speech recognition interfaces in a native language also enable illiterate and semi-literate people to use the technology without knowing how to operate a computer keyboard or stylus. For more than three decades, a great amount of research has been carried out on various aspects of speech recognition and its applications, and today many products successfully use automatic speech recognition for communication between humans and machines. However, the performance of speech recognition applications deteriorates in the presence of reverberation and even low levels of ambient noise. Robustness to noise, reverberation, and transducer characteristics remains an unsolved problem, which keeps research in speech recognition very active.
Speech recognition technology allows hands-free control of smartphones, speakers, and even vehicles in a wide variety of languages. Companies have moved towards the goal of enabling machines to understand and respond to more and more of our verbalized commands. Many mature speech recognition systems are available, such as Google Assistant, Amazon Alexa, and Apple’s Siri. However, all of those voice assistants support only a limited set of languages.
The World Food Program wants to deploy an intelligent form that collects nutritional information about food bought and sold at markets in two African countries: Ethiopia and Kenya. The design of this intelligent form requires selected people to install an app on their mobile phones; whenever they buy food, they use their voice to activate the app and register the list of items they just bought in their own language. The intelligent system in the app is expected to transcribe the speech to text live and organize the information in an easy-to-process way in a database.
Our responsibility was to build a deep learning model capable of transcribing speech to text in the Amharic language. The model we produce should be accurate and robust against background noise.
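To make that deliverable concrete, below is a minimal sketch of the kind of acoustic model such a system is typically built around: log-Mel features in, per-frame character probabilities out, decoded with CTC. The feature dimension, layer sizes, and alphabet size are illustrative placeholders, not the project's actual configuration.

```python
# Minimal sketch of a CTC-based acoustic model (placeholder sizes, not the
# project's actual architecture).
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_MEL_BINS = 80   # assumed feature dimension (log-Mel filterbanks)
NUM_CHARS = 60      # assumed Amharic character set size, including the CTC blank

def build_acoustic_model():
    # (time, features) -> per-frame character probabilities
    inputs = layers.Input(shape=(None, NUM_MEL_BINS), name="features")
    x = layers.Bidirectional(layers.LSTM(256, return_sequences=True))(inputs)
    x = layers.Bidirectional(layers.LSTM(256, return_sequences=True))(x)
    outputs = layers.Dense(NUM_CHARS, activation="softmax", name="char_probs")(x)
    return models.Model(inputs, outputs)

model = build_acoustic_model()
model.summary()

# Dummy forward pass on a ~3-second utterance (about 300 frames at a 10 ms hop).
dummy_features = np.random.randn(1, 300, NUM_MEL_BINS).astype("float32")
frame_probs = model(dummy_features)            # shape: (1, 300, NUM_CHARS)

# Greedy CTC decoding of the frame-level probabilities.
decoded, _ = tf.keras.backend.ctc_decode(
    frame_probs, input_length=np.array([300]), greedy=True)
print(decoded[0].numpy())                      # indices into the character set
```

In a full system the decoded indices would be mapped back to Amharic characters through the training vocabulary, and the model would be trained with CTC loss on the collected recordings.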
conda create --name mlenv python==3.7.5
conda activate mlenv
git clone https://github.com/week4-SpeechRecognition/Speech-to-Text.git
cd Speech-to-Text
sudo python3 setup.py install
docker pull abelblue/api:1.0
git checkout -b backend
docker run abelblue/api:1.0
- images/: the folder where all snapshots for the project are stored.
- data/: the folder where the versioned dataset pointer files (*.dvc) are stored.
- .dvc/: the folder where DVC is configured for data version control.
- .github/: the folder where GitHub Actions and the CML workflow are integrated.
- .vscode/: the folder where local path fixes are stored.
- models/: the folder where model pickle files are stored (a loading sketch follows this list).
- notebooks/: includes all notebooks for deep learning and metadata.
- *.py: scripts for modularization, logging, and packaging.
- requirements.txt: a text file listing the project's dependencies.
- README.md: Markdown text with a brief explanation of the project and the repository structure.
- Dockerfile: lets users create an automated build that executes several command-line instructions in a container.
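As a usage note for the models/ folder, a saved checkpoint can be reloaded for inference roughly as follows; `model.pkl` is a hypothetical file name standing in for whichever pickle the training scripts actually produce.

```python
import pickle

# Hypothetical checkpoint name; substitute the actual file found in models/.
MODEL_PATH = "models/model.pkl"

with open(MODEL_PATH, "rb") as f:
    model = pickle.load(f)

# The loaded object exposes whatever interface was saved by the training
# scripts (for example a Keras model wrapper), ready for transcription.
print(type(model))
```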
Made with contributors-img.