This project develops a recurrent neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline. It converts raw audio from the LibriSpeech ASR corpus into spectrogram or MFCC feature representations and uses them to automatically generate transcribed text.
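To illustrate the feature-extraction step, here is a minimal spectrogram sketch using only NumPy. The window and hop sizes are illustrative assumptions, not the project's actual settings, and the project's own feature code may differ:

```python
import numpy as np

def spectrogram(signal, window=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform.

    A minimal sketch: frames the signal, applies a Hann window,
    and takes the magnitude of the real FFT of each frame.
    """
    frames = []
    for start in range(0, len(signal) - window + 1, hop):
        frame = signal[start:start + window] * np.hanning(window)
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)  # shape: (num_frames, window // 2 + 1)

# Demonstrate on a synthetic 440 Hz tone sampled at 16 kHz
t = np.arange(16000) / 16000.0
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
```

With a 256-sample window at 16 kHz, each frequency bin spans 62.5 Hz, so the 440 Hz tone's energy concentrates near bin 7.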
Download the LibriSpeech `dev-clean` and `test-clean` sets and extract them into:

- data
  - LibriSpeech
Clone the project repository:

```shell
git clone https://github.com/sdonatti/nd892-project-dnn-speech-recognizer
```
Install the required Python packages:

```shell
cd nd892-project-dnn-speech-recognizer
conda env create -f environment.yaml
conda activate nd892-project-dnn-speech-recognizer
```
Prepare the datasets (convert FLAC audio to WAV and generate corpus descriptor files):

```shell
python flac_to_wav.py data/LibriSpeech/dev-clean
python flac_to_wav.py data/LibriSpeech/test-clean
python create_desc_json.py data/LibriSpeech/dev-clean train_corpus.json
python create_desc_json.py data/LibriSpeech/test-clean valid_corpus.json
```
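Corpus descriptor files of this kind typically hold one JSON object per line. A hedged sketch of parsing such a file, assuming each record carries a `key` (WAV path), `duration` (seconds), and `text` (transcript); adjust the keys if your descriptor files differ:

```python
import json

# Sample record in an assumed JSON-lines layout; the path and
# transcript below are illustrative, not taken from the corpus.
sample = [
    '{"key": "data/LibriSpeech/dev-clean/84/121123/84-121123-0000.wav",'
    ' "duration": 2.61, "text": "go do you hear"}',
]

def parse_corpus(lines):
    """Parse JSON-lines corpus records into (wav_path, duration, transcript)."""
    return [(r["key"], r["duration"], r["text"])
            for r in map(json.loads, lines)]

utterances = parse_corpus(sample)
```

In practice you would pass the lines of `train_corpus.json` or `valid_corpus.json` to `parse_corpus` instead of the in-memory sample.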
Launch the project Jupyter Notebook:

```shell
jupyter notebook vui_notebook.ipynb
```
This project is licensed under the MIT License.