vocalize


Table of Contents

  1. About
  2. Team
  3. Dependencies
    1. Data Scraping
    2. Processing
  4. Running

About

Vocalize is a pronunciation trainer made for language learners.


Vocalize is an application that provides pronunciation training for language learners. The user selects the language they would like to practice, either English or Spanish, and is then presented with practice words. The user records their pronunciation and submits it for comparison against the average pronunciation of the word. The user's pronunciation is then graphed against the average pronunciation.

The average pronunciation of each word is created by feeding audio from YouTube videos into a custom audio processing pipeline. We first scrape audiobooks from YouTube and submit them to IBM Watson's Speech to Text API to transcribe the audio. We then use FFmpeg to create an audio file for each word in the audiobook. When a word appears multiple times, we average the word instances together using a custom Python module built on top of SciPy. We narrow the scope of our data by processing only the 1,000 most common words of each language. Once an average pronunciation has been created for a word, it is stored in Amazon S3.
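The word-splitting step can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: it assumes the transcription step yields a list of (word, start, end) timings in seconds, and the function name `build_cut_commands`, the `words` output folder, and the file naming scheme are all hypothetical. It only builds the FFmpeg command lines, using the standard `-i`, `-ss`, and `-to` options, and does not run them.

```python
from pathlib import Path


def build_cut_commands(source, timings, out_dir="words"):
    """Return one FFmpeg argument list per (word, start, end) timing.

    `timings` is an assumed format: tuples of (word, start_seconds, end_seconds).
    Output files are numbered per word, e.g. words/the/1.wav, words/the/2.wav.
    """
    commands = []
    counts = {}
    for word, start, end in timings:
        if end <= start:
            continue  # skip empty or inverted intervals
        key = word.lower()  # treat "The" and "the" as the same word
        counts[key] = counts.get(key, 0) + 1
        target = Path(out_dir) / key / f"{counts[key]}.wav"
        commands.append([
            "ffmpeg", "-y",          # overwrite existing output files
            "-i", str(source),       # input audiobook
            "-ss", f"{start:.2f}",   # start of the word, in seconds
            "-to", f"{end:.2f}",     # end of the word, in seconds
            str(target),
        ])
    return commands


# Hypothetical timings for illustration only.
timings = [("the", 0.50, 0.62), ("house", 0.62, 1.10), ("The", 3.00, 3.15)]
commands = build_cut_commands("audiobook.wav", timings)
```

To actually cut the files, each list could be passed to `subprocess.run` after creating the target's parent folder.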

Front End: React.js, React Native, Redux, D3.js
Back End: Node.js, Express, MongoDB, Amazon S3
Audio Processing: Python, SciPy, IBM Watson, FFmpeg
Testing: Chai, Mocha, pytest
Build Tools: Gulp, Browserify, Webpack
Deployment: Digital Ocean

Team

  • Product Owner: Eugene Krayni
  • Scrum Master: Andrew Pedley
  • Development Team Members: Luke Powell, Aaron Phillips, Alex Zywiak

Dependencies

Data Scraping

  • youtube-dl: brew install youtube-dl

Processing

Running

npm install
gulp build
node server.js

Data Scraping

The data scraping directory contains Node.js files that scrape YouTube videos (audiobooks) and produce a WAV file for each word.

npm install
node index.js scrape <youtube id> <language>

There is also a shell script, average.sh, that runs the Python scripts to average the words and writes the results to an 'averaged' folder.
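As a rough idea of what averaging word instances can look like, here is a minimal Python sketch. It is an assumption about the approach, not the project's module: it stretches every recording of a word to a common length by linear interpolation and then averages sample by sample. The function names `resample` and `average_instances` are hypothetical, and the sketch uses only the standard library. With SciPy available, `scipy.signal.resample` would be a natural replacement for the interpolation step.

```python
def resample(samples, length):
    """Linearly interpolate `samples` to exactly `length` points."""
    if length < 1 or not samples:
        raise ValueError("need at least one sample and a positive length")
    if len(samples) == 1 or length == 1:
        return [float(samples[0])] * length
    step = (len(samples) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        # Blend the two neighbouring samples by their distance to `pos`.
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


def average_instances(instances):
    """Average several recordings of one word into a single waveform."""
    if not instances:
        raise ValueError("need at least one recording")
    # Use the mean duration as the common length for all recordings.
    length = round(sum(len(s) for s in instances) / len(instances))
    aligned = [resample(s, length) for s in instances]
    return [sum(column) / len(aligned) for column in zip(*aligned)]


# Three toy "recordings" of the same word, as lists of sample values.
takes = [[0, 2, 4], [2, 4, 6], [4, 6, 8]]
average = average_instances(takes)
```

Stretching to a common length is the simplest way to line recordings up. It ignores differences in speaking rate within a word, so a real implementation may need a more careful alignment.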
