Introduction

In this project, I built several deep neural networks (a simple RNN, an RNN with Embedding, a Bidirectional RNN, an Encoder-Decoder RNN, and an Encoder-Decoder Bidirectional RNN with Embedding) that function as part of an end-to-end machine translation pipeline. The pipeline accepts English text as input and returns the French translation. It consists of three stages:

Data Preprocessing
Model Training
Model Predictions

Setup

This project requires GPU acceleration to run efficiently.

Local Machine (Option)

Run the project on a local machine only if it has a powerful GPU suited to deep learning, for instance an ASUS ROG GU502GV 15.6" gaming laptop (Intel Core i7, 16GB memory, NVIDIA GeForce RTX 2060, 1TB SSD + Optane).

Amazon Web Services (Option)

Launch a GPU EC2 instance. For example, you can launch the AWS Deep Learning AMI (Ubuntu 18.04) on a GPU instance.

Install

Python 3.5
NumPy
TensorFlow GPU 1.3.0
Keras 2.0.9
Jupyter
Cython

conda create --name machine-translation python=3.5 numpy
conda activate machine-translation
pip install Cython
pip install tensorflow-gpu==1.3.0
pip install keras==2.0.9
pip install jupyter
cd path/to/project
jupyter notebook

Machine Translation Pipeline

Data Preprocessing

Tokenization: Convert each word to an integer id
Padding: Pad the id sequences so they all have the same length
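
The two preprocessing steps above can be sketched in plain Python. This is only an illustration, not the notebook's code: the function names `build_vocab`, `tokenize`, and `pad`, the example sentences, and the choice to pad on the right with id 0 are all assumptions made for this sketch.

```python
from collections import Counter


def build_vocab(sentences):
    """Map each word to an integer id; id 0 is reserved for padding."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    # Most frequent words get the smallest ids; ties are broken alphabetically.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: i for i, word in enumerate(ordered, start=1)}


def tokenize(sentences, vocab):
    """Replace every word in every sentence with its integer id."""
    return [[vocab[w] for w in s.lower().split()] for s in sentences]


def pad(sequences, length=None, pad_id=0):
    """Right-pad (or truncate) every sequence to a common length."""
    if length is None:
        length = max(len(seq) for seq in sequences)
    return [seq[:length] + [pad_id] * (length - len(seq)) for seq in sequences]


# Hypothetical example sentences, not taken from the project's data set.
sentences = ["the quick brown fox", "the lazy dog", "the quick dog jumps"]
vocab = build_vocab(sentences)
ids = tokenize(sentences, vocab)
padded = pad(ids)
print(padded)  # [[1, 3, 4, 5], [1, 7, 2, 0], [1, 3, 2, 6]]
```

After padding, every row has the same length, so the batch can be stored as a single rectangular array and fed to a network.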

Model Training

The models that were built and then trained include:

Model 1: simple RNN
Model 2: RNN with Embedding
Model 3: Bidirectional RNN
Model 4: Encoder-Decoder RNN
Model 5: Custom Encoder-Decoder Bidirectional RNN with Embedding (Final Model)

For models 1 - 4, I trained for 10 epochs with 80% of the data used for training and 20% held out for validation. For model 5, I trained for 30 epochs with the same split. I increased the number of epochs for model 5 so that it correctly predicted both sentences used for testing.
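
As a rough illustration of the 80/20 split, the sketch below holds out the last 20% of the samples for validation. This mirrors how the `validation_split` argument of Keras's `fit` selects validation data (from the end of the data, before shuffling), but it is an assumption that the notebook used that option; the function name `split_train_validation` and the toy sentence pairs are hypothetical.

```python
def split_train_validation(samples, validation_split=0.2):
    """Hold out the last `validation_split` fraction of samples for validation."""
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be strictly between 0 and 1")
    n_validation = int(round(len(samples) * validation_split))
    split_at = len(samples) - n_validation
    return samples[:split_at], samples[split_at:]


# Ten hypothetical (English, French) pairs standing in for the real data set.
pairs = [("sentence %d" % i, "phrase %d" % i) for i in range(10)]
train, validation = split_train_validation(pairs, validation_split=0.2)
print(len(train), len(validation))  # 8 2
```

Validation accuracy is then measured only on the held-out pairs, which the model never sees during weight updates.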

Model Predictions

Each model was trained to translate English text to French. I report the validation accuracy because it is a good indicator of how well each model translates English text it has not seen before.

Model	Validation Accuracy (%)
Model 1	61.85
Model 2	79.47
Model 3	67.07
Model 4	68.23
Model 5	95.32

The table shows that Model 5 (Custom Encoder-Decoder Bidirectional RNN with Embedding, the final model) has the highest validation accuracy, so it should perform best when translating English text it has not seen before into French. In the project's tests, Model 5 was also the most accurate.
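
To make the prediction step concrete, here is a minimal sketch of how a model's output can be turned back into French text, assuming the model emits one score per vocabulary word at each time step. The function name `decode_prediction`, the tiny vocabulary, and the scores are hypothetical and chosen for illustration only; the project's own helper may differ.

```python
def decode_prediction(scores, index_to_word, pad_id=0):
    """Pick the highest-scoring word id at each time step and join the words."""
    words = []
    for step_scores in scores:
        # Argmax over the vocabulary dimension for this time step.
        best_id = max(range(len(step_scores)), key=step_scores.__getitem__)
        if best_id == pad_id:
            continue  # padding positions carry no word
        words.append(index_to_word.get(best_id, "<UNK>"))
    return " ".join(words)


# Hypothetical id-to-word mapping; id 0 is the padding id.
index_to_word = {1: "le", 2: "chat", 3: "dort"}

# One row of scores per time step, one column per vocabulary id (0..3).
scores = [
    [0.10, 0.70, 0.10, 0.10],
    [0.00, 0.20, 0.60, 0.20],
    [0.10, 0.10, 0.10, 0.70],
    [0.90, 0.05, 0.03, 0.02],
]
print(decode_prediction(scores, index_to_word))  # le chat dort
```

Taking the argmax at each step independently is the simplest decoding strategy; it is shown here because it is easy to follow, not because the source states which strategy was used.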

Future Enhancements

Add the ability to translate from English text to other languages, such as Spanish or Japanese
Develop a web application or mobile application
Integrate with speech recognition to translate spoken English into another language, such as French or Spanish
