State-of-the-art overview of seq2seq models (step by step)

hanfried/rosetta

Rosetta

I found it hard to reproduce state-of-the-art seq2seq models. Many papers come without code, and code examples from the web are often outdated (using older versions of tensorflow, keras, python, ...), incomplete or not runnable standalone (missing important details), bloated (with lots of additional stuff), or focused on a single technique without combining it with others (such as an attention model without beam search). This project is my attempt to learn and apply the state-of-the-art techniques step by step.

Roadmap

  • toy problems, machine translation, summarization and chatbots
  • in Keras first, then Tensorflow, maybe PyTorch / Tensorflow Hub / tf.keras
  • starting from a simple model built from the ground up, then adding higher-level approaches (byte-pair encodings, beam search, attention, ...)
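As an illustration of the byte-pair encoding idea named in the roadmap, here is a minimal sketch in plain Python. It is not the code from the notebooks: the function name `learn_bpe` and the `</w>` end-of-word marker are assumptions chosen for this example. It repeatedly merges the most frequent pair of adjacent symbols, which is the core of the technique.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to num_merges byte-pair merges from a list of words (toy sketch)."""
    # Each word becomes a tuple of single characters plus an end-of-word marker.
    vocab = Counter(tuple(word) + ("</w>",) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often the word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, replacing the best pair with one merged symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab


if __name__ == "__main__":
    merges, vocab = learn_bpe(["abab", "abab", "ab"], num_merges=2)
    print(merges)
    print(dict(vocab))
```

The learned merge list can then be replayed in order on unseen words to split them into subword units, which is what makes the vocabulary size controllable.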

I'm not explaining a lot; I concentrate on implementation details here. There are a lot of better tutorials elsewhere for understanding seq2seq models and their terminology.

Models step by step:

  1. Simple model for adding and subtracting numbers end-to-end on chars
  2. Simple char-level end-to-end model for Machine Translation
  3. Byte-pair-encoding embeddings instead of chars for Machine Translation
  4. Implementing a BeamSearch model
  5. BeamSearch model trained on a larger dataset
  6. Attention model with Tensorflow trained on a larger dataset
  7. Attention model trained on the full en-de European Parliament dataset
  8. Multi-layer attention model on the full en-de dataset
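To make the beam search steps in the list more concrete, here is a minimal sketch in plain Python, independent of Keras or Tensorflow. It is not the implementation from the notebooks: `beam_search` and the `step_fn` callback (which stands in for a decoder and returns next-token probabilities for a prefix) are hypothetical names for this example, and the sketch applies no length normalization.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    """Return (sequence, log_probability) of the best hypothesis found.

    step_fn(prefix) is a hypothetical stand-in for a decoder: it takes a
    tuple of tokens and returns a dict {next_token: probability}.
    """
    beams = [((start_token,), 0.0)]  # open hypotheses with their log-probabilities
    finished = []                    # hypotheses that ended with end_token
    for _ in range(max_len):
        # Expand every open hypothesis by every possible next token.
        candidates = []
        for seq, score in beams:
            for token, prob in step_fn(seq).items():
                if prob > 0.0:
                    candidates.append((seq + (token,), score + math.log(prob)))
        if not candidates:
            break
        # Keep only the beam_width best expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == end_token:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    # Hypotheses still open at max_len are kept as a fallback.
    finished.extend(beams)
    return max(finished, key=lambda c: c[1])


def toy_step(prefix):
    """Toy distribution in which the greedy first choice is not the best overall."""
    table = {
        ("<s>",): {"a": 0.6, "b": 0.4},
        ("<s>", "a"): {"x": 0.5, "y": 0.5},
        ("<s>", "b"): {"z": 0.9, "x": 0.1},
    }
    return table.get(prefix, {"</s>": 1.0})


if __name__ == "__main__":
    print(beam_search(toy_step, "<s>", "</s>", beam_width=1))  # greedy decoding
    print(beam_search(toy_step, "<s>", "</s>", beam_width=2))
```

With `beam_width=1` the search is greedy and commits to the locally most probable first token; a wider beam keeps the alternative alive long enough to find the sequence with the higher overall probability.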

Model weights:

I saved the weights for most models in a Google Drive folder. The file URLs are also included in the notebooks as comments.

Usage / Installation

I'm using Python 3.6 with tensorflow 1.8.0 and keras 2.2.0. For details, see the Pipfile.lock.

I use pipenv to track all dependencies and create a virtualenv. Follow the instructions to install pipenv and then run

git clone git@github.com:hanfried/rosetta.git
cd rosetta

pipenv install --ignore-pipfile  # The versions aren't pinned in the Pipfile, so this installs the exact versions from Pipfile.lock
pipenv run jupyter notebook

to start a jupyter notebook environment with all required modules installed and running in a virtualenv.
