Rosetta

I found it hard to reproduce state-of-the-art seq2seq models. For one thing, many papers are published without code. For another, code examples from the web are often outdated (written for older versions of tensorflow, keras, python, ...), incomplete or not runnable standalone (missing important details), bloated (with lots of additional stuff), or focused on a single technique without combining it with others (for example, an attention model without beam search). This project is my attempt to learn and apply the state-of-the-art techniques step by step.

Roadmap

for toy problems, machine translation, summarization and chatbots
in Keras first, then Tensorflow, maybe PyTorch / Tensorflow Hub / tf.keras
from the ground up: start with a simple model, then add higher-level approaches (byte pair encoding, beam search, attention, ...)

I don't explain much here; I concentrate on implementation details. There are a lot of better tutorials elsewhere for understanding seq2seq models and their terminology.

Models step by step:

Model weights:

I saved the weights for most models in a Google Drive folder. In addition, the file URLs are included in the notebooks as comments.

Usage / Installation

I'm using Python 3.6 with tensorflow 1.8.0 and keras 2.2.0. For details, see the Pipfile.lock.
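If you want to confirm that your environment matches the versions named above, a small check like the following may help. This is only a sketch and not part of the repository: the names `EXPECTED`, `check_versions` and `installed_versions` are made up for illustration, and it assumes each package exposes a `__version__` attribute (tensorflow and keras do). Pipfile.lock remains the authoritative list.

```python
import importlib

# Versions stated in this README (assumption: Pipfile.lock pins the same ones).
EXPECTED = {"tensorflow": "1.8.0", "keras": "2.2.0"}


def installed_versions(names):
    """Return {module name: __version__} for every module that can be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # not installed: leave it out of the result
        version = getattr(module, "__version__", None)
        if version is not None:
            versions[name] = version
    return versions


def check_versions(installed, expected=EXPECTED):
    """Return a list of readable mismatch messages (empty if all versions match)."""
    problems = []
    for name, wanted in expected.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: no version found (expected {wanted})")
        elif found != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems
```

Inside the virtualenv, `check_versions(installed_versions(EXPECTED))` should return an empty list if the installed versions match the ones above; otherwise it lists each difference.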

I use pipenv to track all dependencies and create a virtualenv. Follow the instructions to install pipenv and then run

git clone git@github.com:hanfried/rosetta.git
cd rosetta

pipenv install --ignore-pipfile  # I haven't frozen the requirements in Pipfile, so this uses the exact versions from Pipfile.lock
pipenv run jupyter notebook

to start a Jupyter notebook environment with all required modules installed, running in a virtualenv.

