OpenSeq2Seq's main goal is to let researchers explore various sequence-to-sequence models as effectively as possible. It achieves this efficiency by fully supporting distributed and mixed-precision training. OpenSeq2Seq is built on TensorFlow and provides all the building blocks needed to train encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.
Documentation: https://nvidia.github.io/OpenSeq2Seq/
- Models for:
  - Neural Machine Translation
  - Automatic Speech Recognition
  - Speech Synthesis
  - Language Modeling
  - NLP tasks (sentiment analysis)
- Data-parallel distributed training
  - Multi-GPU
  - Multi-node
- Mixed precision training for NVIDIA Volta/Turing GPUs

Requirements:
- Python >= 3.5
- TensorFlow >= 1.10
- CUDA >= 9.0, cuDNN >= 7.0
- Horovod >= 0.13 (optional, but highly recommended for multi-GPU setups)

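
The minimum versions listed above can be checked before installation. The following is only a minimal sketch and not part of OpenSeq2Seq: `MINIMUM_VERSIONS`, `parse_version`, and `meets_minimum` are hypothetical names introduced for illustration, and the check covers only the lower bounds stated above. It avoids f-strings so that it also runs on Python 3.5.

```python
import sys

# Minimum versions copied from the requirements list above.
# (Hypothetical table for illustration; CUDA and cuDNN are not checked here.)
MINIMUM_VERSIONS = {
    "python": "3.5",
    "tensorflow": "1.10",
    "horovod": "0.13",
}


def parse_version(text):
    """Turn '1.10.0' into (1, 10, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:
            # A suffix such as '-rc1' ends the numeric part of the version.
            break
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically, so that '1.10' ranks above '1.9'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    python_version = "{}.{}".format(*sys.version_info[:2])
    ok = meets_minimum(python_version, MINIMUM_VERSIONS["python"])
    print("python {} meets minimum {}: {}".format(
        python_version, MINIMUM_VERSIONS["python"], ok))
```

Comparing tuples of integers rather than raw strings matters here because a plain string comparison would rank "1.9" above "1.10". For installed packages, the version string could be passed in from the package's `__version__` attribute, assuming the package exposes one.
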
The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.
The beam search decoder with language model re-scoring (in decoders) is based on Baidu DeepSpeech.
The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.
This is a research project, not an official NVIDIA product.

Related resources:
- Tensor2Tensor
- Neural Machine Translation (seq2seq) Tutorial
- OpenNMT
- Neural Monkey
- Sockeye
- TF-seq2seq
- Moses

If you use OpenSeq2Seq, please cite this paper:
@misc{openseq2seq,
    title={Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq},
    author={Oleksii Kuchaiev and Boris Ginsburg and Igor Gitman and Vitaly Lavrukhin and Jason Li and Huyen Nguyen and Carl Case and Paulius Micikevicius},
    year={2018},
    eprint={1805.10387},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
