APAM toolkit is built on PyTorch and provides recipes to adapt pretrained acoustic models with a variety of sequence discriminative training criterions.
- Table of contents
- Introduction
- High-Level Library Structure
- Installation
- Pretrained Models Supported
- Current Recipes
- References
- Citation
The library structure is inspired from the S3PRL library. In keeping up with the terminology in S3PRL, the pretrained models are referred to as upstream models. A separate downstream model is added on the top of upstream model to be used as acoustic model for ASR training.
The library provides various runners (trainers) that take care of training acoustic models.
The runner takes as input:
asr_config: which defines parameters related to experiment such as learning rate, optimizers, epochs etc. ckpt: path to pretrained model ckpt upconfig configuration related to pretrained upstream model. get_model function which create the upstream and downstream model using the above parameters
The idea is to re-use the various pretrained models such as TERA, wav2vec through decoupled upstream and downstream models. This is enabled by writing simple scripts to load these pretrained models. Examples for these can be found in pretrained folder in the source code.
- Python 3 or above
- Required packages and their use are listed below:
torch # deep neural networks
pytorch-fast-transformers # fast clustered attention
pkwrap # lfmmi loss
librosa # audio file reading
yaml # config parser
We recommend installing the latest version of fast transformers using the following command:
pip install git+https://github.com/idiap/fast-transformers
To install Pkwrap follow the instructions here Pkwrap
At the moment we support the following pretrained models
We provide the pretrained model for trained with masked language modeling objective as described in "TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech".
The pretrained model is available here.
At the moment, we only support flat-start lattice-free MMI training. The following recipes can be found in the examples folder. For more details on how to run, follow the steps in the README files in examples
We provide recipes to train acoustic model using 100 hours of librispeech data and pretrained acoustic models based on
- Masked Acoustic Model
If you found this library useful, please cite the relevant work(s) from below
@misc{vyas2020latticefree,
title={Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models},
author={Apoorv Vyas and Srikanth Madikeri and Hervé Bourlard},
year={2020},
eprint={2012.14252},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Please note that this list is not exhaustive. We are only providing references to a few key works which this library uses. For a more exhaustive list please take a look at our published reports based on this library.
@inproceedings{paszke2019pytorch,
title = {PyTorch: An Imperative Style, High-Performance Deep Learning Library},
author = {Paszke, Adam et. al.},
booktitle = {Advances in Neural Information Processing Systems 32},
year = {2019},
}
@article{hadian2018flat,
author={Hossein Hadian and others},
title={Flat-Start Single-Stage Discriminatively Trained HMM-Based Models for ASR},
year={2018},
journal={IEEE ACM Transactions on Audio, Speech, and Language Processing},
}
@misc{madikeri2020pkwrap,
title={Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models},
author={Srikanth Madikeri and Sibo Tong and Juan Zuluaga-Gomez and Apoorv Vyas and Petr Motlicek and Herv{\'e} Bourlard},
year={2020},
eprint={2010.03466},
archivePrefix={arXiv},
primaryClass={eess.AS}
}
@inproceedings{vyas2020fast,
author = {Vyas, Apoorv and Katharopoulos, Angelos and Fleuret, Fran\c{c}ois},
title = {Fast Transformers with Clustered Attention},
booktitle = {Proceedings of the international conference on Neural Information Processing Systems (NeurIPS)},
year = {2020}
}
@misc{
S3PRL,
author = {Andy T. Liu and Yang Shu-wen},
title = {S3PRL: The Self-Supervised Speech Pre-training and Representation Learning Toolkit},
year = {2020},
publisher = {GitHub},
journal = {GitHub repository},
url = {https://github.com/s3prl/s3prl}
}