Wav2vec2 in PaddlePaddle

This is a PaddlePaddle version of Facebook's wav2vec 2.0 [1], with code and pre-trained weights ported from Fairseq and Hugging Face.

Dependency

Install PaddlePaddle 2.0.1:

pip install paddlepaddle-gpu==2.0.1

Install PaddleAudio from source:

git clone https://github.com/PaddlePaddle/models
cd models/PaddleAudio
pip install -e .
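
To confirm that both dependencies ended up in the active environment, a small check along these lines may help. It is only a sketch: the helper `installed_version` is hypothetical, and the distribution names `paddlepaddle` (the CPU build) and `paddleaudio` are assumptions that may not match what pip actually registered.

```python
from importlib import metadata


def installed_version(*candidates):
    """Return the version of the first installed distribution, or None."""
    for name in candidates:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # try the next candidate name
    return None


# Distribution names other than "paddlepaddle-gpu" are assumptions.
paddle_version = installed_version("paddlepaddle-gpu", "paddlepaddle")
audio_version = installed_version("paddleaudio")
print("PaddlePaddle:", paddle_version or "not installed")
print("PaddleAudio:", audio_version or "not installed")
```

If PaddlePaddle is reported with a version other than 2.0.1, reinstalling the pinned version is probably the safest option.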

Supported configs

Name	Finetuning split	Dataset
wav2vec2-base-960h	960h	Librispeech
wav2vec2-large-960h	960h	Librispeech
wav2vec2-large-960h-lv60	960h	Librispeech + Libri-Light
wav2vec2-large-960h-lv60-self	960h	Librispeech + Libri-Light + Self Training
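
The config names appear to encode the model size, the fine-tuning split, and the optional Libri-Light and self-training variants. The sketch below decodes a name under that assumption. The function `describe_config` and its regular expression are hypothetical helpers inferred from the table, not part of this repository.

```python
import re

# Naming pattern inferred from the table above (an assumption).
_CONFIG_PATTERN = re.compile(
    r"^wav2vec2-(?P<size>base|large)-(?P<split>\d+h)"
    r"(?P<lv60>-lv60)?(?P<self_training>-self)?$"
)


def describe_config(name):
    """Split a config name into its parts, or raise ValueError."""
    match = _CONFIG_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized config name: {name!r}")
    return {
        "size": match.group("size"),
        "finetuning_split": match.group("split"),
        "libri_light": match.group("lv60") is not None,
        "self_training": match.group("self_training") is not None,
    }


print(describe_config("wav2vec2-base-960h"))
```

Checking a name this way before passing it to `--config` catches typos early, though the authoritative list of configs is whatever the repository code accepts.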

Quickstart

Clone the project:

git clone https://github.com/ranchlai/wav2vec2.paddle
cd wav2vec2.paddle

Run the speech recognition test on your audio file:

python test.py --device "gpu:0" --audio "LJ001-0186.wav" --config "wav2vec2-large-960h-lv60"

If the run succeeds, you will see output like this:

pred==> position of our society that a work of utility might be also a work of art if we cared to make it so
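
When the transcript is needed by another program, the `pred==>` line can be picked out of the captured output. The following is a minimal sketch, assuming the prefix is printed exactly as in the sample above. `extract_transcript` is a hypothetical helper, not something `test.py` provides.

```python
PRED_PREFIX = "pred==>"  # prefix taken from the sample output above


def extract_transcript(output):
    """Return the text after the first 'pred==>' line, or None if absent."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(PRED_PREFIX):
            return line[len(PRED_PREFIX):].strip()
    return None


sample = (
    "pred==> position of our society that a work of utility "
    "might be also a work of art if we cared to make it so\n"
)
print(extract_transcript(sample))
```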

If you do not have a GPU, or you run out of GPU memory, run on the CPU instead:

python test.py --device "cpu" --audio "LJ001-0186.wav" --config "wav2vec2-large-960h-lv60"
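
The GPU-then-CPU fallback can also be scripted. The sketch below builds the same command line as above and tries each device in turn. It assumes that `test.py` exits with a non-zero status when a device is unavailable or runs out of memory, which is not confirmed here. `build_command` and `transcribe` are hypothetical helpers.

```python
import subprocess
import sys


def build_command(audio, config, device):
    """Assemble the test.py invocation shown in the Quickstart."""
    return [sys.executable, "test.py",
            "--device", device, "--audio", audio, "--config", config]


def transcribe(audio, config, devices=("gpu:0", "cpu"), run=subprocess.run):
    """Try each device in order and return stdout of the first success.

    Assumes test.py signals failure through a non-zero exit code.
    """
    for device in devices:
        result = run(build_command(audio, config, device),
                     capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout
    raise RuntimeError("test.py failed on: " + ", ".join(devices))
```

Run from the repository root, `transcribe("LJ001-0186.wav", "wav2vec2-large-960h-lv60")` would first attempt `gpu:0` and fall back to `cpu`. The `run` parameter exists only so the function can be exercised without a real model.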

Reference

[1] Baevski, Alexei, et al. “Wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations.” Advances in Neural Information Processing Systems, vol. 33, 2020, pp. 12449–12460.
