# Bark 🐶

Bark is a multilingual TTS model created by [Suno-AI](https://www.suno.ai/). It can generate conversational speech as well as music and sound effects.
It is architecturally very similar to Google's [AudioLM](https://arxiv.org/abs/2209.03143). For more information, please refer to [Suno-AI's repo](https://github.com/suno-ai/bark).

## Acknowledgements
- 👑[Suno-AI](https://www.suno.ai/) for training and open-sourcing this model.
- 👑[serp-ai](https://github.com/serp-ai/bark-with-voice-clone) for controlled voice cloning.

## Example Use

```python
from TTS.tts.configs.bark_config import BarkConfig
from TTS.tts.models.bark import Bark

text = "Hello, my name is Manmay, how are you?"

config = BarkConfig()
model = Bark.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="path/to/model/dir/", eval=True)

# with random speaker
output_dict = model.synthesize(text, config, speaker_id="random", voice_dirs=None)

# cloning a speaker.
# It assumes that you have a speaker file in `bark_voices/speaker_n/speaker.wav` or `bark_voices/speaker_n/speaker.npz`
output_dict = model.synthesize(text, config, speaker_id="ljspeech", voice_dirs="bark_voices/")
```
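
The speaker file layout mentioned in the cloning comment can be checked before calling `synthesize`. The following is a minimal sketch using only the standard library. `find_speaker_file` is a hypothetical helper and not part of 🐸TTS; it assumes the folder under the voice directory is named after the `speaker_id`, and it prefers a `speaker.npz` over a `speaker.wav` when both exist (also an assumption made for illustration).

```python
from pathlib import Path
from typing import Optional


def find_speaker_file(voice_dir: str, speaker_id: str) -> Optional[Path]:
    """Return the speaker file for `speaker_id`, or None if none exists.

    Hypothetical helper, not part of TTS. Assumes the layout
    `<voice_dir>/<speaker_id>/speaker.npz` or `.../speaker.wav`.
    """
    speaker_dir = Path(voice_dir) / speaker_id
    # Assumption: check the .npz file first, then fall back to the .wav clip.
    for name in ("speaker.npz", "speaker.wav"):
        candidate = speaker_dir / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    path = find_speaker_file("bark_voices/", "ljspeech")
    if path is None:
        print("No speaker file found under bark_voices/ljspeech/")
    else:
        print(f"Found speaker file: {path}")
```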

Using the 🐸TTS API:

```python
from TTS.api import TTS

# Load the model to GPU
# Bark is really slow on CPU, so we recommend using GPU.
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)

# Cloning a new speaker
# This expects to find an mp3 or wav file like `bark_voices/new_speaker/speaker.wav`
# (the folder name is the value passed as `speaker`, so `ljspeech` below).
# It computes the cloning values and stores them in `bark_voices/new_speaker/speaker.npz`
tts.tts_to_file(text="Hello, my name is Manmay, how are you?",
                file_path="output.wav",
                voice_dir="bark_voices/",
                speaker="ljspeech")

# When you run it again it uses the stored values to generate the voice.
tts.tts_to_file(text="Hello, my name is Manmay, how are you?",
                file_path="output.wav",
                voice_dir="bark_voices/",
                speaker="ljspeech")

# random speaker
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)
tts.tts_to_file("hello world", file_path="out.wav")
```
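
Before cloning, the reference clip has to sit in the folder the comments above describe. One way to set that up is sketched below with the standard library only. `add_speaker_clip` is a hypothetical helper, not a 🐸TTS function; it assumes the clip should be stored as `speaker.wav` or `speaker.mp3` inside a folder named after the speaker.

```python
import shutil
from pathlib import Path

# File types the cloning comments mention (mp3 or wav).
ALLOWED_SUFFIXES = {".wav", ".mp3"}


def add_speaker_clip(clip_path: str, voice_dir: str, speaker_name: str) -> Path:
    """Copy a reference clip to `<voice_dir>/<speaker_name>/speaker.<ext>`.

    Hypothetical helper, not part of TTS. Returns the destination path.
    """
    src = Path(clip_path)
    suffix = src.suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"Expected a .wav or .mp3 clip, got: {src.name}")

    # Create the per-speaker folder if it does not exist yet.
    speaker_dir = Path(voice_dir) / speaker_name
    speaker_dir.mkdir(parents=True, exist_ok=True)

    dest = speaker_dir / f"speaker{suffix}"
    shutil.copyfile(src, dest)
    return dest
```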

Using the 🐸TTS command line:

```console
# cloning the `ljspeech` voice
tts --model_name tts_models/multilingual/multi-dataset/bark \
    --text "This is an example." \
    --out_path "output.wav" \
    --voice_dir bark_voices/ \
    --speaker_idx "ljspeech" \
    --progress_bar True

# Random voice generation
tts --model_name tts_models/multilingual/multi-dataset/bark \
    --text "This is an example." \
    --out_path "output.wav" \
    --progress_bar True
```
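
When the command line is driven from a script, it can help to assemble the argument list in one place. The sketch below only builds the list, using the same flags as the commands above; `build_bark_command` is a hypothetical helper, and actually running the result (for example with `subprocess.run`) assumes the `tts` executable is installed and on the `PATH`.

```python
import shlex
from typing import List, Optional

BARK_MODEL = "tts_models/multilingual/multi-dataset/bark"


def build_bark_command(
    text: str,
    out_path: str,
    voice_dir: Optional[str] = None,
    speaker: Optional[str] = None,
) -> List[str]:
    """Build the `tts` argument list for Bark (hypothetical helper).

    With both `voice_dir` and `speaker` set, the cloning flags are added;
    with neither, the command corresponds to random voice generation.
    """
    if (voice_dir is None) != (speaker is None):
        raise ValueError("Pass both voice_dir and speaker, or neither.")

    cmd = ["tts", "--model_name", BARK_MODEL, "--text", text, "--out_path", out_path]
    if voice_dir is not None and speaker is not None:
        cmd += ["--voice_dir", voice_dir, "--speaker_idx", speaker]
    cmd += ["--progress_bar", "True"]
    return cmd


if __name__ == "__main__":
    cmd = build_bark_command(
        "This is an example.", "output.wav", voice_dir="bark_voices/", speaker="ljspeech"
    )
    # Print the shell-quoted command; to execute it, one could call
    # subprocess.run(cmd, check=True) instead.
    print(shlex.join(cmd))
```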

## Important resources & papers
- Original Repo: https://github.com/suno-ai/bark
- Cloning implementation: https://github.com/serp-ai/bark-with-voice-clone
- AudioLM: https://arxiv.org/abs/2209.03143

## BarkConfig
```{eval-rst}
.. autoclass:: TTS.tts.configs.bark_config.BarkConfig
    :members:
```

## BarkArgs
```{eval-rst}
.. autoclass:: TTS.tts.models.bark.BarkArgs
    :members:
```

## Bark Model
```{eval-rst}
.. autoclass:: TTS.tts.models.bark.Bark
    :members:
```