Commit

Merge pull request #7 from idiap/dev
v0.23.0
eginhard authored Apr 18, 2024
2 parents 3327b47 + 5527f70 commit 45abf5a
Showing 102 changed files with 801 additions and 540 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/docker.yaml
@@ -10,15 +10,15 @@ on:
jobs:
docker-build:
name: "Build and push Docker image"
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
strategy:
matrix:
arch: ["amd64"]
base:
- "nvidia/cuda:11.8.0-base-ubuntu22.04" # GPU enabled
- "python:3.10.8-slim" # CPU only
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- name: Log in to the Container registry
uses: docker/login-action@v1
with:
@@ -29,11 +29,11 @@ jobs:
id: compute-tag
run: |
set -ex
base="ghcr.io/coqui-ai/tts"
base="ghcr.io/idiap/coqui-tts"
tags="" # PR build
if [[ ${{ matrix.base }} = "python:3.10.8-slim" ]]; then
base="ghcr.io/coqui-ai/tts-cpu"
base="ghcr.io/idiap/coqui-tts-cpu"
fi
if [[ "${{ startsWith(github.ref, 'refs/heads/') }}" = "true" ]]; then
6 changes: 3 additions & 3 deletions .github/workflows/pypi-release.yml
@@ -8,7 +8,7 @@ defaults:
bash
jobs:
build-sdist:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Verify tag matches version
@@ -33,7 +33,7 @@ jobs:
name: sdist
path: dist/*.tar.gz
build-wheels:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.9", "3.10", "3.11"]
@@ -55,7 +55,7 @@ jobs:
name: wheel-${{ matrix.python-version }}
path: dist/*-manylinux*.whl
publish-artifacts:
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
needs: [build-sdist, build-wheels]
environment:
name: release
4 changes: 2 additions & 2 deletions .github/workflows/style_check.yml
@@ -15,9 +15,9 @@ jobs:
python-version: [3.9]
experimental: [false]
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
architecture: x64
4 changes: 2 additions & 2 deletions CITATION.cff
@@ -10,8 +10,8 @@ authors:
version: 1.4
doi: 10.5281/zenodo.6334862
license: "MPL-2.0"
url: "https://github.com/eginhard/coqui-tts"
repository-code: "https://github.com/eginhard/coqui-tts"
url: "https://github.com/idiap/coqui-ai-TTS"
repository-code: "https://github.com/idiap/coqui-ai-TTS"
keywords:
- machine learning
- deep learning
24 changes: 12 additions & 12 deletions CONTRIBUTING.md
@@ -2,7 +2,7 @@

Welcome to the 🐸TTS!

This repository is governed by [the Contributor Covenant Code of Conduct](https://github.com/eginhard/coqui-tts/blob/main/CODE_OF_CONDUCT.md).
This repository is governed by [the Contributor Covenant Code of Conduct](https://github.com/idiap/coqui-ai-TTS/blob/main/CODE_OF_CONDUCT.md).

## Where to start.
We welcome everyone who likes to contribute to 🐸TTS.
@@ -15,13 +15,13 @@ If you like to contribute code, squash a bug but if you don't know where to star

You can pick something out of our road map. We keep the progess of the project in this simple issue thread. It has new model proposals or developmental updates etc.

- [Github Issues Tracker](https://github.com/eginhard/coqui-tts/issues)
- [Github Issues Tracker](https://github.com/idiap/coqui-ai-TTS/issues)

This is a place to find feature requests, bugs.

Issues with the ```good first issue``` tag are good place for beginners to take on.

-**PR**[pages](https://github.com/eginhard/coqui-tts/pulls) with the ```🚀new version``` tag.
-**PR**[pages](https://github.com/idiap/coqui-ai-TTS/pulls) with the ```🚀new version``` tag.

We list all the target improvements for the next version. You can pick one of them and start contributing.

@@ -46,14 +46,14 @@ Let us know if you encounter a problem along the way.

The following steps are tested on an Ubuntu system.

1. Fork 🐸TTS[https://github.com/eginhard/coqui-tts] by clicking the fork button at the top right corner of the project page.
1. Fork 🐸TTS[https://github.com/idiap/coqui-ai-TTS] by clicking the fork button at the top right corner of the project page.

2. Clone 🐸TTS and add the main repo as a new remote named ```upstream```.

```bash
$ git clone git@github.com:<your Github name>/coqui-tts.git
$ cd coqui-tts
$ git remote add upstream https://github.com/eginhard/coqui-tts.git
$ git clone git@github.com:<your Github name>/coqui-ai-TTS.git
$ cd coqui-ai-TTS
$ git remote add upstream https://github.com/idiap/coqui-ai-TTS.git
```

3. Install 🐸TTS for development.
@@ -124,22 +124,22 @@ The following steps are tested on an Ubuntu system.
13. Let's discuss until it is perfect. 💪

We might ask you for certain changes that would appear in the ✨**PR**'s page under 🐸TTS[https://github.com/eginhard/coqui-tts/pulls].
We might ask you for certain changes that would appear in the ✨**PR**'s page under 🐸TTS[https://github.com/idiap/coqui-ai-TTS/pulls].
14. Once things look perfect, We merge it to the ```dev``` branch and make it ready for the next version.
## Development in Docker container
If you prefer working within a Docker container as your development environment, you can do the following:
1. Fork 🐸TTS[https://github.com/eginhard/coqui-tts] by clicking the fork button at the top right corner of the project page.
1. Fork 🐸TTS[https://github.com/idiap/coqui-ai-TTS] by clicking the fork button at the top right corner of the project page.
2. Clone 🐸TTS and add the main repo as a new remote named ```upsteam```.
```bash
$ git clone git@github.com:<your Github name>/coqui-tts.git
$ cd coqui-tts
$ git remote add upstream https://github.com/eginhard/coqui-tts.git
$ git clone git@github.com:<your Github name>/coqui-ai-TTS.git
$ cd coqui-ai-TTS
$ git remote add upstream https://github.com/idiap/coqui-ai-TTS.git
```
3. Build the Docker Image as your development environment (it installs all of the dependencies for you):
33 changes: 17 additions & 16 deletions README.md
@@ -1,7 +1,8 @@

## 🐸Coqui.ai News
## 🐸Coqui TTS News
- 📣 Fork of the [original, unmaintained repository](https://github.com/coqui-ai/TTS). New PyPI package: [coqui-tts](https://pypi.org/project/coqui-tts)
- 📣 ⓍTTSv2 is here with 16 languages and better performance across the board.
- 📣 ⓍTTS fine-tuning code is out. Check the [example recipes](https://github.com/eginhard/coqui-tts/tree/dev/recipes/ljspeech).
- 📣 ⓍTTS fine-tuning code is out. Check the [example recipes](https://github.com/idiap/coqui-ai-TTS/tree/dev/recipes/ljspeech).
- 📣 ⓍTTS can now stream with <200ms latency.
- 📣 ⓍTTS, our production TTS model that can speak 13 languages, is released [Blog Post](https://coqui.ai/blog/tts/open_xtts), [Demo](https://huggingface.co/spaces/coqui/xtts), [Docs](https://coqui-tts.readthedocs.io/en/dev/models/xtts.html)
- 📣 [🐶Bark](https://github.com/suno-ai/bark) is now available for inference with unconstrained voice cloning. [Docs](https://coqui-tts.readthedocs.io/en/dev/models/bark.html)
@@ -11,7 +12,7 @@
<div align="center">
<img src="https://static.scarf.sh/a.png?x-pxid=cf317fe7-2188-4721-bc01-124bb5d5dbb2" />

## <img src="https://raw.githubusercontent.com/eginhard/coqui-tts/main/images/coqui-log-green-TTS.png" height="56"/>
## <img src="https://raw.githubusercontent.com/idiap/coqui-ai-TTS/main/images/coqui-log-green-TTS.png" height="56"/>


**🐸TTS is a library for advanced Text-to-Speech generation.**
@@ -25,14 +26,14 @@ ______________________________________________________________________

[![Discord](https://img.shields.io/discord/1037326658807533628?color=%239B59B6&label=chat%20on%20discord)](https://discord.gg/5eXr5seRrv)
[![License](<https://img.shields.io/badge/License-MPL%202.0-brightgreen.svg>)](https://opensource.org/licenses/MPL-2.0)
[![PyPI version](https://badge.fury.io/py/TTS.svg)](https://badge.fury.io/py/TTS)
[![Covenant](https://camo.githubusercontent.com/7d620efaa3eac1c5b060ece5d6aacfcc8b81a74a04d05cd0398689c01c4463bb/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f6e7472696275746f72253230436f76656e616e742d76322e3025323061646f707465642d6666363962342e737667)](https://github.com/eginhard/coqui-tts/blob/main/CODE_OF_CONDUCT.md)
[![Downloads](https://pepy.tech/badge/tts)](https://pepy.tech/project/tts)
[![PyPI version](https://badge.fury.io/py/coqui-tts.svg)](https://badge.fury.io/py/coqui-tts)
[![Covenant](https://camo.githubusercontent.com/7d620efaa3eac1c5b060ece5d6aacfcc8b81a74a04d05cd0398689c01c4463bb/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f6e7472696275746f72253230436f76656e616e742d76322e3025323061646f707465642d6666363962342e737667)](https://github.com/idiap/coqui-ai-TTS/blob/main/CODE_OF_CONDUCT.md)
[![Downloads](https://pepy.tech/badge/coqui-tts)](https://pepy.tech/project/coqui-tts)
[![DOI](https://zenodo.org/badge/265612440.svg)](https://zenodo.org/badge/latestdoi/265612440)

![GithubActions](https://github.com/eginhard/coqui-tts/actions/workflows/tests.yml/badge.svg)
![GithubActions](https://github.com/eginhard/coqui-tts/actions/workflows/docker.yaml/badge.svg)
![GithubActions](https://github.com/eginhard/coqui-tts/actions/workflows/style_check.yml/badge.svg)
![GithubActions](https://github.com/idiap/coqui-ai-TTS/actions/workflows/tests.yml/badge.svg)
![GithubActions](https://github.com/idiap/coqui-ai-TTS/actions/workflows/docker.yaml/badge.svg)
![GithubActions](https://github.com/idiap/coqui-ai-TTS/actions/workflows/style_check.yml/badge.svg)
[![Docs](<https://readthedocs.org/projects/coqui-tts/badge/?version=latest&style=plastic>)](https://coqui-tts.readthedocs.io/en/latest/)

</div>
@@ -49,8 +50,8 @@ Please use our dedicated channels for questions and discussion. Help is much mor
| 👩‍💻 **Usage Questions** | [GitHub Discussions] |
| 🗯 **General Discussion** | [GitHub Discussions] or [Discord] |

[github issue tracker]: https://github.com/eginhard/coqui-tts/issues
[github discussions]: https://github.com/eginhard/coqui-tts/discussions
[github issue tracker]: https://github.com/idiap/coqui-ai-TTS/issues
[github discussions]: https://github.com/idiap/coqui-ai-TTS/discussions
[discord]: https://discord.gg/5eXr5seRrv
[Tutorials and Examples]: https://github.com/coqui-ai/TTS/wiki/TTS-Notebooks-and-Tutorials

@@ -59,10 +60,10 @@ Please use our dedicated channels for questions and discussion. Help is much mor
| Type | Links |
| ------------------------------- | --------------------------------------- |
| 💼 **Documentation** | [ReadTheDocs](https://coqui-tts.readthedocs.io/en/latest/)
| 💾 **Installation** | [TTS/README.md](https://github.com/eginhard/coqui-tts/tree/dev#installation)|
| 👩‍💻 **Contributing** | [CONTRIBUTING.md](https://github.com/eginhard/coqui-tts/blob/main/CONTRIBUTING.md)|
| 💾 **Installation** | [TTS/README.md](https://github.com/idiap/coqui-ai-TTS/tree/dev#installation)|
| 👩‍💻 **Contributing** | [CONTRIBUTING.md](https://github.com/idiap/coqui-ai-TTS/blob/main/CONTRIBUTING.md)|
| 📌 **Road Map** | [Main Development Plans](https://github.com/coqui-ai/TTS/issues/378)
| 🚀 **Released Models** | [Standard models](https://github.com/eginhard/coqui-tts/blob/dev/TTS/.models.json) and [Fairseq models in ~1100 languages](https://github.com/eginhard/coqui-tts#example-text-to-speech-using-fairseq-models-in-1100-languages-)|
| 🚀 **Released Models** | [Standard models](https://github.com/idiap/coqui-ai-TTS/blob/dev/TTS/.models.json) and [Fairseq models in ~1100 languages](https://github.com/idiap/coqui-ai-TTS#example-text-to-speech-using-fairseq-models-in-1100-languages-)|
| 📰 **Papers** | [TTS Papers](https://github.com/erogol/TTS-papers)|

## Features
@@ -130,7 +131,7 @@ Please use our dedicated channels for questions and discussion. Help is much mor
You can also help us implement more models.

## Installation
🐸TTS is tested on Ubuntu 18.04 with **python >= 3.9, < 3.12.**.
🐸TTS is tested on Ubuntu 22.04 with **python >= 3.9, < 3.12.**.

If you are only interested in [synthesizing speech](https://coqui-tts.readthedocs.io/en/latest/inference.html) with the released 🐸TTS models, installing from PyPI is the easiest option.

@@ -141,7 +142,7 @@ pip install coqui-tts
If you plan to code or train models, clone 🐸TTS and install it locally.

```bash
git clone https://github.com/eginhard/coqui-tts
git clone https://github.com/idiap/coqui-ai-TTS
pip install -e .[all,dev,notebooks,server] # Select the relevant extras
```

2 changes: 1 addition & 1 deletion TTS/VERSION
@@ -1 +1 @@
0.22.1
0.23.0
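
The release workflow touched earlier in this commit has a "Verify tag matches version" step whose body is not shown in this view. A rough sketch of what such a check could look like, assuming the tag is either the bare version string or carries a `v` prefix; the function name `tag_matches_version` and the prefix handling are hypothetical and not taken from the workflow:

```python
import tempfile
from pathlib import Path


def tag_matches_version(tag: str, version_file: Path) -> bool:
    """Hypothetical helper: compare a git tag with the version stored in a file.

    A leading 'v' on the tag is ignored (an assumption, not confirmed by the diff).
    """
    version = version_file.read_text(encoding="utf-8").strip()
    return tag.removeprefix("v") == version  # str.removeprefix needs Python 3.9+


# Demonstration against a throwaway file that mimics TTS/VERSION after this commit.
with tempfile.TemporaryDirectory() as tmp:
    version_path = Path(tmp) / "VERSION"
    version_path.write_text("0.23.0\n", encoding="utf-8")
    ok = tag_matches_version("v0.23.0", version_path)
    stale = tag_matches_version("v0.22.1", version_path)
```

A check of this kind would fail a release when the tag and the version file drift apart, which is presumably why the version bump ships in the same commit as the release.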
7 changes: 5 additions & 2 deletions TTS/api.py
@@ -1,3 +1,4 @@
import logging
import tempfile
import warnings
from pathlib import Path
@@ -9,6 +10,8 @@
from TTS.utils.manage import ModelManager
from TTS.utils.synthesizer import Synthesizer

logger = logging.getLogger(__name__)


class TTS(nn.Module):
"""TODO: Add voice conversion and Capacitron support."""
@@ -59,7 +62,7 @@ def __init__(
gpu (bool, optional): Enable/disable GPU. Some models might be too slow on CPU. Defaults to False.
"""
super().__init__()
self.manager = ModelManager(models_file=self.get_models_file_path(), progress_bar=progress_bar, verbose=False)
self.manager = ModelManager(models_file=self.get_models_file_path(), progress_bar=progress_bar)
self.config = load_config(config_path) if config_path else None
self.synthesizer = None
self.voice_converter = None
@@ -122,7 +125,7 @@ def get_models_file_path():

@staticmethod
def list_models():
return ModelManager(models_file=TTS.get_models_file_path(), progress_bar=False, verbose=False).list_models()
return ModelManager(models_file=TTS.get_models_file_path(), progress_bar=False).list_models()

def download_model_by_name(self, model_name: str):
model_path, config_path, model_item = self.manager.download_model(model_name)
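
The `logger = logging.getLogger(__name__)` line in this file follows the standard-library convention of module-level loggers, and the `verbose=False` arguments are dropped from the `ModelManager` calls, so messages presumably now go through the logger hierarchy. A minimal sketch of how an application might opt in to them using only the standard `logging` module, assuming the package's loggers live under the name `TTS` (so `TTS.api` is what `__name__` resolves to here); the helper name `enable_package_logs` and the format string are illustrative and not part of the commit:

```python
import io
import logging


def enable_package_logs(name="TTS", level=logging.INFO, stream=None):
    """Attach a stream handler to the package-level logger `name`.

    Child loggers such as `TTS.api` propagate their records up to it.
    """
    package_logger = logging.getLogger(name)
    package_logger.setLevel(level)
    handler = logging.StreamHandler(stream)  # stream=None falls back to sys.stderr
    handler.setFormatter(logging.Formatter("%(name)s - %(levelname)s - %(message)s"))
    package_logger.addHandler(handler)
    return handler


# Demonstration with an in-memory stream instead of stderr.
buffer = io.StringIO()
enable_package_logs(stream=buffer)
logging.getLogger("TTS.api").info("model loaded")
captured = buffer.getvalue()
```

The command-line scripts in this commit call the project's own `setup_logger(...)` helper for the same purpose; the sketch above only shows the equivalent idea with standard-library calls.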
4 changes: 4 additions & 0 deletions TTS/bin/compute_attention_masks.py
@@ -1,5 +1,6 @@
import argparse
import importlib
import logging
import os
from argparse import RawTextHelpFormatter

@@ -13,9 +14,12 @@
from TTS.tts.models import setup_model
from TTS.tts.utils.text.characters import make_symbols, phonemes, symbols
from TTS.utils.audio import AudioProcessor
from TTS.utils.generic_utils import ConsoleFormatter, setup_logger
from TTS.utils.io import load_checkpoint

if __name__ == "__main__":
setup_logger("TTS", level=logging.INFO, screen=True, formatter=ConsoleFormatter())

# pylint: disable=bad-option-value
parser = argparse.ArgumentParser(
description="""Extract attention masks from trained Tacotron/Tacotron2 models.
4 changes: 4 additions & 0 deletions TTS/bin/compute_embeddings.py
@@ -1,4 +1,5 @@
import argparse
import logging
import os
from argparse import RawTextHelpFormatter

@@ -10,6 +11,7 @@
from TTS.tts.datasets import load_tts_samples
from TTS.tts.utils.managers import save_file
from TTS.tts.utils.speakers import SpeakerManager
from TTS.utils.generic_utils import ConsoleFormatter, setup_logger


def compute_embeddings(
@@ -100,6 +102,8 @@ def compute_embeddings(


if __name__ == "__main__":
setup_logger("TTS", level=logging.INFO, screen=True, formatter=ConsoleFormatter())

parser = argparse.ArgumentParser(
description="""Compute embedding vectors for each audio file in a dataset and store them keyed by `{dataset_name}#{file_path}` in a .pth file\n\n"""
"""
4 changes: 4 additions & 0 deletions TTS/bin/compute_statistics.py
@@ -3,6 +3,7 @@

import argparse
import glob
import logging
import os

import numpy as np
@@ -12,10 +13,13 @@
from TTS.config import load_config
from TTS.tts.datasets import load_tts_samples
from TTS.utils.audio import AudioProcessor
from TTS.utils.generic_utils import ConsoleFormatter, setup_logger


def main():
"""Run preprocessing process."""
setup_logger("TTS", level=logging.INFO, screen=True, formatter=ConsoleFormatter())

parser = argparse.ArgumentParser(description="Compute mean and variance of spectrogtram features.")
parser.add_argument("config_path", type=str, help="TTS config file path to define audio processin parameters.")
parser.add_argument("out_path", type=str, help="save path (directory and filename).")
4 changes: 4 additions & 0 deletions TTS/bin/eval_encoder.py
@@ -1,4 +1,5 @@
import argparse
import logging
from argparse import RawTextHelpFormatter

import torch
@@ -7,6 +8,7 @@
from TTS.config import load_config
from TTS.tts.datasets import load_tts_samples
from TTS.tts.utils.speakers import SpeakerManager
from TTS.utils.generic_utils import ConsoleFormatter, setup_logger


def compute_encoder_accuracy(dataset_items, encoder_manager):
@@ -51,6 +53,8 @@ def compute_encoder_accuracy(dataset_items, encoder_manager):


if __name__ == "__main__":
setup_logger("TTS", level=logging.INFO, screen=True, formatter=ConsoleFormatter())

parser = argparse.ArgumentParser(
description="""Compute the accuracy of the encoder.\n\n"""
"""