JarodMica/dataset-maker

Dataset Maker

Multi-purpose dataset maker for various TTS models.

What does it output?

Tortoise, StyleTTS2, XTTS - Models like these take in a simple text file where each line is an audio:text pair, formatted something like:

  • path/to/audio/file | transcription

Folder Structure

Dataset_name
- train.txt
-- seg1.wav
-- seg2.wav
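
As a rough illustration of the pipe-delimited layout above, here is a minimal Python sketch that writes such a file. The function name `write_train_file`, its parameters, and the exact spacing around the `|` separator are assumptions made for this example, not code taken from this repository; check what your trainer expects before relying on it.

```python
from pathlib import Path


def write_train_file(dataset_dir, pairs, filename="train.txt"):
    """Write one 'audio_path | transcription' line per (audio, text) pair.

    Hypothetical helper: the ' | ' separator mirrors the example line in
    this README and may need adjusting for a specific trainer.
    """
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    out_path = dataset_dir / filename
    # Strip stray whitespace from each transcription before writing.
    lines = [f"{audio} | {text.strip()}" for audio, text in pairs]
    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out_path
```

Calling `write_train_file("Dataset_name", [("seg1.wav", "first line"), ("seg2.wav", "second line")])` would produce a `train.txt` with two lines in the format shown above.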

Higgs Audio uses a main metadata.json that holds all of the information and instructions for training on the audio files, with each sample stored as a .txt file and a .wav file.

Folder Structure

Dataset_name
- metadata.json
- some_audio_1.txt
- some_audio_1.wav
- some_audio_2.txt
- some_audio_2.wav
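
Since every sample in this layout appears as a .txt file next to a .wav file with the same name, a quick sanity check is to look for files that are missing their partner. The sketch below assumes that pairing-by-filename convention and uses a hypothetical helper name, `find_unpaired`; it does not read or validate metadata.json, whose schema is not described here.

```python
from pathlib import Path


def find_unpaired(dataset_dir):
    """Return (wavs_without_txt, txts_without_wav) as sorted lists of stems.

    Hypothetical helper: assumes samples are paired by identical file stem,
    e.g. some_audio_1.wav <-> some_audio_1.txt.
    """
    dataset_dir = Path(dataset_dir)
    wav_stems = {p.stem for p in dataset_dir.glob("*.wav")}
    txt_stems = {p.stem for p in dataset_dir.glob("*.txt")}
    return sorted(wav_stems - txt_stems), sorted(txt_stems - wav_stems)
```

If both returned lists are empty, every audio file has a matching text file and vice versa.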

Vibe Voice uses a main .jsonl file in which each line is an individual JSON entry with text and audio keys. "Speaker 0: " is always prepended to each transcription, since that is what the trainer expects.

  • {"text": "Speaker 0: some transcription", "audio": "path/to/audio"}

Folder Structure

Dataset_name
- <project_name>_train.jsonl
- vibevoice_000000.wav
- vibevoice_000001.wav
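
To make the entry format concrete, here is a small Python sketch that builds entries with the "Speaker 0: " prefix and writes them one per line. The names `make_entry` and `write_jsonl` are hypothetical and chosen for this example only; this is not the repository's own implementation.

```python
import json
from pathlib import Path

SPEAKER_PREFIX = "Speaker 0: "  # prefix described in this README


def make_entry(audio_path, transcription):
    """Build one JSON-serializable entry with 'text' and 'audio' keys."""
    return {
        "text": SPEAKER_PREFIX + transcription.strip(),
        "audio": str(audio_path),
    }


def write_jsonl(dataset_dir, project_name, pairs):
    """Write <project_name>_train.jsonl with one JSON object per line.

    Hypothetical helper: `pairs` is an iterable of (audio_path, transcription).
    """
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    out_path = dataset_dir / f"{project_name}_train.jsonl"
    with out_path.open("w", encoding="utf-8") as f:
        for audio_path, transcription in pairs:
            entry = make_entry(audio_path, transcription)
            # ensure_ascii=False keeps non-English transcriptions readable.
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return out_path
```

Each line of the resulting file can be parsed independently with `json.loads`, which is the main reason to use .jsonl over a single JSON array.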

Installation (Windows)

  1. Make sure you have Astral's uv installed on your PC
  2. Run the following:
    git clone https://github.com/JarodMica/dataset-maker.git
    cd dataset-maker
    uv sync
  3. uv should handle the installation and versioning of all packages. Once it finishes running, launch the Gradio interface with:
    uv run .\gradio_interface.py
    

ONNX Runtime CUDA Issue

CUDAExecutionProvider may not be found even when using uv. The fix is to remove optimum[onnxruntime-gpu] and then add it back from the terminal.

Problematic

uv run python
>>> import onnxruntime as ort
>>> print("Available providers:", ort.get_available_providers())
Available providers: ['AzureExecutionProvider', 'CPUExecutionProvider']

Fixed

uv run python
>>> import onnxruntime as ort
>>> print("Available providers:", ort.get_available_providers())
Available providers: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
