Multi-purpose dataset maker for various TTS models.
- Tortoise TTS/XTTS
- StyleTTS 2 ~ Webui
- Higgs Audio ~ Base - My fork
- VibeVoice ~ Base - My fork
- IndexTTS 2 ~ My Trainer
Tortoise, StyleTTS2, XTTS - These models take in a simple text file where audio:text pairs are stored like:
path/to/audio/file | transcription
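If you are assembling such a file yourself, the pairing can be sketched as below. This is a minimal illustration, not part of this repo: the helper name and the assumption that each seg.wav sits next to a matching seg.txt transcript are mine.

```python
from pathlib import Path

# Illustrative sketch: build a Tortoise/StyleTTS2/XTTS-style train.txt
# from a folder where each seg.wav has a matching seg.txt transcript.
# The helper name and pairing convention are assumptions, not repo API.
def build_train_txt(dataset_dir: str) -> None:
    root = Path(dataset_dir)
    lines = []
    for wav in sorted(root.glob("*.wav")):
        txt = wav.with_suffix(".txt")
        if txt.exists():
            transcription = txt.read_text(encoding="utf-8").strip()
            lines.append(f"{wav.name} | {transcription}")
    # One "audio | transcription" pair per line
    (root / "train.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
```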
Folder Structure
Dataset_name
- train.txt
-- seg1.wav
-- seg2.wav

Higgs Audio has a main metadata.json that includes all of the information and instructions for how to train on the audio files, broken down into .txt and .wav files.
Folder Structure
Dataset_name
- metadata.json
- some_audio_1.txt
- some_audio_1.wav
- some_audio_2.txt
- some_audio_2.wav

VibeVoice has a main .jsonl file that contains individual JSON entries with text and audio keys. It always prepends "Speaker 0: " to each transcription, in accordance with what the trainer expects.
{"text": "Speaker 0: some transcription", "audio": "path/to/audio"}
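Writing these entries can be sketched as below. The function name and the (audio, text) input shape are illustrative assumptions; only the entry format and the "Speaker 0: " prefix come from the repo.

```python
import json

# Illustrative sketch: write a VibeVoice-style .jsonl file, prepending
# "Speaker 0: " to each transcription as the trainer expects.
# The function name and input shape are assumptions, not repo API.
def write_vibevoice_jsonl(pairs, out_path):
    """pairs: iterable of (audio_path, transcription) tuples."""
    with open(out_path, "w", encoding="utf-8") as f:
        for audio, text in pairs:
            entry = {"text": f"Speaker 0: {text}", "audio": str(audio)}
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```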
Folder Structure
Dataset_name
- <project_name>_train.jsonl
- vibevoice_000000.wav
- vibevoice_000001.wav

- Make sure you have Astral's uv installed on your PC
- Run the following:
git clone https://github.com/JarodMica/dataset-maker.git
cd dataset-maker
uv sync
- uv should handle the installation of all packages and their versions. Once it finishes running, launch the Gradio interface with:
uv run .\gradio_interface.py
The CUDAExecutionProvider may not be found even when using uv. The fix is to remove and then re-add optimum[onnxruntime-gpu] in the terminal:
uv remove optimum
uv add optimum[onnxruntime-gpu]
You can verify by checking the available providers. Before the fix:
uv run python
>>> import onnxruntime as ort
>>> print("Available providers:", ort.get_available_providers())
Available providers: ['AzureExecutionProvider', 'CPUExecutionProvider']
After the fix:
uv run python
>>> import onnxruntime as ort
>>> print("Available providers:", ort.get_available_providers())
Available providers: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']