[WIP] VoiceSmith makes training text to speech models easy.
Updated Oct 10, 2022 - Python
Python library for handling audio datasets.
Tool for reading and writing datasets of tensors in a Lightning Memory-Mapped Database (LMDB). Designed to manage machine learning datasets with fast reading speeds.
An open-source tool for processing large-scale EEG datasets
Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework
A tool for downloading from public image boards (those that allow scraping), previewing your images and tags, and editing them. Additional tabs let you download other code repositories as well as S.O.T.A. diffusion and auto-tag/caption models. Custom datasets can be added!
A powerful tool for generating and managing datasets for fine-tuning large models.
Access to data for workshops and extended tests of MDAnalysis.
A single library to (down)load all existing sign language handshape datasets.
Data preparation code for building a Kaldi ASR system
Scripts to automate and standardize dataset handling
Machine learning library for classification tasks
A tool to download and format PASCAL VOC 2007 dataset for multilabel classification
A single library to (down)load all existing sign language video datasets.
Extraction tool to parse MS Celeb dataset
Plugin for the dataset module containing information access related datasets
A tool to download and format MS COCO dataset for multilabel classification
A tool to download and format NUS-WIDE dataset for multilabel classification
Utility for constructing highly efficient in-memory / on-disk datasets.
Tool for managing datasets of images with compositional semantics, part of VisSE project.