0.15
Changes
Besides bug fixes and small UI improvements, I've added the ability to fine-tune (train) a custom XTTS model. It is very simple: select a file or a folder with multiple audio files, give the model a name, and training will be performed fully automatically. The trained model will appear in the "XTTS Model" dropdown in the GUI after you click "Connect to server". An Nvidia GPU with at least 8 GB of VRAM is required. As little as 10 minutes of audio is enough to significantly improve voice-cloning results over zero-shot, though I recommend at least 30 minutes. You may experiment with increasing the number of epochs and gradient accumulation steps. When using a custom model, you still have to provide a voice file; you may upload one of the segments produced from the source audio (they are located in Pandrator/easy_xtts_trainer/<model_name>/audio_sources/processed).
Training models requires installing a tool through the launcher (if you have an existing installation, just download the newest launcher executable, put it in the same folder as the Pandrator folder, and run the installation).
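If you're unsure what the gradient accumulation setting does: it sums gradients from several small batches before applying a single optimizer update, which simulates a larger batch size on GPUs with limited VRAM. A minimal illustrative sketch in plain Python (not Pandrator's actual training code; function and variable names are made up for this example):

```python
# Gradient accumulation sketch: average per-parameter gradients over
# several micro-batches, then apply them as one update. This lets a
# GPU that can only fit small batches behave like it trained on a
# larger effective batch (micro_batch_size * accumulation_steps).

def accumulate_gradients(micro_batch_grads, accumulation_steps):
    """Average per-parameter gradients over `accumulation_steps` micro-batches."""
    assert len(micro_batch_grads) == accumulation_steps
    n_params = len(micro_batch_grads[0])
    summed = [0.0] * n_params
    for grads in micro_batch_grads:
        for i, g in enumerate(grads):
            summed[i] += g
    # One averaged gradient vector, ready for a single optimizer step.
    return [s / accumulation_steps for s in summed]

# Two micro-batches of gradients for a toy 2-parameter model:
grads = [[0.2, -0.4], [0.6, 0.0]]
update = accumulate_gradients(grads, accumulation_steps=2)
print(update)  # [0.4, -0.2]
```

Raising the accumulation steps increases the effective batch size (often stabilizing training) at the cost of fewer optimizer updates per epoch, which is why it pairs naturally with increasing the number of epochs.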
Pre-Installed Packages
You may download self-contained packages that only require unpacking from here. You don't have to install anything; all components are included in portable conda environments. You may install additional components at any time using the launcher.
Installer
You may use the installer/launcher below, which was created from the pandrator_installer_launcher.py
file in the repository, or use the source file directly. Please remember to run the executable as an administrator. It's possible that Windows or your antivirus software will flag it as a threat. You may whitelist it or, if you're not comfortable doing that, review the code in the repository and install Pandrator manually.