
MisileLab/rvc-python-butter

 
 


RVC Python

A Python package for running RVC (Retrieval-based Voice Conversion) from the command line, from Python scripts, or through an HTTP API.

Demo

bandicam.2023-12-27.01-36-03-212.mp4

Features

  • Console interface for single file or batch processing
  • Python module for integration into other projects
  • API server for remote processing
  • Support for CPU execution and GPU acceleration
  • Dynamic model loading and unloading
  • Flexible model directory management

Installation

Basic Installation (CPU only)

pip install rvc-python

Recommended Installation (with GPU support)

For Windows:

python -m venv venv
venv\Scripts\activate
pip install rvc-python
pip install torch==2.1.1+cu118 torchaudio==2.1.1+cu118 --index-url https://download.pytorch.org/whl/cu118

For Linux:

python -m venv venv
source venv/bin/activate
pip install rvc-python
pip install torch==2.1.1+cu118 torchaudio==2.1.1+cu118 --index-url https://download.pytorch.org/whl/cu118
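
After installing the CUDA build, it can be worth confirming that PyTorch actually sees a GPU before passing a cuda device to rvc-python. The sketch below is only an illustration: pick_device is a hypothetical helper, not part of rvc-python, and it relies solely on torch.cuda.is_available().

```python
def pick_device() -> str:
    """Return "cuda:0" if PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch at all, so no GPU path either
    return "cuda:0" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

The returned string has the same form as the values accepted by the -de option and the device argument used later in this README.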

Usage

Command Line Interface

The CLI supports two modes: cli for direct file processing and api for starting an API server.

python -m rvc_python [-h] {cli,api} ...

CLI Mode

Process a single file or batch of files:

python -m rvc_python cli -i INPUT -o OUTPUT -mp MODEL [options]

Example:

python -m rvc_python cli -i input.wav -o output.wav -mp path/to/model.pth -de cuda:0
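
The example above converts a single file. For a batch run, the same command could be assembled from Python using the -d flag listed under Options. This is a minimal sketch: build_batch_command is a hypothetical helper, and the directory names are placeholders.

```python
import sys


def build_batch_command(input_dir, output_dir, model_path, device="cpu"):
    """Assemble the argv list for a batch run of the rvc_python CLI."""
    return [
        sys.executable, "-m", "rvc_python", "cli",
        "-d", str(input_dir),    # input directory (batch mode)
        "-o", str(output_dir),   # output directory
        "-mp", str(model_path),  # path to the .pth model
        "-de", device,           # computation device
    ]


cmd = build_batch_command("input_dir", "output_dir", "path/to/model.pth", "cuda:0")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.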

API Mode

Start the API server:

python -m rvc_python api [-p PORT] [-l] [options]

Example:

python -m rvc_python api -p 5050 -l

Python Module

from rvc_python.infer import RVCInference

rvc = RVCInference(device="cuda:0")
rvc.load_model("path/to/model.pth")
rvc.infer_file("input.wav", "output.wav")
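
To convert a whole folder from Python, one option is to loop over the files and call infer_file for each. The sketch below uses only the infer_file call shown above; convert_directory and the "*.wav" pattern are illustrative assumptions, not part of the package.

```python
from pathlib import Path


def convert_directory(rvc, in_dir, out_dir, pattern="*.wav"):
    """Run rvc.infer_file on every matching file in in_dir.

    `rvc` is any object with an infer_file(input_path, output_path) method,
    such as an RVCInference instance with a model already loaded.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(in_dir).glob(pattern)):
        dst = out_dir / src.name  # keep the original file name
        rvc.infer_file(str(src), str(dst))
        written.append(dst)
    return written
```

With the rvc object from the example above, this could be called as convert_directory(rvc, "inputs", "outputs").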

API

The API server provides endpoints for voice conversion and model management. The examples below assume a server running locally on port 5050 (the default).

1. Convert Audio

  • Endpoint: POST /convert
  • Description: Converts an audio file using the currently loaded model.
  • Request Body:
    {
      "audio_data": "base64_encoded_audio"
    }
  • Response: The converted audio file (WAV format)
  • Example:
    import requests
    import base64
    
    url = "http://localhost:5050/convert"
    with open("input.wav", "rb") as audio_file:
        audio_data = base64.b64encode(audio_file.read()).decode()
    
    response = requests.post(url, json={"audio_data": audio_data})
    response.raise_for_status()  # stop here rather than save an error body as audio
    
    with open("output.wav", "wb") as output_file:
        output_file.write(response.content)
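
Because the request body is base64 text and the response is raw WAV bytes, it can help to keep the encoding step and a basic sanity check in small helpers. This is a sketch under assumptions: build_convert_payload and looks_like_wav are hypothetical names, and the check only inspects the standard RIFF/WAVE header, since the shape of error responses is not described here.

```python
import base64


def build_convert_payload(audio_bytes: bytes) -> dict:
    """Wrap raw audio bytes in the JSON body expected by POST /convert."""
    return {"audio_data": base64.b64encode(audio_bytes).decode("ascii")}


def looks_like_wav(data: bytes) -> bool:
    """Check the RIFF/WAVE magic bytes at the start of a WAV file."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


payload = build_convert_payload(b"RIFF\x00\x00\x00\x00WAVE")
print(payload)
```

In the example above, looks_like_wav(response.content) could be checked just before the output file is written.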

2. List Available Models

  • Endpoint: GET /models
  • Description: Returns a list of all available models.
  • Response: JSON array of model names
  • Example:
    import requests
    
    response = requests.get("http://localhost:5050/models")
    models = response.json()
    print("Available models:", models)

3. Load a Model

  • Endpoint: POST /models/{model_name}
  • Description: Loads a specific model for use in conversions.
  • Response: Confirmation message
  • Example:
    import requests
    
    response = requests.post("http://localhost:5050/models/my_model")
    print(response.json())
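
Since the model name is part of the URL path, names containing spaces or other special characters need percent-encoding. A minimal sketch follows; model_url is a hypothetical helper, and it assumes the server decodes percent-encoded path segments, as HTTP frameworks normally do.

```python
from urllib.parse import quote


def model_url(base_url: str, model_name: str) -> str:
    """Build the POST /models/{model_name} URL, percent-encoding the name."""
    return f"{base_url.rstrip('/')}/models/{quote(model_name, safe='')}"


print(model_url("http://localhost:5050", "my model"))
```

The result can be passed straight to requests.post in place of the hard-coded URL above.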

4. Get Current Parameters

  • Endpoint: GET /params
  • Description: Retrieves the current parameter settings.
  • Response: JSON object with current parameters
  • Example:
    import requests
    
    response = requests.get("http://localhost:5050/params")
    print("Current parameters:", response.json())

5. Set Parameters

  • Endpoint: POST /params
  • Description: Updates the parameters for voice conversion.
  • Request Body:
    {
      "params": {
        "f0method": "harvest",
        "f0up_key": 0,
        "index_rate": 0.5,
        "filter_radius": 3,
        "resample_sr": 0,
        "rms_mix_rate": 0.25,
        "protect": 0.33
      }
    }
  • Response: Confirmation message
  • Example:
    import requests
    
    params = {
        "f0method": "harvest",
        "f0up_key": 2,
        "protect": 0.5
    }
    response = requests.post("http://localhost:5050/params", json={"params": params})
    print(response.json())
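
A misspelled parameter name is easy to miss, so a client might check keys against the names shown in the request body above before sending them. The sketch below is an assumption-laden illustration: build_params_payload and KNOWN_PARAMS are hypothetical, and how the server itself treats unknown keys is not documented here.

```python
KNOWN_PARAMS = {
    "f0method", "f0up_key", "index_rate", "filter_radius",
    "resample_sr", "rms_mix_rate", "protect",
}


def build_params_payload(**overrides) -> dict:
    """Return a POST /params body, rejecting keys not listed in the docs."""
    unknown = set(overrides) - KNOWN_PARAMS
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    return {"params": dict(overrides)}


print(build_params_payload(f0method="harvest", f0up_key=2, protect=0.5))
```

The returned dictionary can be passed as the json argument of requests.post.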

6. Upload a New Model

  • Endpoint: POST /upload_model
  • Description: Uploads a new model (as a zip file) to the server.
  • Request: Multipart form data with a zip file
  • Response: Confirmation message
  • Example:
    import requests
    
    with open("new_model.zip", "rb") as zip_file:
        files = {"file": ("new_model.zip", zip_file)}
        response = requests.post("http://localhost:5050/upload_model", files=files)
    print(response.json())
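
If the model exists only as loose files, the zip can be built in memory before uploading. This is a sketch, not the documented packaging format: zip_model_files is a hypothetical helper, and placing the files at the archive root is an assumption, since the layout the server expects inside the zip is not specified here.

```python
import io
import zipfile
from pathlib import Path


def zip_model_files(model_dir) -> bytes:
    """Pack the .pth and .index files of model_dir into an in-memory zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(Path(model_dir).iterdir()):
            if path.suffix in {".pth", ".index"}:
                # Assumption: files sit at the archive root.
                archive.write(path, arcname=path.name)
    return buffer.getvalue()
```

The resulting bytes could replace the open file in the example above, for instance files = {"file": ("new_model.zip", zip_model_files("rvc_models/model1"))}.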

7. Set Computation Device

  • Endpoint: POST /set_device
  • Description: Sets the device (CPU/GPU) for computations.
  • Request Body:
    {
      "device": "cuda:0"
    }
  • Response: Confirmation message
  • Example:
    import requests
    
    response = requests.post("http://localhost:5050/set_device", json={"device": "cuda:0"})
    print(response.json())

Model Management

Models are stored in the rvc_models directory by default. Each model should be in its own subdirectory and contain:

  • A .pth file (required): The main model file.
  • An .index file (optional): For improved voice conversion quality.

Example structure:

rvc_models/
├── model1/
│   ├── model1.pth
│   └── model1.index
└── model2/
    └── model2.pth
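
The layout above can be checked locally with a short script. The sketch below mirrors the stated rule (one subdirectory per model, a required .pth file, an optional .index file); scan_models is a hypothetical helper and is not necessarily how the server discovers models.

```python
from pathlib import Path


def scan_models(models_dir="rvc_models"):
    """Map each model subdirectory name to its .pth and optional .index file."""
    found = {}
    root = Path(models_dir)
    if not root.is_dir():
        return found
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        pth = sorted(sub.glob("*.pth"))
        if not pth:
            continue  # a .pth file is required
        index = sorted(sub.glob("*.index"))
        found[sub.name] = {"pth": pth[0], "index": index[0] if index else None}
    return found
```

For the example structure, the result would contain model1 with both files and model2 with its index set to None.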

You can add new models by:

  1. Manually placing them in the rvc_models directory.
  2. Using the /upload_model API endpoint to upload a zip file containing the model files.
  3. Pointing the server at a different models directory at runtime with the /set_models_dir API endpoint.

Options

Input/Output Options

  • -i, --input: Input audio file (CLI mode)
  • -d, --dir: Input directory for batch processing (CLI mode)
  • -o, --output: Output file or directory

Model Options

  • -mp, --model: Path to the RVC model file (required for CLI, optional for API)
  • -md, --models_dir: Directory containing RVC models (default: rvc_models in the current directory)
  • -ip, --index: Path to the index file (optional)
  • -v, --version: Model version (v1 or v2)

Processing Options

  • -de, --device: Computation device (e.g., "cpu", "cuda:0")
  • -me, --method: Pitch extraction method (harvest, crepe, rmvpe, pm)
  • -pi, --pitch: Pitch adjustment in semitones
  • -ir, --index_rate: Feature search ratio
  • -fr, --filter_radius: Median filtering radius for pitch
  • -rsr, --resample_sr: Output resampling rate
  • -rmr, --rms_mix_rate: Volume envelope mix rate
  • -pr, --protect: Protection for voiceless consonants
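
As a rough guide to choosing a --pitch value, a shift of n semitones corresponds to a frequency ratio of 2^(n/12) under 12-tone equal temperament. The helper below is purely illustrative (semitones_to_ratio is a hypothetical name) and says nothing about how rvc-python applies the shift internally.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift, assuming 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


for shift in (-12, 0, 2, 12):
    print(shift, round(semitones_to_ratio(shift), 4))
```

So +12 doubles the fundamental frequency (one octave up) and -12 halves it.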

API Server Options

  • -p, --port: API server port (default: 5050)
  • -l, --listen: Allow external connections to API server
  • -pm, --preload-model: Preload a model when starting the API server (optional)
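
When the server is started from another Python program, the client usually has to wait until the port accepts connections before sending requests. The sketch below uses the -p and -l flags listed above; build_api_command and wait_for_port are hypothetical helpers, and the polling interval is arbitrary.

```python
import socket
import sys
import time


def build_api_command(port=5050, listen=False):
    """Assemble the argv list that starts the rvc_python API server."""
    cmd = [sys.executable, "-m", "rvc_python", "api", "-p", str(port)]
    if listen:
        cmd.append("-l")  # allow external connections
    return cmd


def wait_for_port(host, port, timeout=30.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # server not up yet, try again shortly
    return False
```

A typical use would be subprocess.Popen(build_api_command()) followed by wait_for_port("localhost", 5050) before the first request.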

Changelog

For a detailed list of changes and updates, please see the Releases page.

Contributing

Contributions are welcome! Feel free to submit pull requests or open issues for bugs and feature requests.

About

modified rvc-python
