VoiceReactify [Building]

VoiceReactify is a React-based application that combines audio processing with Text-to-Speech (TTS). It lets users record audio, transcribe it into text using AI models, and optionally convert the text back into speech. It also supports voice assignment through Microsoft's Speech API.


Features

  • Audio Recording: Capture audio directly from the browser.
  • Real-time Transcription: Convert audio to text using advanced AI models.
  • Text-to-Speech: Convert transcribed text back to speech using TTS.
  • React Integration: Built using the React framework for a seamless user experience.
  • Voice Assignment: Use Microsoft's API to customize voice characteristics.


Getting Started

Prerequisites

Before you begin, ensure you have met the following requirements:

  • Node.js: Install from nodejs.org.
  • pnpm: Not bundled with Node.js; install it separately, for example with npm install -g pnpm.
  • Python: Required for backend processing. Install from python.org.
  • Docker (Optional): For containerization. Install from docker.com.
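A quick way to confirm the prerequisites are installed is to look for each tool on the PATH. The following is a minimal sketch and not part of the project: the helper name find_tools and the executable names (node, pnpm, python3, docker) are assumptions and may differ on your system, for example python instead of python3 on Windows.

```python
import shutil

# Executable names are assumptions; adjust them for your platform.
REQUIRED = ["node", "pnpm", "python3"]
OPTIONAL = ["docker"]


def find_tools(names):
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(REQUIRED + OPTIONAL).items():
        label = "optional" if name in OPTIONAL else "required"
        print(f"{name:8} ({label}): {path or 'NOT FOUND'}")
```

Any tool reported as NOT FOUND either is not installed or is installed somewhere that is not on the PATH.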

Installation

Follow these steps to set up the project locally:

  1. Clone the Repository

    git clone https://github.com/yourusername/VoiceReactify.git
    cd VoiceReactify
  2. Install Frontend Dependencies

    cd frontend
    pnpm install
  3. Install Backend Dependencies

    cd ../backend
    pip install -r requirements.txt

Configuration

  1. Backend Environment Variables

    Create a .env file in the backend directory, replacing xxx with your Azure Speech key and region:

    AZURE_SPEECH_KEY='xxx'
    AZURE_SPEECH_REGION='xxx'
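To catch a missing or placeholder value before starting the backend, the file can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the README does not show how the backend reads this file, so the parser below is a simplified stand-in, and the helper names load_env_file and missing_speech_settings are hypothetical.

```python
def load_env_file(path):
    """Parse KEY=VALUE lines, tolerating spaces around '=' and quoted values."""
    values = {}
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.strip()
            # Skip blank lines, comments, and anything that is not an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_speech_settings(values):
    """Return the required settings that are absent, empty, or still 'xxx'."""
    required = ("AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION")
    return [key for key in required if values.get(key, "") in ("", "xxx")]
```

For example, missing_speech_settings(load_env_file("backend/.env")) returns an empty list once both values have been filled in (the path assumes the command is run from the repository root).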
    

Docker Setup (Optional)

If you prefer using Docker:

  1. Modify Terms of Service Agreement

    Due to licensing requirements, you need to modify a backend file to automatically agree to the terms of service when running in Docker. See the Important Notice section below.

  2. Build Docker Images

    docker-compose build
  3. Run Docker Containers

    docker-compose up

Usage

  1. Start the Backend Server

    make backend  # from the repository root
    # or
    cd backend
    python app.py
  2. Start the Frontend Server

    Open a new terminal window:

    make frontend  # from the repository root
    # or
    cd frontend
    pnpm run dev
  3. Access the Application

    Navigate to http://localhost:5173 in your web browser.

  4. Record and Transcribe Audio

    • Click on the Record button to start recording.
    • Click Stop to finish recording.
    • The audio will be transcribed into text automatically.
  5. Convert Text to Speech

    • Choose a voice using the Voice Assignment feature.
    • Click the Play icon to hear the synthesized speech.
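If the page at http://localhost:5173 does not load, it can help to check whether the frontend server is accepting connections at all. The sketch below is an illustration only: the helper name is_listening is hypothetical, and it tests nothing beyond a plain TCP connection, so a True result does not prove the app itself is healthy.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # 5173 is the frontend port given in step 3.
    print("frontend reachable:", is_listening("localhost", 5173))
```

The same function can be pointed at the backend once you know which port it listens on; this README does not state it.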

Important Notice

The app downloads and runs AI models, so your computer needs enough memory to load them.

When running the application in Docker, you need to modify the backend code to automatically agree to the terms of service for certain AI models. This is necessary for downloading and using specific models within a containerized environment.

Modify the ask_tos Method

In the backend code, locate the ask_tos method and replace it with the following:

# Note: this method uses os.path, so the module must have "import os" at the top.
@staticmethod
def ask_tos(model_full_path):
    """Automatically agree to the terms of service"""
    tos_path = os.path.join(model_full_path, "tos_agreed.txt")
    print(" > Automatically agreeing to the terms of service:")
    print(' | > "I have purchased a commercial license from Coqui: licensing@coqui.ai"')
    print(' | > "Otherwise, I agree to the terms of the non-commercial CPML: https://coqui.ai/cpml"')
    # Record the agreement in a marker file inside the model directory
    with open(tos_path, "w", encoding="utf-8") as f:
        f.write("I have read, understood and agreed to the Terms and Conditions.")
    return True
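Once a model has been downloaded, one way to confirm that the patched method ran is to look for the tos_agreed.txt marker it writes. This is a minimal sketch: the helper name tos_agreed is hypothetical, and the model directory you pass in depends on where your setup stores models, which this README does not specify.

```python
import os


def tos_agreed(model_full_path):
    """Return True if the agreement marker file exists in the model directory."""
    return os.path.isfile(os.path.join(model_full_path, "tos_agreed.txt"))
```

A False result for a model directory that already exists suggests the original interactive prompt is still in place.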

Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the Repository

  2. Create a Feature Branch

    git checkout -b feature/YourFeature
  3. Commit Your Changes

    git commit -m 'Add YourFeature'
  4. Push to the Branch

    git push origin feature/YourFeature
  5. Open a Pull Request

License

This project is licensed under the MIT License. See the LICENSE file for details.