AI-powered voice automation platform with text-to-speech and automated calling capabilities. Features 20+ realistic AI voices, real-time audio waveforms, and enterprise-grade phone integration. Built with React, Node.js, ElevenLabs, and Exotel.

Rishabh1925/voiceforge

VoiceForge

React • Node.js • ElevenLabs

AI-powered voice automation platform for text-to-speech and automated calling

Powered by ElevenLabs AI • Built for Scale • Developer-Friendly

Live Platform • Demo Video • Report Bug


Getting Started

There are two ways to use VoiceForge:

Option 1: Use the Hosted Platform (Recommended for Quick Start)

Access the fully deployed application instantly - no setup required.

Visit: https://voiceforge-ai.vercel.app

Perfect for:

  • Testing the platform immediately
  • Evaluating features before local setup
  • Using the service without technical configuration
  • Quick demonstrations and prototypes

Option 2: Run Locally on Your Machine

Set up VoiceForge on your own computer for development, customization, or self-hosting.

Best for:

  • Developers wanting to modify or extend the platform
  • Organizations requiring self-hosted solutions
  • Learning how the system works under the hood
  • Contributing to the project

Continue reading below for complete local installation instructions.


Local Installation Guide

Prerequisites

Before you begin, ensure you have Node.js, npm, and Git installed.

Verify your installations:

node --version
npm --version
git --version

Step 1: Clone and Setup

  1. Clone the repository:

    git clone https://github.com/Rishabh1925/voiceforge.git
    cd voiceforge
  2. Install backend dependencies:

    cd backend
    npm install
  3. Install frontend dependencies:

    cd ../frontend
    npm install

Step 2: Environment Configuration

  1. Create environment file:

    cd ../backend
    cp .envbackend .env
  2. Get your API keys:

    • ElevenLabs API Key:

      • Go to ElevenLabs
      • Sign up for a free account
      • Navigate to Settings > API Keys
      • Copy your API key
    • Exotel API (optional, only needed for phone calls): have your account SID, API token, and Exotel phone number ready for the .env file.

  3. Configure your .env file: Open backend/.env in a text editor and add:

    PORT=5000
    
    # Required: ElevenLabs Configuration
    ELEVENLABS_API_KEY=sk_your_elevenlabs_api_key_here
    
    # Optional: Exotel Configuration (for phone calls)
    EXOTEL_ACCOUNT_SID=your_exotel_account_sid
    EXOTEL_API_TOKEN=your_exotel_api_token
    EXOTEL_PHONE_NUMBER=your_exotel_phone_number
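
A startup check can catch a missing or whitespace-padded key before the first request fails. The following is a minimal sketch, not code from the VoiceForge repository: the helper name `checkEnv` and its return shape are hypothetical, and only the variable names come from the .env listing above.

```javascript
// Hypothetical helper, not part of the VoiceForge codebase.
// Variable names are taken from the .env example above.
const REQUIRED_KEYS = ["ELEVENLABS_API_KEY"];
const EXOTEL_KEYS = [
  "EXOTEL_ACCOUNT_SID",
  "EXOTEL_API_TOKEN",
  "EXOTEL_PHONE_NUMBER",
];

function checkEnv(env) {
  // A key counts as set only if it holds a non-blank string.
  const isSet = (key) =>
    typeof env[key] === "string" && env[key].trim() !== "";

  // Required keys that are absent or blank.
  const missing = REQUIRED_KEYS.filter((key) => !isSet(key));

  // Known keys whose value carries leading or trailing whitespace.
  const known = [...REQUIRED_KEYS, ...EXOTEL_KEYS];
  const padded = known.filter(
    (key) => typeof env[key] === "string" && env[key] !== env[key].trim()
  );

  // Phone calls need all three Exotel values.
  const callsEnabled = EXOTEL_KEYS.every(isSet);

  return { missing, padded, callsEnabled };
}
```

Calling `checkEnv(process.env)` at the top of the server entry point and logging `missing` and `padded` would surface the two most common configuration mistakes early.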

Step 3: Run the Application

Start both servers, each in its own terminal window:

Terminal 1 - Backend:

cd backend
npm run dev

Terminal 2 - Frontend:

cd frontend
npm start

Step 4: Access Your Local Application

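Once both servers are running, open the frontend in your browser. With the configuration above, the backend listens on port 5000 (the PORT value in .env), and the frontend is expected on port 3000, the other port referenced in the troubleshooting steps below.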

Troubleshooting Local Setup

Common Issues:

  1. Port already in use:

    # Kill processes on specific ports
    lsof -ti:3000 | xargs kill -9
    lsof -ti:5000 | xargs kill -9
  2. API Key not working:

    • Double-check your ElevenLabs API key
    • Ensure no extra spaces in the .env file
    • Restart the backend server after changing .env
  3. Dependencies not installing:

    # Clear npm cache
    npm cache clean --force
    
    # Delete node_modules and reinstall
    rm -rf node_modules package-lock.json
    npm install
  4. Permission errors on Mac/Linux:

    sudo npm install -g npm@latest

What Makes This Special?

VoiceForge is a voice communication platform that integrates AI services so you can:

  • Convert text to human-like speech using ElevenLabs' AI voices
  • Make automated phone calls with custom voice messages
  • Choose among multiple voice personalities for different use cases
  • Scale through API integrations with ElevenLabs and Exotel
  • Run in production with error handling and security built in

Key Features

Advanced Text-to-Speech

  • 20+ AI-generated voices from ElevenLabs
  • High-fidelity audio output (MP3, 44.1 kHz)
  • Real-time generation with streaming support
  • Voice cloning capabilities (premium feature)
  • Emotion and style control

Smart Phone Integration

  • Automated voice calls via Exotel API
  • Call status tracking and analytics
  • International number support
  • Webhook integrations for call events
  • Testing mode for development

Beautiful User Experience

  • Responsive design that works everywhere
  • Real-time audio waveforms
  • Drag-and-drop file uploads
  • Dark/light theme support
  • Progressive Web App (PWA) ready

Enterprise-Ready

  • Secure API key management
  • Rate limiting and usage tracking
  • Audio file optimization
  • Error recovery mechanisms
  • Comprehensive logging

Tech Stack

Frontend Powerhouse

React • JavaScript • CSS3 • Axios

Backend Excellence

Node.js • Express • ElevenLabs • Exotel

Usage Guide

Generate Natural Speech

  1. Enter your text (supports up to 5,000 characters)
  2. Choose a voice personality from our curated collection
  3. Adjust settings (speed, pitch, emotion)
  4. Click "Generate" and get studio-quality audio
  5. Preview and download your speech file

Make AI Phone Calls

  1. Generate your voice message first
  2. Enter phone number (international format: +1234567890)
  3. Schedule or call immediately
  4. Track call status in real-time
  5. Review call analytics and recordings
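
Step 2 asks for numbers in international format. A small client-side check can reject malformed input before a call is attempted. This sketch assumes the E.164 convention (a leading `+`, a country code that does not start with 0, and at most 15 digits in total); the function name is hypothetical, and whether the backend or Exotel applies exactly this rule is not stated in this README.

```javascript
// Hypothetical helper: normalize user input to E.164, or return null.
function normalizePhoneNumber(input) {
  // Drop common visual separators: spaces, parentheses, dots, hyphens.
  const compact = input.replace(/[\s().-]/g, "");

  // "+" followed by 2 to 15 digits, the first of which is not 0.
  return /^\+[1-9]\d{1,14}$/.test(compact) ? compact : null;
}

console.log(normalizePhoneNumber("+1 (234) 567-890"));
console.log(normalizePhoneNumber("1234567890"));
```

Returning `null` instead of throwing keeps the helper easy to use in form validation, where an invalid number is an expected case rather than an error.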

Pro Tips

For better results:

  • Use natural punctuation and pauses
  • Break long texts into shorter segments
  • Test different voices for your content type
  • Use SSML tags for advanced speech control
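
The tip about breaking long texts into shorter segments can be automated. Below is a rough sketch that packs whole sentences into segments no longer than the 5,000-character limit quoted in the usage guide. The function name and the sentence-splitting heuristic (split after `.`, `!`, or `?` followed by whitespace) are assumptions for illustration, not VoiceForge behavior.

```javascript
// Limit quoted in the usage guide above.
const MAX_CHARS = 5000;

// Hypothetical helper: split text into segments of at most maxChars.
function splitIntoSegments(text, maxChars = MAX_CHARS) {
  // Naive sentence split: break after ., ! or ? followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/).filter((s) => s.length > 0);
  const segments = [];
  let current = "";

  for (const sentence of sentences) {
    let rest = sentence;

    // A single sentence longer than the limit is cut at fixed offsets.
    while (rest.length > maxChars) {
      if (current) {
        segments.push(current);
        current = "";
      }
      segments.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    if (rest.length === 0) continue;

    // Append the sentence if it still fits, otherwise start a new segment.
    const candidate = current ? current + " " + rest : rest;
    if (candidate.length <= maxChars) {
      current = candidate;
    } else {
      segments.push(current);
      current = rest;
    }
  }

  if (current) segments.push(current);
  return segments;
}
```

Each returned segment can then be submitted as its own generation request. Splitting on sentence boundaries keeps pauses natural, which matches the first tip in the list.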

API Documentation

Voice Generation

POST /api/voice/generate
Content-Type: application/json

{
  "text": "Hello, this is your AI assistant speaking!",
  "voice_id": "pNInz6obpgDQGcFmaJgB",
  "model_id": "eleven_monolingual_v1",
  "voice_settings": {
    "stability": 0.75,
    "similarity_boost": 0.75,
    "style": 0.5,
    "use_speaker_boost": true
  }
}

Response format:

{
  "success": true,
  "data": {
    "audioUrl": "/api/audio/speech_1699123456789.mp3",
    "fileName": "speech_1699123456789.mp3",
    "duration": 3.45,
    "wordCount": 7,
    "voiceUsed": "Adam - Natural Male",
    "generatedAt": "2023-11-04T10:30:45.123Z"
  },
  "usage": {
    "charactersUsed": 42,
    "charactersRemaining": 9958
  }
}
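
The request and response above could be exercised from JavaScript roughly as follows. This is a sketch under stated assumptions: the backend is reachable at `http://localhost:5000` (the PORT from the setup steps), the function names are hypothetical, and the `fetchImpl` parameter exists only so the call can be stubbed in tests. The global `fetch` requires Node.js 18 or later, or a browser.

```javascript
// Assumption: local backend on the PORT configured in .env.
const DEFAULT_BASE_URL = "http://localhost:5000";

// Build the URL and fetch options for POST /api/voice/generate.
function buildGenerateRequest(text, voiceId, baseUrl = DEFAULT_BASE_URL) {
  return {
    url: `${baseUrl}/api/voice/generate`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // Field names and values mirror the request example above.
      body: JSON.stringify({
        text,
        voice_id: voiceId,
        model_id: "eleven_monolingual_v1",
        voice_settings: {
          stability: 0.75,
          similarity_boost: 0.75,
          style: 0.5,
          use_speaker_boost: true,
        },
      }),
    },
  };
}

// Send the request and return the "data" object from the response.
async function generateSpeech(text, voiceId, fetchImpl = fetch) {
  const { url, options } = buildGenerateRequest(text, voiceId);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Voice generation failed with HTTP ${response.status}`);
  }
  const payload = await response.json();
  if (!payload.success) {
    throw new Error("Voice generation returned success: false");
  }
  return payload.data; // contains audioUrl, fileName, duration, ...
}
```

The returned `audioUrl` is the value that the phone call endpoint expects, so the two calls chain naturally.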

Phone Call Integration

POST /api/call/make-call
Content-Type: application/json

{
  "phoneNumber": "+1234567890",
  "audioUrl": "/api/audio/speech_1699123456789.mp3",
  "callerId": "VoiceAI",
  "webhook": "https://your-app.com/webhook/call-status"
}
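
A request body like the one above can be assembled and sanity-checked before it is sent. The sketch below is illustrative only: the function name is hypothetical, and treating `callerId` and `webhook` as optional, as well as requiring an HTTPS webhook, are assumptions that this README does not confirm.

```javascript
// Hypothetical builder for the POST /api/call/make-call request.
function buildCallRequest({ phoneNumber, audioUrl, callerId, webhook }) {
  if (!phoneNumber) throw new Error("phoneNumber is required");
  if (!audioUrl) throw new Error("audioUrl is required");

  const body = { phoneNumber, audioUrl };

  // Assumption: callerId and webhook may be omitted.
  if (callerId) body.callerId = callerId;

  if (webhook) {
    // new URL() throws a TypeError on malformed input.
    const parsed = new URL(webhook);
    if (parsed.protocol !== "https:") {
      throw new Error("webhook must use https");
    }
    body.webhook = parsed.toString();
  }

  return {
    path: "/api/call/make-call",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}
```

Passing the `audioUrl` returned by the voice generation endpoint as the `audioUrl` here follows the "generate your voice message first" step in the usage guide.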

Available Endpoints

Endpoint              Method  Description
/api/voice/voices     GET     List available voices
/api/voice/generate   POST    Generate speech from text
/api/voice/stream     POST    Stream speech generation
/api/call/make-call   POST    Initiate phone call
/api/call/status/:id  GET     Get call status
/api/health           GET     Health check
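
As an illustration of how the read-only endpoints might be used, the sketch below probes `/api/health` and builds the path for `/api/call/status/:id`. The function names and the default base URL are assumptions. The health check relies only on the HTTP status, because the response body of that endpoint is not documented here.

```javascript
// Hypothetical helper: build the status path for a given call id.
function callStatusPath(callId) {
  // Encode the id so reserved characters cannot alter the path.
  return `/api/call/status/${encodeURIComponent(callId)}`;
}

// Hypothetical helper: true if GET /api/health answers with a 2xx status.
async function isBackendHealthy(
  baseUrl = "http://localhost:5000",
  fetchImpl = fetch
) {
  try {
    const response = await fetchImpl(`${baseUrl}/api/health`);
    return response.ok;
  } catch {
    // Network-level failure, for example connection refused.
    return false;
  }
}
```

Running `isBackendHealthy()` right after starting the backend is a quick way to confirm that the port and .env settings took effect.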

Project Architecture

voiceforge/
├── backend/                      # Node.js + Express API
│   ├── server.js                 # Application entry point
│   ├── routes/                   # API route handlers
│   │   ├── voice.js              # Text-to-speech endpoints
│   │   ├── call.js               # Phone call endpoints  
│   │   └── utils.js              # Utility functions
│   ├── middleware/               # Custom middleware
│   ├── services/                 # External API integrations
│   └── uploads/                  # Generated audio files
├── frontend/                     # React SPA
│   ├── src/
│   │   ├── components/           # Reusable UI components
│   │   ├── pages/                # Application pages
│   │   ├── hooks/                # Custom React hooks
│   │   ├── services/             # API client functions
│   │   └── utils/                # Helper functions
│   └── public/                   # Static assets
├── docs/                         # Documentation
└── tests/                        # Test suites

Contributing

We love contributions! Here's how you can help make this project even better:

Quick Contributing Guide

  1. Fork the repo and clone your fork
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes and test thoroughly
  4. Run tests: npm run test
  5. Commit: git commit -m "Add amazing feature"
  6. Push: git push origin feature/amazing-feature
  7. Create a Pull Request

Areas We'd Love Help With

  • UI/UX improvements and design enhancements
  • New voice providers and TTS service integrations
  • Analytics dashboard for usage insights and metrics
  • Internationalization and multi-language support
  • Testing coverage and comprehensive test suites
  • Documentation improvements with better examples and tutorials

Community & Support

Need Help?

License & Legal

This project is licensed under the MIT License - see the LICENSE file for details.

Privacy & Security

  • We don't store your voice data permanently
  • API keys are encrypted and securely managed
  • All audio files are automatically cleaned up
  • GDPR compliant data handling

Acknowledgments

Special Thanks

  • ElevenLabs - Revolutionary AI voice technology
  • Exotel - Reliable telephony infrastructure
  • React Community - Amazing frontend framework
  • Node.js Team - Powerful backend runtime

Inspiration & Resources

  • Voice UI Design Patterns
  • Speech Synthesis Markup Language (SSML)
  • Web Audio API Documentation

Built by Rishabh Ranjan Singh

Making voice technology accessible to everyone

GitHub • LinkedIn
