VoiceForge

AI-powered voice automation platform for text-to-speech and automated calling

Powered by ElevenLabs AI • Built for Scale • Developer-Friendly

Getting Started

There are two ways to use VoiceForge:

Option 1: Use the Hosted Platform (Recommended for Quick Start)

Access the fully deployed application instantly - no setup required.

Visit: https://voiceforge-ai.vercel.app

Perfect for:

Testing the platform immediately
Evaluating features before local setup
Using the service without technical configuration
Quick demonstrations and prototypes

Option 2: Run Locally on Your Machine

Set up VoiceForge on your own computer for development, customization, or self-hosting.

Best for:

Developers wanting to modify or extend the platform
Organizations requiring self-hosted solutions
Learning how the system works under the hood
Contributing to the project

Continue reading below for complete local installation instructions.

Local Installation Guide

Prerequisites

Before you begin, ensure you have the following installed:

Node.js (version 18.0 or higher), available from https://nodejs.org
npm (included with Node.js) or the yarn package manager
Git, available from https://git-scm.com

Verify your installations:

```
node --version
npm --version
git --version
```
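
If you would rather check the Node.js requirement from a script, a minimal sketch follows. The helper name `meetsMinimumNode` is hypothetical and not part of the VoiceForge codebase; it only compares the major version against the 18.0 minimum stated in the prerequisites.

```javascript
// Hypothetical helper: checks a Node.js version string against a minimum major version.
function meetsMinimumNode(version, minMajor = 18) {
  // Accepts "v18.17.0" (as printed by `node --version`) or "18.17.0".
  const major = Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// process.version is the version of the Node.js runtime executing this script.
if (!meetsMinimumNode(process.version)) {
  console.error(`Node.js 18.0 or higher is required; found ${process.version}`);
}
```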

Step 1: Clone and Setup

Clone the repository:

```
git clone https://github.com/Rishabh1925/voiceforge.git
cd voiceforge
```

Install backend dependencies:
```
cd backend
npm install
```
Install frontend dependencies:
```
cd ../frontend
npm install
```

Step 2: Environment Configuration

Create environment file:
```
cd ../backend
cp .envbackend .env
```
Get your API keys:
- ElevenLabs API Key:
  - Go to ElevenLabs
  - Sign up for a free account
  - Navigate to Settings > API Keys
  - Copy your API key
- Exotel API (Optional for phone calls):
  - Visit Exotel Developer Portal
  - Sign up and get your credentials

Configure your .env file: Open backend/.env in a text editor and add:

```
PORT=5000

# Required: ElevenLabs Configuration
ELEVENLABS_API_KEY=sk_your_elevenlabs_api_key_here

# Optional: Exotel Configuration (for phone calls)
EXOTEL_ACCOUNT_SID=your_exotel_account_sid
EXOTEL_API_TOKEN=your_exotel_api_token
EXOTEL_PHONE_NUMBER=your_exotel_phone_number
```
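
To catch a missing key before the backend starts, a startup check along these lines may help. This is a sketch, not the project's actual code: the variable names come from the .env template, while the `checkEnv` helper and the rule that phone calls need all three Exotel values are assumptions.

```javascript
// Variable names come from the .env template; the helper itself is hypothetical.
const REQUIRED_VARS = ["ELEVENLABS_API_KEY"];
const EXOTEL_VARS = ["EXOTEL_ACCOUNT_SID", "EXOTEL_API_TOKEN", "EXOTEL_PHONE_NUMBER"];

// Returns which required variables are missing and whether calling can be enabled.
function checkEnv(env) {
  const isSet = (name) => typeof env[name] === "string" && env[name].trim() !== "";
  return {
    missing: REQUIRED_VARS.filter((name) => !isSet(name)),
    // Assumption: phone calls need all three Exotel values.
    callsEnabled: EXOTEL_VARS.every(isSet),
    // Falls back to the port used in the template when PORT is unset or invalid.
    port: Number(env.PORT) || 5000,
  };
}

const result = checkEnv(process.env);
if (result.missing.length > 0) {
  console.warn(`Missing required variables: ${result.missing.join(", ")}`);
}
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the check easy to test with plain objects.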

Step 3: Run the Application

Start both servers, each in its own terminal window.

Terminal 1 - Backend:

```
cd backend
npm run dev
```

Terminal 2 - Frontend:

```
cd frontend
npm start
```

Step 4: Access Your Local Application

Once both servers are running:

Frontend Application: Open http://localhost:3000 in your browser
Backend API: Running on http://localhost:5000
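API Health Check: Visit http://localhost:5000/health (the Available Endpoints table lists this route as `/api/health`; try that path if this one returns 404)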

Troubleshooting Local Setup

Common Issues:

Port already in use:

```
# Kill processes on specific ports
lsof -ti:3000 | xargs kill -9
lsof -ti:5000 | xargs kill -9
```

API Key not working:
- Double-check your ElevenLabs API key
- Ensure no extra spaces in the .env file
- Restart the backend server after changing .env
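
Stray spaces are easy to miss by eye, so a small lint pass over the file can help. The sketch below uses a hypothetical helper, `findEnvSpacingIssues`, that flags lines with whitespace around `=` or at the end of a value. Whether such spacing actually breaks loading depends on the parser the backend uses, so treat a report as a hint rather than a confirmed error.

```javascript
// Hypothetical lint helper: reports .env lines with whitespace around "=" or at line ends.
function findEnvSpacingIssues(text) {
  const issues = [];
  text.split(/\r?\n/).forEach((line, index) => {
    // Skip blank lines and comments.
    if (line.trim() === "" || line.trimStart().startsWith("#")) return;
    const eq = line.indexOf("=");
    if (eq === -1) return;
    const key = line.slice(0, eq);
    const value = line.slice(eq + 1);
    if (key !== key.trim() || value !== value.trim()) {
      // Line numbers are 1-based to match what a text editor shows.
      issues.push({ line: index + 1, key: key.trim() });
    }
  });
  return issues;
}

console.log(findEnvSpacingIssues("PORT=5000\nELEVENLABS_API_KEY = sk_example \n"));
```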

Dependencies not installing:

```
# Clear npm cache
npm cache clean --force

# Delete node_modules and reinstall
rm -rf node_modules package-lock.json
npm install
```

Permission errors on Mac/Linux:
```
sudo npm install -g npm@latest
```

What Makes This Special?

VoiceForge is a complete voice communication platform that integrates AI services so you can:

Convert text to human-like speech using ElevenLabs' AI
Make automated phone calls with custom voice messages
Choose among multiple voice personalities for different use cases
Scale with professional-grade API integrations
Run in production with proper error handling and security

Key Features

Advanced Text-to-Speech
- 20+ AI-generated voices from ElevenLabs
- High-fidelity audio output (MP3, 44.1 kHz)
- Real-time generation with streaming support
- Voice cloning capabilities (premium feature)
- Emotion and style control

Smart Phone Integration
- Automated voice calls via Exotel API
- Call status tracking and analytics
- International number support
- Webhook integrations for call events
- Testing mode for development

Beautiful User Experience
- Responsive design that works everywhere
- Real-time audio waveforms
- Drag-and-drop file uploads
- Dark/light theme support
- Progressive Web App (PWA) ready

Enterprise-Ready
- Secure API key management
- Rate limiting and usage tracking
- Audio file optimization
- Error recovery mechanisms
- Comprehensive logging

Tech Stack

Frontend Powerhouse

React single-page application

Backend Excellence

Node.js with Express

Usage Guide

Generate Natural Speech

Enter your text (supports up to 5,000 characters)
Choose a voice personality from our curated collection
Adjust settings (speed, pitch, emotion)
Click "Generate" and get studio-quality audio
Preview and download your speech file

Make AI Phone Calls

Generate your voice message first
Enter phone number (international format: +1234567890)
Schedule or call immediately
Track call status in real-time
Review call analytics and recordings
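
The international format in step 2 follows the E.164 convention: a leading `+`, then up to 15 digits, the first of which is not 0. A minimal validation sketch is shown here; the helper name `normalizePhoneNumber` is hypothetical, and whether VoiceForge or Exotel applies stricter rules is not stated in this README.

```javascript
// E.164-style check: "+" then 2 to 15 digits, the first of which is not 0.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Hypothetical helper: strips common separators, then validates the result.
// Returns the compact number, or null when the input is not in international format.
function normalizePhoneNumber(input) {
  const compact = input.replace(/[\s().-]/g, "");
  return E164_PATTERN.test(compact) ? compact : null;
}

console.log(normalizePhoneNumber("+1 (234) 567-890"));
```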

Pro Tips

For better results:

Use natural punctuation and pauses
Break long texts into shorter segments
Test different voices for your content type
Use SSML tags for advanced speech control
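
One way to break long texts into shorter segments is to split at sentence boundaries while staying under the 5,000-character limit mentioned in the usage guide. The sketch below is an illustration only; `splitIntoSegments` is a hypothetical helper, not a function provided by VoiceForge.

```javascript
// Hypothetical helper: splits text into chunks of at most maxLength characters,
// preferring sentence boundaries so each chunk reads naturally.
function splitIntoSegments(text, maxLength = 5000) {
  // A sentence is a run of text up to ".", "!" or "?", plus trailing whitespace.
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) || [];
  const segments = [];
  let current = "";
  for (const sentence of sentences) {
    // Start a new segment when adding this sentence would exceed the limit.
    if (current.length + sentence.length > maxLength && current !== "") {
      segments.push(current.trim());
      current = "";
    }
    // A single sentence longer than maxLength is hard-split.
    let rest = sentence;
    while (rest.length > maxLength) {
      segments.push(rest.slice(0, maxLength));
      rest = rest.slice(maxLength);
    }
    current += rest;
  }
  if (current.trim() !== "") segments.push(current.trim());
  return segments;
}

console.log(splitIntoSegments("One. Two. Three.", 10));
```

Each segment can then be sent as its own generation request and the resulting audio files played in order.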

API Documentation

Voice Generation

```
POST /api/voice/generate
Content-Type: application/json

{
  "text": "Hello, this is your AI assistant speaking!",
  "voice_id": "pNInz6obpgDQGcFmaJgB",
  "model_id": "eleven_monolingual_v1",
  "voice_settings": {
    "stability": 0.75,
    "similarity_boost": 0.75,
    "style": 0.5,
    "use_speaker_boost": true
  }
}
```

Response format:

```
{
  "success": true,
  "data": {
    "audioUrl": "/api/audio/speech_1699123456789.mp3",
    "fileName": "speech_1699123456789.mp3",
    "duration": 3.45,
    "wordCount": 8,
    "voiceUsed": "Adam - Natural Male",
    "generatedAt": "2023-11-04T10:30:45.123Z"
  },
  "usage": {
    "charactersUsed": 43,
    "charactersRemaining": 9957
  }
}
```
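
A client call against this endpoint might look like the following sketch. It relies on the global `fetch` available in Node.js 18 and later. The function names are hypothetical, and the error handling assumes a failed request returns a non-2xx status or `"success": false`, which this README does not confirm.

```javascript
// Hypothetical client helper: builds fetch options for POST /api/voice/generate.
function buildGenerateRequest(text, voiceId) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      text,
      voice_id: voiceId,
      model_id: "eleven_monolingual_v1",
      voice_settings: { stability: 0.75, similarity_boost: 0.75 },
    }),
  };
}

// Sends the request and returns the audio URL from the documented response shape.
async function generateSpeech(baseUrl, text, voiceId) {
  const response = await fetch(
    `${baseUrl}/api/voice/generate`,
    buildGenerateRequest(text, voiceId)
  );
  const payload = await response.json();
  // Assumption: failures are signalled by the HTTP status or by success === false.
  if (!response.ok || !payload.success) {
    throw new Error(`Voice generation failed with status ${response.status}`);
  }
  return payload.data.audioUrl;
}
```

With the backend running locally, `generateSpeech("http://localhost:5000", "Hello", "pNInz6obpgDQGcFmaJgB")` would resolve to a path such as the `audioUrl` in the example response.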

Phone Call Integration

```
POST /api/call/make-call
Content-Type: application/json

{
  "phoneNumber": "+1234567890",
  "audioUrl": "/api/audio/speech_1699123456789.mp3",
  "callerId": "VoiceAI",
  "webhook": "https://your-app.com/webhook/call-status"
}
```
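
Validating the request body on the client avoids a round trip for obviously bad input. The sketch below is an assumption-laden illustration: `buildCallBody` is a hypothetical helper, and treating `webhook` as optional and `callerId` as defaulting to `"VoiceAI"` are guesses based on the example request, not documented behavior.

```javascript
// Hypothetical helper: builds the JSON body for POST /api/call/make-call.
function buildCallBody({ phoneNumber, audioUrl, callerId = "VoiceAI", webhook }) {
  if (!/^\+\d+$/.test(phoneNumber)) {
    throw new Error("phoneNumber must be in international format, e.g. +1234567890");
  }
  if (!audioUrl) {
    throw new Error("audioUrl is required; generate the voice message first");
  }
  const body = { phoneNumber, audioUrl, callerId };
  // Assumption: the webhook is optional, so include it only when supplied.
  if (webhook) body.webhook = webhook;
  return body;
}

console.log(
  JSON.stringify(
    buildCallBody({
      phoneNumber: "+1234567890",
      audioUrl: "/api/audio/speech_1699123456789.mp3",
    })
  )
);
```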

Available Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| `/api/voice/voices` | `GET` | List available voices |
| `/api/voice/generate` | `POST` | Generate speech from text |
| `/api/voice/stream` | `POST` | Stream speech generation |
| `/api/call/make-call` | `POST` | Initiate phone call |
| `/api/call/status/:id` | `GET` | Get call status |
| `/api/health` | `GET` | Health check |

Project Architecture

```
voiceforge/
├── backend/                      # Node.js + Express API
│   ├── server.js                 # Application entry point
│   ├── routes/                   # API route handlers
│   │   ├── voice.js              # Text-to-speech endpoints
│   │   ├── call.js               # Phone call endpoints
│   │   └── utils.js              # Utility functions
│   ├── middleware/               # Custom middleware
│   ├── services/                 # External API integrations
│   └── uploads/                  # Generated audio files
├── frontend/                     # React SPA
│   ├── src/
│   │   ├── components/           # Reusable UI components
│   │   ├── pages/                # Application pages
│   │   ├── hooks/                # Custom React hooks
│   │   ├── services/             # API client functions
│   │   └── utils/                # Helper functions
│   └── public/                   # Static assets
├── docs/                         # Documentation
└── tests/                        # Test suites
```

Contributing

We love contributions! Here's how you can help make this project even better:

Quick Contributing Guide

Fork the repo and clone your fork
Create a feature branch: `git checkout -b feature/amazing-feature`
Make your changes and test thoroughly
Run tests: `npm run test`
Commit: `git commit -m "Add amazing feature"`
Push: `git push origin feature/amazing-feature`
Create a Pull Request

Areas We'd Love Help With

UI/UX improvements and design enhancements
New voice providers and TTS service integrations
Analytics dashboard for usage insights and metrics
Internationalization and multi-language support
Testing coverage and comprehensive test suites
Documentation improvements with better examples and tutorials

Community & Support

Need Help?

Check the Documentation - Comprehensive guides and tutorials
Report Issues - Bug reports and feature requests
Join Discussions - Community Q&A

License & Legal

This project is licensed under the MIT License - see the LICENSE file for details.

Privacy & Security

We don't store your voice data permanently
API keys are encrypted and securely managed
All audio files are automatically cleaned up
GDPR compliant data handling

Acknowledgments

Special Thanks

ElevenLabs - Revolutionary AI voice technology
Exotel - Reliable telephony infrastructure
React Community - Amazing frontend framework
Node.js Team - Powerful backend runtime

Inspiration & Resources

Voice UI Design Patterns
Speech Synthesis Markup Language (SSML)
Web Audio API Documentation

Built by Rishabh Ranjan Singh

Making voice technology accessible to everyone
