AI-powered voice automation platform for text-to-speech and automated calling
Powered by ElevenLabs AI • Built for Scale • Developer-Friendly
There are two ways to use VoiceForge:
Access the fully deployed application instantly - no setup required.
Visit: https://voiceforge-ai.vercel.app
Perfect for:
- Testing the platform immediately
- Evaluating features before local setup
- Using the service without technical configuration
- Quick demonstrations and prototypes
Set up VoiceForge on your own computer for development, customization, or self-hosting.
Best for:
- Developers wanting to modify or extend the platform
- Organizations requiring self-hosted solutions
- Learning how the system works under the hood
- Contributing to the project
Continue reading below for complete local installation instructions.
Before you begin, ensure you have the following installed:
- Node.js (version 18.0 or higher)
- npm (comes with Node.js) or yarn package manager
- Git
Verify your installations:
node --version
npm --version
git --version
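The Node.js requirement can also be checked from a script. The following is a minimal sketch; the helper name `meetsMinimumNode` is hypothetical and is not part of VoiceForge.

```javascript
// Hypothetical helper (not part of VoiceForge): checks a Node.js version
// string such as "v18.17.0" against a minimum major version.
function meetsMinimumNode(version, minMajor = 18) {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Node.js version string: ${version}`);
  }
  return Number(match[1]) >= minMajor;
}

// process.version is the version of the Node.js binary running this script.
if (!meetsMinimumNode(process.version)) {
  console.error(`Node.js 18.0 or higher is required, found ${process.version}`);
}
```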
Clone the repository:
git clone https://github.com/Rishabh1925/voiceforge.git
cd voiceforge
Install backend dependencies:
cd backend
npm install
Install frontend dependencies:
cd ../frontend
npm install
Create environment file:
cd ../backend
cp .envbackend .env
Get your API keys:
ElevenLabs API Key:
- Go to ElevenLabs
- Sign up for a free account
- Navigate to Settings > API Keys
- Copy your API key
Exotel API (Optional for phone calls):
- Visit Exotel Developer Portal
- Sign up and get your credentials
Configure your .env file:
Open backend/.env in a text editor and add:
PORT=5000
# Required: ElevenLabs Configuration
ELEVENLABS_API_KEY=sk_your_elevenlabs_api_key_here
# Optional: Exotel Configuration (for phone calls)
EXOTEL_ACCOUNT_SID=your_exotel_account_sid
EXOTEL_API_TOKEN=your_exotel_api_token
EXOTEL_PHONE_NUMBER=your_exotel_phone_number
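A quick sanity check of these variables can catch a missing key or stray whitespace before the backend starts. The sketch below is an illustration only: the variable names come from the .env example above, while `checkEnv` is a hypothetical helper and not VoiceForge's actual startup code.

```javascript
// Variable names are taken from the .env example above; checkEnv itself is
// a hypothetical helper, not part of the VoiceForge codebase.
const REQUIRED = ["ELEVENLABS_API_KEY"];
const OPTIONAL_EXOTEL = [
  "EXOTEL_ACCOUNT_SID",
  "EXOTEL_API_TOKEN",
  "EXOTEL_PHONE_NUMBER",
];

function checkEnv(env) {
  const isSet = (name) =>
    typeof env[name] === "string" && env[name].trim() !== "";
  // Required variables that are absent or blank.
  const missing = REQUIRED.filter((name) => !isSet(name));
  // Variables whose value carries leading or trailing whitespace.
  const padded = [...REQUIRED, ...OPTIONAL_EXOTEL].filter(
    (name) => isSet(name) && env[name] !== env[name].trim()
  );
  // Phone calls need all three Exotel values.
  const callsEnabled = OPTIONAL_EXOTEL.every(isSet);
  return { missing, padded, callsEnabled };
}

const report = checkEnv(process.env);
if (report.missing.length > 0) {
  console.error(`Missing required variables: ${report.missing.join(", ")}`);
}
```

Treating the three Exotel values as all-or-nothing is an assumption made here because they are listed together as optional.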
Start both servers:
Open two terminal windows:
Terminal 1 - Backend:
cd backend
npm run dev
Terminal 2 - Frontend:
cd frontend
npm start
Once both servers are running:
- Frontend Application: Open http://localhost:3000 in your browser
- Backend API: Running on http://localhost:5000
- API Health Check: Visit http://localhost:5000/health
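The health check can also be run from a script. This is a sketch under stated assumptions: the base URL and the /health path are taken from the list above (the endpoint table later in this document lists /api/health, so the path is left configurable), the response body format is not documented here, and `healthUrl` and `isBackendUp` are hypothetical names.

```javascript
// Builds the health check URL. Defaults follow the local setup described above.
function healthUrl(baseUrl = "http://localhost:5000", path = "/health") {
  return new URL(path, baseUrl).toString();
}

// Resolves to true if the endpoint answers with a 2xx status.
// fetchImpl is injectable so the function can be exercised without a server;
// it defaults to the global fetch available in Node.js 18+.
async function isBackendUp(baseUrl, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(healthUrl(baseUrl));
    return response.ok;
  } catch {
    return false; // connection refused, DNS failure, and similar errors
  }
}

// Example usage (requires the backend to be running):
// isBackendUp().then((up) => console.log(up ? "backend is up" : "backend is not reachable"));
```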
Common Issues:
Port already in use:
# Kill processes on specific ports
lsof -ti:3000 | xargs kill -9
lsof -ti:5000 | xargs kill -9
API Key not working:
- Double-check your ElevenLabs API key
- Ensure no extra spaces in the .env file
- Restart the backend server after changing .env
Dependencies not installing:
# Clear npm cache
npm cache clean --force
# Delete node_modules and reinstall
rm -rf node_modules package-lock.json
npm install
Permission errors on Mac/Linux:
sudo npm install -g npm@latest
VoiceForge is a voice communication platform that integrates AI services to:
- Convert text to human-like speech using ElevenLabs' AI
- Make automated phone calls with custom voice messages
- Provide multiple voice personalities for different use cases
- Scale through API integrations with ElevenLabs and Exotel
- Run in production with error handling and security built in
- Enter your text (supports up to 5,000 characters)
- Choose a voice personality from our curated collection
- Adjust settings (speed, pitch, emotion)
- Click "Generate" and get studio-quality audio
- Preview and download your speech file
- Generate your voice message first
- Enter phone number (international format: +1234567890)
- Schedule or call immediately
- Track call status in real-time
- Review call analytics and recordings
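The international format in the phone number step matches the E.164 convention ("+" followed by the country code and number, at most 15 digits). A minimal client-side check might look like the sketch below; `normalizePhoneNumber` is a hypothetical helper, and whether the VoiceForge backend itself tolerates spaces or dashes is not stated, so separators are stripped before validating.

```javascript
// E.164: "+" followed by 2 to 15 digits, the first of which is not 0.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Returns the number without visual separators, or null if it is not
// a plausible E.164 number. Hypothetical helper for illustration.
function normalizePhoneNumber(input) {
  // Remove spaces, parentheses, dots, and dashes that people commonly type.
  const compact = input.replace(/[\s().-]/g, "");
  return E164_PATTERN.test(compact) ? compact : null;
}
```

This only checks the shape of the number; it does not confirm that the number exists or can be reached.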
For better results:
- Use natural punctuation and pauses
- Break long texts into shorter segments
- Test different voices for your content type
- Use SSML tags for advanced speech control
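Breaking long text into segments can be automated. The sketch below splits at sentence boundaries and falls back to a hard split for a single oversized sentence. The 5,000 default mirrors the character limit mentioned in the usage steps; `splitText` and its sentence-detection rule are assumptions for illustration, not part of VoiceForge.

```javascript
// Splits text into chunks of at most maxLength characters, preferring to
// break at sentence boundaries. Hypothetical helper for illustration.
function splitText(text, maxLength = 5000) {
  // A "sentence" is a run of text ending in ., ! or ? plus trailing
  // whitespace; any final run without closing punctuation is kept as well.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) ?? [];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    // Close the current chunk if this sentence would overflow it.
    if (current !== "" && current.length + sentence.length > maxLength) {
      chunks.push(current.trim());
      current = "";
    }
    // A single sentence longer than maxLength is split at fixed offsets.
    let rest = sentence;
    while (rest.length > maxLength) {
      chunks.push(rest.slice(0, maxLength));
      rest = rest.slice(maxLength);
    }
    current += rest;
  }
  if (current.trim() !== "") {
    chunks.push(current.trim());
  }
  return chunks;
}
```

Each chunk can then be sent as its own generation request, which keeps every request under the limit while avoiding breaks in the middle of a sentence where possible.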
POST /api/voice/generate
Content-Type: application/json
{
"text": "Hello, this is your AI assistant speaking!",
"voice_id": "pNInz6obpgDQGcFmaJgB",
"model_id": "eleven_monolingual_v1",
"voice_settings": {
"stability": 0.75,
"similarity_boost": 0.75,
"style": 0.5,
"use_speaker_boost": true
}
}
Response format:
{
"success": true,
"data": {
"audioUrl": "/api/audio/speech_1699123456789.mp3",
"fileName": "speech_1699123456789.mp3",
"duration": 3.45,
"wordCount": 8,
"voiceUsed": "Adam - Natural Male",
"generatedAt": "2023-11-04T10:30:45.123Z"
},
"usage": {
"charactersUsed": 43,
"charactersRemaining": 9957
}
}
POST /api/call/make-call
Content-Type: application/json
{
"phoneNumber": "+1234567890",
"audioUrl": "/api/audio/speech_1699123456789.mp3",
"callerId": "VoiceAI",
"webhook": "https://your-app.com/webhook/call-status"
}
| Endpoint | Method | Description |
|---|---|---|
| /api/voice/voices | GET | List available voices |
| /api/voice/generate | POST | Generate speech from text |
| /api/voice/stream | POST | Stream speech generation |
| /api/call/make-call | POST | Initiate phone call |
| /api/call/status/:id | GET | Get call status |
| /api/health | GET | Health check |
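The two POST endpoints can be chained: generate speech, then pass the returned audioUrl to the call endpoint. The following is a sketch under several assumptions: the backend runs at http://localhost:5000 as in the local setup, the make-call endpoint returns JSON (its response format is not documented above), and callerId and webhook can be omitted. The function names are hypothetical.

```javascript
// Base URL assumed from the local setup; adjust for your deployment.
const BASE_URL = "http://localhost:5000";

// Builds fetch() options for a JSON POST request.
function jsonPost(body) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Request body for POST /api/voice/generate, using field names and
// default values from the request example above.
function generateBody(text, voiceId, voiceSettings = {}) {
  return {
    text,
    voice_id: voiceId,
    model_id: "eleven_monolingual_v1",
    voice_settings: { stability: 0.75, similarity_boost: 0.75, ...voiceSettings },
  };
}

// Generates speech, then starts a call that plays the resulting audio.
// fetchImpl defaults to the global fetch available in Node.js 18+.
async function generateAndCall(text, voiceId, phoneNumber, fetchImpl = fetch) {
  const speechRes = await fetchImpl(
    `${BASE_URL}/api/voice/generate`,
    jsonPost(generateBody(text, voiceId))
  );
  const speech = await speechRes.json();
  if (!speech.success) {
    throw new Error("Speech generation failed");
  }
  const callRes = await fetchImpl(
    `${BASE_URL}/api/call/make-call`,
    jsonPost({ phoneNumber, audioUrl: speech.data.audioUrl })
  );
  return callRes.json(); // assumed to be JSON; format not documented
}
```

Keeping the request builders separate from the network calls makes the payloads easy to inspect or test without a running backend.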
voiceforge/
├── backend/ # Node.js + Express API
│ ├── server.js # Application entry point
│ ├── routes/ # API route handlers
│ │ ├── voice.js # Text-to-speech endpoints
│ │ ├── call.js # Phone call endpoints
│ │ └── utils.js # Utility functions
│ ├── middleware/ # Custom middleware
│ ├── services/ # External API integrations
│ └── uploads/ # Generated audio files
├── frontend/ # React SPA
│ ├── src/
│ │ ├── components/ # Reusable UI components
│ │ ├── pages/ # Application pages
│ │ ├── hooks/ # Custom React hooks
│ │ ├── services/ # API client functions
│ │ └── utils/ # Helper functions
│ └── public/ # Static assets
├── docs/ # Documentation
└── tests/ # Test suites
We love contributions! Here's how you can help make this project even better:
- Fork the repo and clone your fork
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes and test thoroughly
- Run tests: npm run test
- Commit: git commit -m "Add amazing feature"
- Push: git push origin feature/amazing-feature
- Create a Pull Request
- UI/UX improvements and design enhancements
- New voice providers and TTS service integrations
- Analytics dashboard for usage insights and metrics
- Internationalization and multi-language support
- Testing coverage and comprehensive test suites
- Documentation improvements with better examples and tutorials
- Check the Documentation - Comprehensive guides and tutorials
- Report Issues - Bug reports and feature requests
- Join Discussions - Community Q&A
This project is licensed under the MIT License - see the LICENSE file for details.
- We don't store your voice data permanently
- API keys are encrypted and securely managed
- All audio files are automatically cleaned up
- GDPR-compliant data handling
- ElevenLabs - Revolutionary AI voice technology
- Exotel - Reliable telephony infrastructure
- React Community - Amazing frontend framework
- Node.js Team - Powerful backend runtime
- Voice UI Design Patterns
- Speech Synthesis Markup Language (SSML)
- Web Audio API Documentation