AI Video Narrator by RapidAI
An advanced web application that automatically generates professional narrations for videos using artificial intelligence.
Try it here:
https://ai-video-narrator-by-rapidai.replit.app/
https://c73e736d-0664-4646-ae74-68c1e6afe604-00-1dfpte62mystr.spock.replit.dev/
An advanced web application that automatically generates professional narration for videos using GPT-4V vision analysis and text-to-speech conversion. The system provides a seamless experience for video processing, custom narration editing, and final video generation.
- Intelligent Video Analysis: Uses GPT-4V for content analysis and narration generation
- Multiple Voice Options: Choose from six voices:
  - Alloy (Neutral)
  - Echo (Warm)
  - Fable (British)
  - Onyx (Professional)
  - Nova (Friendly)
  - Shimmer (Cheerful)
- Efficient Processing:
  - Chunked video processing for handling longer videos
  - Optimized frame extraction
  - Memory management and rate limiting
- Real-time Preview: Live video preview with synchronized audio
- Custom Script Editing: Edit and update narration scripts with real-time preview
- Secure File Handling: Verified upload/download operations with proper error management
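The interval-based frame extraction and chunked processing listed under Efficient Processing could look roughly like the sketch below. It is only an illustration: the function names, the one-frame-per-second sampling interval, and the chunk size of 10 are assumptions, not values taken from the application's source.

```python
from typing import Iterator, List


def select_frame_indices(total_frames: int, fps: float, every_seconds: float = 1.0) -> List[int]:
    """Pick one frame index per `every_seconds` of video (assumed sampling policy)."""
    if total_frames <= 0 or fps <= 0:
        return []
    # Number of frames to skip between samples; never less than one frame.
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


def chunked(indices: List[int], chunk_size: int = 10) -> Iterator[List[int]]:
    """Yield frame indices in fixed-size batches so only one batch is handled at a time."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(indices), chunk_size):
        yield indices[start:start + chunk_size]
```

With OpenCV, each selected index could then be read by seeking with `cap.set(cv2.CAP_PROP_POS_FRAMES, index)` followed by `cap.read()`, so frames that are not sampled are never decoded into memory.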
- Clone the repository
- Set up environment variables:
OPENAI_API_KEY=your_openai_api_key
- Install dependencies:
pip install flask flask-sqlalchemy opencv-python openai moviepy requests
- Initialize the database:
python
>>> from app import db
>>> db.create_all()
- Start the server:
python main.py
- Access the web interface at http://localhost:5000
- Upload a video (supported formats: MP4, AVI, MOV, WMV)
- Select your preferred voice
- Click "Generate Narration"
- Edit the generated script if needed
- Preview and download the final narrated video
Upload a video for narration generation.
- Body: multipart/form-data
- video: Video file (required)
- voice: Voice option (default: "onyx")
- Returns: JSON with script and output path
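A client for the upload endpoint described above might be sketched as follows, using only the standard library. The route `/upload` and the POST method are assumptions (this document does not state them); the field names `video` and `voice`, the default voice, and the voice list come from the text.

```python
import urllib.request
import uuid

BASE_URL = "http://localhost:5000"
UPLOAD_PATH = "/upload"  # hypothetical route; the real path is not given in this document
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}


def encode_multipart(voice: str, filename: str, video_bytes: bytes):
    """Build a multipart/form-data body with a `voice` field and a `video` file part."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="voice"',
        b"",
        voice.encode(),
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="video"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        video_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(filename: str, video_bytes: bytes, voice: str = "onyx") -> urllib.request.Request:
    """Prepare (but do not send) the upload request; POST is an assumed method."""
    body, content_type = encode_multipart(voice, filename, video_bytes)
    return urllib.request.Request(
        BASE_URL + UPLOAD_PATH,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

The prepared request can be sent with `urllib.request.urlopen(request)` and the response parsed with `json.load`. The exact keys of the returned JSON are not specified here, so inspect the response before relying on any field name.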
Update the narration script for a processed video.
- Body: JSON
- script: New narration text
- video_id: ID of the processed video
- Returns: JSON with updated script and output path
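The script-update call could be prepared along these lines. The route `/update_script` and the POST method are hypothetical placeholders, since neither is stated in this document; only the `script` and `video_id` body fields come from the description above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"
UPDATE_PATH = "/update_script"  # hypothetical route; not confirmed by this document


def build_update_request(video_id: int, script: str) -> urllib.request.Request:
    """Prepare (but do not send) a JSON request carrying the new narration text."""
    payload = json.dumps({"video_id": video_id, "script": script}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + UPDATE_PATH,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumed HTTP method
    )
```

Sending the request with `urllib.request.urlopen` should return the JSON described above, containing the updated script and output path.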
Download the processed video.
- Parameters:
- filename: Name of the processed video file
- Returns: Video file download
- Frame Extraction
  - Optimized frame selection using intervals
  - Memory-efficient processing with chunking
  - Quality optimization for large videos
- Content Analysis
  - GPT-4V vision analysis for frame understanding
  - Contextual narration generation
  - Rate limit handling with exponential backoff
- Audio Generation
  - Text-to-speech conversion using OpenAI's TTS API
  - Multiple voice options
  - Audio-video synchronization
- File Management
  - Secure file handling
  - Automatic cleanup of temporary files
  - Progress tracking and status updates
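The rate limit handling with exponential backoff mentioned under Content Analysis can be illustrated with a small retry helper. This is a generic sketch, not the application's actual code: `RateLimited` is a stand-in exception, and the retry count and base delay are assumed defaults.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Stand-in for whatever rate-limit error the API client raises (assumption)."""


def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (RateLimited,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, doubling the wait after each rate-limit failure."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            # Waits base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")  # the loop always returns or raises
```

With version 1.x of the OpenAI Python SDK, the helper could be called with `retry_on=(openai.RateLimitError,)`. Adding a small random jitter to each delay is a common refinement when several requests retry at once.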
CREATE TABLE video (
id INTEGER PRIMARY KEY,
filename VARCHAR(255) NOT NULL,
original_filename VARCHAR(255) NOT NULL,
status VARCHAR(50) DEFAULT 'processing',
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
narration_text TEXT,
output_path VARCHAR(255)
);
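To see how the column defaults in this schema behave, the table can be exercised with Python's built-in `sqlite3` module. This is a sketch under assumptions: the application itself uses Flask-SQLAlchemy, the database engine is not stated in this document, and the `completed` status value and helper names are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE video (
    id INTEGER PRIMARY KEY,
    filename VARCHAR(255) NOT NULL,
    original_filename VARCHAR(255) NOT NULL,
    status VARCHAR(50) DEFAULT 'processing',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    narration_text TEXT,
    output_path VARCHAR(255)
);
"""


def create_video(conn: sqlite3.Connection, filename: str, original_filename: str) -> int:
    """Insert a new row; status and created_at are filled in by the column defaults."""
    cur = conn.execute(
        "INSERT INTO video (filename, original_filename) VALUES (?, ?)",
        (filename, original_filename),
    )
    conn.commit()
    return cur.lastrowid


def mark_completed(conn: sqlite3.Connection, video_id: int, narration_text: str, output_path: str) -> None:
    """Record the finished narration; 'completed' is an assumed status value."""
    conn.execute(
        "UPDATE video SET status = ?, narration_text = ?, output_path = ? WHERE id = ?",
        ("completed", narration_text, output_path, video_id),
    )
    conn.commit()
```

A quick check is to open `sqlite3.connect(":memory:")`, run `conn.executescript(SCHEMA)`, and call `create_video`; the new row starts with status `processing` until `mark_completed` is called.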
- Flask: Web framework
- Flask-SQLAlchemy: Database ORM
- OpenCV: Video processing
- OpenAI: GPT-4V and TTS API
- MoviePy: Video editing
- Bootstrap: UI framework
The application implements comprehensive error handling:
- File validation
- API rate limiting
- Processing failures
- Database operations
- File system operations
Required environment variables:
- OPENAI_API_KEY: Your OpenAI API key for GPT-4V and TTS services
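One way to fail fast when the key is missing is to check for it at startup. The helper below is a minimal sketch; its name and error message are illustrative and not taken from the application.

```python
import os


def load_openai_api_key() -> str:
    """Return the API key from the environment, or raise if it is unset or blank."""
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

Calling this once before the server starts turns a missing key into an immediate, readable error instead of a failure on the first narration request.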
- Secure file upload handling
- Path traversal prevention
- File type validation
- Size limitations (100MB max)
- Proper error messages
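The file type, size, and path traversal checks listed above could be sketched as below. The function names are hypothetical, the 100MB limit is interpreted as 100 * 1024 * 1024 bytes (an assumption), and the allowed extensions are the supported formats named in this document.

```python
import os

ALLOWED_EXTENSIONS = {".mp4", ".avi", ".mov", ".wmv"}
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # 100MB, assuming binary megabytes


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file type or size is not acceptable."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 100MB limit")


def safe_join(base_dir: str, filename: str) -> str:
    """Resolve `filename` inside `base_dir`, rejecting paths that escape it."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, filename))
    # commonpath equals base only when target lies inside the base directory.
    if target == base or os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes the upload directory")
    return target
```

In a Flask application, Werkzeug's `secure_filename` and the `MAX_CONTENT_LENGTH` configuration key offer comparable protection; whether this project uses them is not stated here.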
RapidAI
The system combines GPT-4V's visual understanding capabilities with OpenAI's text-to-speech technology to create engaging video narrations.
Features
- Automated video content analysis using GPT-4V
- Professional narration generation in multiple voices
- Support for various video formats (MP4, AVI, MOV, WMV)
- Custom narration script editing
- Real-time video preview with synchronized audio
- Multiple voice options (Alloy, Echo, Fable, Onyx, Nova, Shimmer)
- Efficient processing of longer videos through chunking
- Secure file handling with size limits and format validation
Technical Features
- Optimized frame extraction and video processing
- Rate limiting and memory management
- Comprehensive error handling
- Secure file operations
- Database-backed video processing queue
- Client-side file validation
Requirements
- Python 3.8+
- OpenAI API key
- FFmpeg for video processing
Usage
- Upload a video (supported formats: MP4, AVI, MOV, WMV)
- Select preferred narration voice
- Wait for automated processing
- Review and edit generated narration if desired
- Preview the narrated video
- Download the final video
Limitations
- Maximum file size: 100MB
- Processing time depends on video length
- Requires stable internet connection
- API rate limits apply