This Python script processes a video file, generates a compelling description, creates a voiceover script in the style of David Attenborough, and synthesizes the voiceover using OpenAI's Text-to-Speech API.
- Extracts frames from a video file
- Generates a video description using GPT-4 Vision
- Creates a voiceover script in David Attenborough's style
- Synthesizes the voiceover using OpenAI's Text-to-Speech API
Before running the script, make sure you have the following installed:
- Python 3.x
- OpenCV (
opencv-python
) - OpenAI Python library
- Requests library
Clone the Repo:
git clone https://github.com/RaheesAhmed/video_description_generator.git
Navigate to the Directory:
cd video_description_generator
You can install the required libraries using pip:
pip install opencv-python openai requests
- Clone this repository or download the script.
- Set up your OpenAI API key as an environment variable in
.env
file :OPENAI_API_KEY='your-api-key-here'
- Place your video file in the
data
directory and name itbison.mp4
, or modify the script to use a different file path. - Run the script:
python main.py
The script will:
- Extract frames from the video
- Generate a video description
- Create a voiceover script
- Synthesize the voiceover audio
The script will print:
- The number of frames extracted from the video
- The generated video description
- The voiceover script
The synthesized audio will be available as a binary object, which can be saved to a file or played using appropriate audio libraries.
This script uses OpenAI's GPT-4 Vision model and Text-to-Speech API, which may incur costs. Make sure you're aware of the pricing and your usage limits.