A powerful, modular tool for automatically extracting, analyzing, and documenting YouTube video content using AI. Perfect for researchers, content creators, educators, and professionals who need to process and analyze video content from any domain at scale.
- Transcript Extraction: Automatic extraction of video transcripts with timestamps
- AI-Powered Analysis: Intelligent content summarization using OpenAI GPT models
- Multi-language Support: UI and outputs in English and Spanish (extensible)
- Flexible Output Formats: Markdown, JSON, and plain text exports
- Batch Processing: Process multiple videos efficiently
- Transcript-Only Mode: Extract transcripts without AI analysis for manual processing
- Configurable AI Analysis: Toggle AI summarization on/off
- Separate Transcript Files: Export clean transcripts for external tools (ChatGPT, Claude, etc.)
- Modular Architecture: Clean, maintainable codebase with separate services
- Comprehensive Documentation: Structured analysis with key points, tools, concepts, and more
- Full Analysis Reports: Complete markdown documents with AI insights
- Raw Transcripts: Clean text files with optional timestamps
- JSON Data Exports: Structured data for further processing
- Master Index: Overview of all processed videos with statistics
- Multi-format Support: Choose between markdown, JSON, or both
- Python 3.8 or higher
- OpenAI API key (optional, for AI analysis)
- Internet connection for YouTube access
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/youtube-content-analyzer.git
  cd youtube-content-analyzer
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure the tool:

  ```bash
  cp config/config_template.json config.json
  ```

- Edit configuration (add your OpenAI API key):

  ```json
  {
    "openai_api_key": "your-openai-api-key-here",
    "features": {
      "generate_ai_summary": true,
      "export_full_transcript": true,
      "transcript_only_mode": false
    },
    "output": {
      "language": "en",
      "formats": ["markdown", "json"],
      "separate_transcript_file": true
    }
  }
  ```

Process a single video with AI analysis:

```bash
python main.py "https://www.youtube.com/watch?v=VIDEO_ID"
```

Extract transcript only (no AI analysis):

```bash
python main.py "https://www.youtube.com/watch?v=VIDEO_ID" --transcript-only
```

Process multiple videos:

```bash
python main.py URL1 URL2 URL3 --batch-size 5
```

Custom configuration:
```bash
python main.py URL1 --config my_config.json --language es
```

```
youtube-content-analyzer/
├── src/  # Source code
│   ├── core/  # Core application logic
│   │   └── youtube_research_tool.py
│   ├── services/  # Service modules
│   │   ├── transcript_service.py  # YouTube transcript extraction
│   │   ├── openai_service.py  # AI analysis service
│   │   └── document_service.py  # Document generation
│   └── utils/  # Utilities
│       └── i18n.py  # Internationalization
├── config/  # Configuration files
│   └── config_template.json  # Configuration template
├── locales/  # Language files
│   ├── en.json  # English translations
│   └── es.json  # Spanish translations
├── docs/  # Documentation
├── examples/  # Usage examples
├── tests/  # Unit tests
├── main.py  # CLI entry point
├── requirements.txt  # Python dependencies
└── README.md  # This file
```
```json
{
  "openai_api_key": "your-key-here",
  "openai_model": "gpt-4o-mini",
  "output_directory": "./research_output",
  "languages": ["en", "es"],
  "focus_topics": [
    "Technology", "Programming", "AI & Machine Learning",
    "Business", "Education", "Science"
  ]
}
```

```json
{
  "features": {
    "generate_ai_summary": true,          // Enable/disable AI analysis
    "export_full_transcript": true,       // Include full transcript in output
    "transcript_only_mode": false,        // Only extract transcripts
    "include_timestamp_transcript": true
  }
}
```

```json
{
  "output": {
    "language": "en",                     // UI language (en/es)
    "formats": ["markdown", "json"],      // Output formats
    "separate_transcript_file": true      // Create separate .txt files
  }
}
```

The `//` comments are explanatory only. Standard JSON does not allow comments, so they may need to be removed from a real `config.json`.

```bash
python main.py "https://www.youtube.com/watch?v=VIDEO_ID"
```

Output:
- `VIDEO_ID_analysis.md` - Full analysis with AI insights
- `VIDEO_ID_transcript.txt` - Clean transcript file
- `VIDEO_ID_data.json` - Structured data export
- `INDEX.md` - Master index with summary
- `results.json` - Processing results
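
The per-video files listed above are named after the video ID. As a rough illustration of how an ID could be derived from a URL and mapped to those names, here is a minimal sketch. The helper names `extract_video_id` and `expected_outputs` are hypothetical and not part of this project, and the sketch assumes `VIDEO_ID` in the file names is replaced literally by the ID.

```python
from typing import List
from urllib.parse import parse_qs, urlparse

# Hosts this sketch recognizes as YouTube watch pages (an assumption).
WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> str:
    """Return the video ID from a watch URL or a youtu.be short link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/")
    elif host in WATCH_HOSTS:
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"Could not find a video ID in: {url}")
    return video_id


def expected_outputs(video_id: str) -> List[str]:
    """List the per-video file names described in this README."""
    return [
        f"{video_id}_analysis.md",
        f"{video_id}_transcript.txt",
        f"{video_id}_data.json",
    ]
```

For example, `expected_outputs(extract_video_id(url))` gives the names to look for in the output directory after a run.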
```bash
python main.py "https://www.youtube.com/watch?v=VIDEO_ID" --transcript-only
```

Use cases:
- Manual analysis with ChatGPT or Claude
- Content preparation for other AI tools
- Quick transcript extraction for note-taking
- Research without API costs
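
Transcript-only mode can be requested with the `--transcript-only` flag or with `transcript_only_mode` in the `features` block of the config. The sketch below shows one way the two might be reconciled. It is illustrative only: the function `resolve_features` is hypothetical, and the precedence rules (the flag wins, and a missing API key disables AI analysis) are assumptions rather than documented behavior.

```python
from typing import Any, Dict

# Defaults mirror the values shown in the configuration template.
DEFAULT_FEATURES = {
    "generate_ai_summary": True,
    "export_full_transcript": True,
    "transcript_only_mode": False,
}


def resolve_features(config: Dict[str, Any],
                     cli_transcript_only: bool = False) -> Dict[str, bool]:
    """Merge config feature flags with the command-line switch."""
    features = {**DEFAULT_FEATURES, **config.get("features", {})}
    if cli_transcript_only:
        features["transcript_only_mode"] = True
    # Assumption: transcript-only mode or a missing API key switches AI off.
    if features["transcript_only_mode"] or not config.get("openai_api_key"):
        features["generate_ai_summary"] = False
    return features
```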
```bash
python main.py URL1 URL2 URL3 URL4 URL5 --batch-size 3
```

Features:
- Processes videos in batches to avoid rate limits
- Progress tracking and error handling
- Comprehensive reporting across all videos
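
To make the batching idea concrete, here is a minimal sketch of splitting URLs into groups of `--batch-size` and collecting per-video errors without stopping the run. The names `batched` and `process_in_batches`, the `process_one` callback, and the optional pause between batches are all hypothetical; the tool's actual implementation may differ.

```python
import time
from typing import Callable, Dict, Iterator, List, Sequence


def batched(urls: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size URLs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        yield list(urls[start:start + batch_size])


def process_in_batches(urls: Sequence[str],
                       batch_size: int,
                       process_one: Callable[[str], None],
                       pause_seconds: float = 0.0) -> Dict[str, list]:
    """Run process_one on every URL, recording successes and failures."""
    report: Dict[str, list] = {"succeeded": [], "failed": []}
    for index, batch in enumerate(batched(urls, batch_size)):
        # Pause only between batches, never before the first one.
        if index > 0 and pause_seconds > 0:
            time.sleep(pause_seconds)
        for url in batch:
            try:
                process_one(url)
                report["succeeded"].append(url)
            except Exception as exc:  # keep going if one video fails
                report["failed"].append((url, str(exc)))
    return report
```

With five URLs and a batch size of 3, `batched` yields one group of three followed by one group of two.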
```bash
python main.py URL1 --config research_config.json --language es --output-dir ./my_research
```

- English (en): Default language
- Spanish (es): Full translation available
- Extensible: Easy to add new languages via JSON files
- Create `locales/[language_code].json`
- Copy structure from `locales/en.json`
- Translate all strings
- Update configuration to use new language
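
The steps above imply that each locale is a JSON file with the same structure as `locales/en.json`. As a rough sketch of how such files could be loaded with English as a fallback for missing strings, consider the following. The function `load_locale` and the keys `title` and `key_points` are hypothetical, and the sketch assumes a flat key-to-string mapping, which may not match the project's real locale layout.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def load_locale(language: str, locales_dir: Path,
                default: str = "en") -> Dict[str, str]:
    """Load the default strings, then overlay the requested language."""
    default_path = locales_dir / f"{default}.json"
    strings = json.loads(default_path.read_text(encoding="utf-8"))
    if language != default:
        path = locales_dir / f"{language}.json"
        if path.exists():
            # Untranslated keys keep their default-language text.
            strings.update(json.loads(path.read_text(encoding="utf-8")))
    return strings


if __name__ == "__main__":
    # Demo with throwaway files; the keys are made up for illustration.
    demo_dir = Path(tempfile.mkdtemp())
    (demo_dir / "en.json").write_text(
        json.dumps({"title": "Summary", "key_points": "Key Points"}),
        encoding="utf-8")
    (demo_dir / "es.json").write_text(
        json.dumps({"title": "Resumen"}), encoding="utf-8")
    print(load_locale("es", demo_dir))
```

Falling back to the default language means a partially translated file still produces complete output.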
The tool uses OpenAI's GPT models for intelligent content analysis:
- Default Model: `gpt-4o-mini` (cost-effective)
- Alternative Models: `gpt-4`, `gpt-3.5-turbo`
- Configurable: Easily switch models in config
Uses youtube-transcript-api for reliable transcript extraction:
- Multiple language support
- Automatic fallback to available languages
- Error handling for unavailable transcripts
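
The language fallback and timestamped output described above can be illustrated without calling any external service. The sketch below is an assumption-laden example, not the tool's code: `pick_language`, `format_timestamp`, and `render_transcript` are hypothetical helpers, and each transcript segment is assumed to be a dict with `text` and `start` (seconds) keys.

```python
from typing import Dict, List, Optional, Sequence


def pick_language(preferred: Sequence[str],
                  available: Sequence[str]) -> Optional[str]:
    """Return the first preferred language that exists, else any available one."""
    for code in preferred:
        if code in available:
            return code
    return available[0] if available else None


def format_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS, or H:MM:SS for videos over an hour."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours:d}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def render_transcript(segments: List[Dict],
                      with_timestamps: bool = True) -> str:
    """Join transcript segments into plain text, one segment per line."""
    lines = []
    for segment in segments:
        text = segment["text"].strip()
        if with_timestamps:
            lines.append(f"[{format_timestamp(segment['start'])}] {text}")
        else:
            lines.append(text)
    return "\n".join(lines)
```

When no transcript is available at all, `pick_language` returns `None`, which a caller could treat as the "unavailable transcript" case.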
```markdown
# 📹 Video Title

**URL:** https://www.youtube.com/watch?v=VIDEO_ID
**Language:** en
**Analysis Date:** 2024-01-15 10:30

## 🎯 Executive Summary
AI-generated summary of the video content...

## Key Points
- Main insight 1
- Main insight 2
- Main insight 3

## 🛠️ Tools Mentioned
- Tool 1
- Tool 2

## Full Transcript
Complete transcript with timestamps...
```

- Literature Reviews: Process educational and research videos at scale
- Lecture Analysis: Extract insights from academic presentations
- Documentation: Create structured notes from any video content
- Market Research: Analyze industry trend videos and competitor content
- Training Analysis: Process corporate training and educational materials
- Content Strategy: Extract insights from marketing and business videos
- Course Analysis: Extract key points from online courses and tutorials
- Study Materials: Convert video lectures to structured study notes
- Knowledge Management: Process educational content for documentation
- Content Analysis: Analyze videos from any domain or topic
- Information Extraction: Pull key insights from interviews and presentations
- Cross-Domain Research: Process videos from technology, science, business, arts, etc.
We welcome contributions!
```bash
git clone https://github.com/your-username/youtube-content-analyzer.git
cd youtube-content-analyzer
pip install -r requirements.txt
```

This project is licensed under the MIT License.
- YouTube Transcript API: For reliable transcript extraction
- OpenAI: For powerful AI analysis capabilities
Made with ❤️ for researchers, educators, and content creators worldwide