🎬 YouTube Content Analyzer

A powerful, modular tool for automatically extracting, analyzing, and documenting YouTube video content using AI. Perfect for researchers, content creators, educators, and professionals who need to process and analyze video content from any domain at scale.

🌟 Key Features

🎯 Core Functionality

Transcript Extraction: Automatic extraction of video transcripts with timestamps
AI-Powered Analysis: Intelligent content summarization using OpenAI GPT models
Multi-language Support: UI and outputs in English and Spanish (extensible)
Flexible Output Formats: Markdown, JSON, and plain text exports
Batch Processing: Process multiple videos efficiently

🛠️ Advanced Features

Transcript-Only Mode: Extract transcripts without AI analysis for manual processing
Configurable AI Analysis: Toggle AI summarization on/off
Separate Transcript Files: Export clean transcripts for external tools (ChatGPT, Claude, etc.)
Modular Architecture: Clean, maintainable codebase with separate services
Comprehensive Documentation: Structured analysis with key points, tools, concepts, and more

📊 Output Options

Full Analysis Reports: Complete markdown documents with AI insights
Raw Transcripts: Clean text files with optional timestamps
JSON Data Exports: Structured data for further processing
Master Index: Overview of all processed videos with statistics
Multi-format Support: Choose between markdown, JSON, or both

🚀 Quick Start

Prerequisites

Python 3.8 or higher
OpenAI API key (optional, for AI analysis)
Internet connection for YouTube access

Installation

Clone the repository:

git clone https://github.com/your-username/youtube-content-analyzer.git
cd youtube-content-analyzer

Install dependencies:

pip install -r requirements.txt

Configure the tool:

cp config/config_template.json config.json

Edit configuration (add your OpenAI API key):

{
  "openai_api_key": "your-openai-api-key-here",
  "features": {
    "generate_ai_summary": true,
    "export_full_transcript": true,
    "transcript_only_mode": false
  },
  "output": {
    "language": "en",
    "formats": ["markdown", "json"],
    "separate_transcript_file": true
  }
}

Basic Usage

Process a single video with AI analysis:

python main.py "https://www.youtube.com/watch?v=VIDEO_ID"

Extract transcript only (no AI analysis):

python main.py "https://www.youtube.com/watch?v=VIDEO_ID" --transcript-only

Process multiple videos:

python main.py URL1 URL2 URL3 --batch-size 5

Custom configuration:

python main.py URL1 --config my_config.json --language es

📁 Project Structure

youtube-content-analyzer/
├── src/                          # Source code
│   ├── core/                     # Core application logic
│   │   └── youtube_research_tool.py
│   ├── services/                 # Service modules
│   │   ├── transcript_service.py # YouTube transcript extraction
│   │   ├── openai_service.py    # AI analysis service
│   │   └── document_service.py  # Document generation
│   └── utils/                    # Utilities
│       └── i18n.py              # Internationalization
├── config/                       # Configuration files
│   └── config_template.json     # Configuration template
├── locales/                      # Language files
│   ├── en.json                  # English translations
│   └── es.json                  # Spanish translations
├── docs/                         # Documentation
├── examples/                     # Usage examples
├── tests/                        # Unit tests
├── main.py                       # CLI entry point
├── requirements.txt              # Python dependencies
└── README.md                     # This file

⚙️ Configuration Options

Core Settings

{
  "openai_api_key": "your-key-here",
  "openai_model": "gpt-4o-mini",
  "output_directory": "./research_output",
  "languages": ["en", "es"],
  "focus_topics": [
    "Technology", "Programming", "AI & Machine Learning",
    "Business", "Education", "Science"
  ]
}

Feature Toggles

{
  "features": {
    "generate_ai_summary": true,      // Enable/disable AI analysis
    "export_full_transcript": true,   // Include full transcript in output
    "transcript_only_mode": false,    // Only extract transcripts
    "include_timestamp_transcript": true
  }
}

Output Configuration

{
  "output": {
    "language": "en",                 // UI language (en/es)
    "formats": ["markdown", "json"],  // Output formats
    "separate_transcript_file": true  // Create separate .txt files
  }
}

📖 Usage Examples

Example 1: Research Mode (Full Analysis)

python main.py "https://www.youtube.com/watch?v=VIDEO_ID"

Output:

VIDEO_ID_analysis.md - Full analysis with AI insights
VIDEO_ID_transcript.txt - Clean transcript file
VIDEO_ID_data.json - Structured data export
INDEX.md - Master index with summary
results.json - Processing results

Example 2: Transcript-Only Mode

python main.py "https://www.youtube.com/watch?v=VIDEO_ID" --transcript-only

Use cases:

Manual analysis with ChatGPT or Claude
Content preparation for other AI tools
Quick transcript extraction for note-taking
Research without API costs

Example 3: Batch Processing

python main.py URL1 URL2 URL3 URL4 URL5 --batch-size 3

Features:

Processes videos in batches to avoid rate limits
Progress tracking and error handling
Comprehensive reporting across all videos

Example 4: Custom Configuration

python main.py URL1 --config research_config.json --language es --output-dir ./my_research

🌐 Multi-language Support

Supported Languages

English (en): Default language
Spanish (es): Full translation available
Extensible: Easy to add new languages via JSON files

Adding New Languages

Create locales/[language_code].json
Copy structure from locales/en.json
Translate all strings
Update configuration to use new language

🔧 API Integration

OpenAI Integration

The tool uses OpenAI's GPT models for intelligent content analysis:

Default Model: gpt-4o-mini (cost-effective)
Alternative Models: gpt-4, gpt-3.5-turbo
Configurable: Easily switch models in config

Transcript Extraction

Uses youtube-transcript-api for reliable transcript extraction:

Multiple language support
Automatic fallback to available languages
Error handling for unavailable transcripts

📊 Output Formats

Markdown Analysis Report

# 📹 Video Title

**URL:** https://www.youtube.com/watch?v=VIDEO_ID
**Language:** en
**Analysis Date:** 2024-01-15 10:30

## 🎯 Executive Summary
AI-generated summary of the video content...

## 🚀 Key Points
- Main insight 1
- Main insight 2
- Main insight 3

## 🛠️ Tools Mentioned
- Tool 1
- Tool 2

## 📝 Full Transcript
Complete transcript with timestamps...

🔍 Use Cases

📚 Research & Academia

Literature Reviews: Process educational and research videos at scale
Lecture Analysis: Extract insights from academic presentations
Documentation: Create structured notes from any video content

💼 Business & Professional

Market Research: Analyze industry trend videos and competitor content
Training Analysis: Process corporate training and educational materials
Content Strategy: Extract insights from marketing and business videos

🎓 Education & Learning

Course Analysis: Extract key points from online courses and tutorials
Study Materials: Convert video lectures to structured study notes
Knowledge Management: Process educational content for documentation

🔬 General Research

Content Analysis: Analyze videos from any domain or topic
Information Extraction: Pull key insights from interviews and presentations
Cross-Domain Research: Process videos from technology, science, business, arts, etc.

🤝 Contributing

We welcome contributions!

Development Setup

git clone https://github.com/your-username/youtube-content-analyzer.git
cd youtube-content-analyzer
pip install -r requirements.txt

📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments

YouTube Transcript API: For reliable transcript extraction
OpenAI: For powerful AI analysis capabilities

Made with ❤️ for researchers, educators, and content creators worldwide

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.claude		.claude
config		config
locales		locales
src		src
.gitignore		.gitignore
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt
setup_example.py		setup_example.py

Uh oh!

Uh oh!

nelgonzalez1/-youtube-transcript-analyzer

Folders and files

Latest commit

History

Repository files navigation

🎬 YouTube Content Analyzer

🌟 Key Features

🎯 Core Functionality

🛠️ Advanced Features

📊 Output Options

🚀 Quick Start

Prerequisites

Installation

Basic Usage

📁 Project Structure

⚙️ Configuration Options

Core Settings

Feature Toggles

Output Configuration

📖 Usage Examples

Example 1: Research Mode (Full Analysis)

Example 2: Transcript-Only Mode

Example 3: Batch Processing

Example 4: Custom Configuration

🌐 Multi-language Support

Supported Languages

Adding New Languages

🔧 API Integration

OpenAI Integration

Transcript Extraction

📊 Output Formats

Markdown Analysis Report

🔍 Use Cases

📚 Research & Academia

💼 Business & Professional

🎓 Education & Learning

🔬 General Research

🤝 Contributing

Development Setup

📄 License

🙏 Acknowledgments

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages