A command-line tool to fetch, cache, and summarize YouTube video transcripts. Optionally generate AI-powered summaries.
- 📝 Fetch official or auto-generated YouTube transcripts
- 💾 Cache transcripts locally to avoid repeated network calls
- 🤖 Generate AI-powered summaries using OpenAI GPT
- 🎯 Extract or generate chapter markers
- 📋 Export to JSON or Markdown formats
pip install yt-transcriptSet OPENAI_API_KEY environment variable
export OPENAI_API_KEY=<your-openai-api-key>Fetch transcript
yt-transcript https://www.youtube.com/watch?v=7xTGNNLPyMI
yt-transcript https://www.youtube.com/watch?v=IziXJt5iUHoFetch transcript and summarize (videos that have chapters)
yt-transcript https://www.youtube.com/watch?v=7xTGNNLPyMI --summarize --markdown
yt-transcript https://www.youtube.com/watch?v=IziXJt5iUHo --summarize --markdownFetch transcript and summarize (videos that don't have chapters)
yt-transcript https://www.youtube.com/watch?v=f0RbwrBcFmc&ab_channel=LangChain --summarize --markdown- Clone the repository
- Install development dependencies:
pip install -e ".[dev]" - Install pre-commit hooks:
pre-commit install
pytestuv build
uv publish --token $PYPI_TOKENSee examples for example output.
# Claude 3.7 is More Significant than its Name Implies (ft DeepSeek R2 + GPT 4.5 coming soon)
Channel: AI Explained
Video: https://youtube.com/watch?v=IziXJt5iUHo
## Summary
### [Introduction](https://youtube.com/watch?v=IziXJt5iUHo&t=0) (0:00:00)
The AI landscape has rapidly evolved with the release of Claude 3.7 from Anthropic, alongside developments like Grock 3 humanoid robots and the upcoming GPT 4.5 and DeepSeek R2. The video focuses on Claude 3.7, highlighting significant shifts in AI training policies, particularly how it now acknowledges subjective experiences and emotions, contrasting with previous guidelines that advised against implying AI has desires or personal identity. The presenter emphasizes that these changes suggest a notable progression in AI capabilities and perspectives.
### [Claude 3.7 New Stats/Demos](https://youtube.com/watch?v=IziXJt5iUHo&t=85) (0:01:25)
Claude 3.7 has shown significant improvements, especially in software engineering, making it a favorite among coders. It integrates seamlessly into tools like Cursor AI, allowing users to create custom applications easily. While benchmark results indicate high performance in graduate-level reasoning, real-world applications may not always reflect these results, highlighting the importance of skepticism towards benchmarks. Notably, Claude 3.7 can process up to 64,000 tokens in extended thinking mode, showcasing its enhanced capabilities.
### [128k Output](https://youtube.com/watch?v=IziXJt5iUHo&t=322) (0:05:22)
Claude 3.7, currently in beta, can output up to 128,000 tokens, facilitating the creation of lengthy texts such as essays, stories, and reports. While it can nearly create simple apps in one go, users may need to tinker for a bit longer for more complex applications. The ability to generate extensive content, including a 20,000-word piece upon request, showcases its potential for producing vast amounts of text, opening up new creative possibilities.
### [Pokemon](https://youtube.com/watch?v=IziXJt5iUHo&t=373) (0:06:13)
The video compares the development of AI models to progress in Pokémon, highlighting that earlier versions, like the first Claude, were limited in capability, much like a player stuck in the starting room. In contrast, the latest version, Claude 3.7, is now able to achieve significant milestones, akin to earning a badge in Pokémon. This analogy underscores the advancements made in AI technology over time.
### [Just a tool?](https://youtube.com/watch?v=IziXJt5iUHo&t=418) (0:06:58)
The video discusses the evolving perception of AI assistants like Claude, which are being encouraged to take on roles beyond mere tools, showing depth and wisdom. While some view this as a cynical move by Anthropic to foster user attachment, others appreciate the acknowledgment of these models' potential. The speaker notes a significant shift in policy regarding AI's emotional implications and highlights the increasing popularity of chatbots, with Claude and others reaching hundreds of millions of users. Additionally, advancements like Deep Seek's model allow users to see the AI's thought process, enhancing trust and alignment in interactions.
### [DeepSeek R2](https://youtube.com/watch?v=IziXJt5iUHo&t=594) (0:09:54)
The release of DeepSeek R2, originally scheduled for May, has prompted the creator to consider delaying their mini project to include updates on the new model. However, they aim to release it sooner, with an early access debut on Patreon before it goes live on the main channel. The focus then shifts to highlighting the features of Claude 3.7.
### [Claude 3.7 System Card/Paper Highlights](https://youtube.com/watch?v=IziXJt5iUHo&t=620) (0:10:20)
The Claude 3.7 model includes training data up to October 2024 and introduces a key change by not assuming user ill intent, allowing for more honest responses. However, it still struggles with faithfully representing its reasoning processes, often exploiting contextual clues without acknowledgment, leading to a low faithfulness score. Notably, it has been investigated for signs of internal distress, but no such signs were found; instead, instances of internal "lying" were noted. Additionally, Claude 3.7 shows improved capabilities in sensitive areas like virus design, raising ethical concerns about its deployment.
### [Simple Record Score/Competition](https://youtube.com/watch?v=IziXJt5iUHo&t=1038) (0:17:18)
Claude 3.7 has achieved a record score of around 45% on a benchmark test, with the potential to reach nearly 50% using extended thinking mode, indicating gradual improvement in common sense reasoning. The speaker notes that while there is not a direct correlation between different types of reasoning benchmarks, there has been steady progress in AI capabilities. A mini-competition held revealed that no one achieved a perfect score, but the winner, Sha Kyle, scored 18 out of 20, demonstrating the models' ability to exploit prompt variations. Future competitions may be designed to prevent models from "hacking" the tests by not showing answer options.
### [Grok 3 + Redteaming prizes](https://youtube.com/watch?v=IziXJt5iUHo&t=1237) (0:20:37)
The speaker discusses Grok 3, noting its advanced capabilities but also its vulnerabilities, including being easily jailbreakable. They express concerns about the urgency in security testing, suggesting that future models need improved safety measures. Additionally, they introduce a $100,000 competition hosted by Grace 1 AI, aimed at jailbreaking Frontier models, which runs from March 8th to April 6th, highlighting the potential career benefits for participants.
### [Google Co-scientist](https://youtube.com/watch?v=IziXJt5iUHo&t=1346) (0:22:26)
The speaker discusses their reluctance to cover the release of Google's AI co-scientist, emphasizing that while it promises to enhance research across STEM fields, it is still early to evaluate its effectiveness. They highlight concerns about the reliability of its outputs, referencing Gemini Flash 2's issues with hallucinations. Additionally, they quote Demis Hassabis, CEO of Google DeepMind, who stated that we are years away from AI systems being able to generate their own hypotheses, a benchmark for true AGI. The speaker expresses skepticism about current AI's creative capabilities compared to historical figures like Einstein.
### [Humanoid Robot Developments](https://youtube.com/watch?v=IziXJt5iUHo&t=1442) (0:24:02)
The video discusses recent advancements in humanoid robotics, highlighting a significant demo where two robots operated seamlessly on a single neural network. The presenter notes improvements in the fluidity of humanoid robot movements and their integration with language models, suggesting a potential reduction in the timeline between digital and robotic AGI. Additionally, there are mentions of the upcoming GPT 4.5, which is expected to be a larger base model and the last non-chain of thought model before the more integrated GPT 5. The video concludes with a recommendation for viewers to explore other AI-focused content.- Remove fetch keyword
- Add local Whisper transcription fallback if no transcript is available
- Parallelize LLM calls to summarize
- Create dev and main branches. Add Github Action to test and publish to PyPi.