YouTube Playlist Processor using Gemini API

✅ Added Language Support, now the output file is in the language of user's input.(might not be as good as english, test it yourself!)
✅ Added single video url support, no need to put it in a playlist.

This Python application extracts transcripts from YouTube playlists and refines them using the Google Gemini API(which is free). It takes a YouTube playlist URL as input, extracts transcripts for each video, and then uses Gemini to reformat and improve the readability of the combined transcript. The output is saved as a text file.

So you can have a neatly formatted book out of a YouTube playlist! I persoanlly use it to convert large YouTube playlists containing dozens of long videos into a very large organized markdown file to give it as input to NotebookLM as one source.

Batch processing of entire playlists
Refine transcripts using Google Gemini API for improved formatting and readability.
User-friendly PyQt5 graphical interface.
Selectable Gemini models.
Output to markdown file.

Features

🎥 Automatic transcript extraction from YouTube playlists
🧠 AI-powered text refinement using Gemini models
📁 Configurable output file paths
⏳ Progress tracking for both extraction and refinement
📄 Output to formatted markdown file.

Requirements

Python 3.9+
Google Gemini API key
YouTube playlist URL

Installation

pip install -r requirements.txt

How does it work?

First, the transcript of every video in the playlist is fetched.
since gemini api doesnt have unlimited context window for input and output, the text for each video gets divided into chunks(right now, chunk size is set to 3000 after testing, but it can be changed)
Each text chunk is then sent to the Gemini API, along with a context prompt that includes the previously refined text. This helps maintain consistency and coherence across chunks.
The refined output from Gemini for each chunk is appended to the final output file.
This process is repeated for every video in the playlist, resulting in a single, refined transcript output file for the entire playlist.

Usage

Get a Gemini API Key: You need a Google Gemini API key. Obtain one from Google AI Studio.
Run the Application:
```
python main.py
```
In the GUI:
- Enter the YouTube Playlist URL in the "YouTube Playlist URL" field.
- Choose output file locations for the transcript and Gemini refined text using the "Browse" buttons.
- Enter your Gemini API key in the "Gemini API Key" field.
- Click "Start Processing".
- You can select a Gemini model.
- Wait for the processing to complete. Progress will be shown in the progress bar and status display.
- The output files will be saved to the locations you specified.

YouTube playlist used for example files : https://www.youtube.com/playlist?list=PLmHVyfmcRKyx1KSoobwukzf1Nf-Y97Rw0

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
Images		Images
.gitattributes		.gitattributes
Example_Output.md		Example_Output.md
Example_Transcript.txt		Example_Transcript.txt
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

YouTube Playlist Processor using Gemini API

Features

Requirements

Installation

How does it work?

Usage

About

Releases

Packages

Languages

eltociear/Youtube-playlist-to-formatted-text

Folders and files

Latest commit

History

Repository files navigation

YouTube Playlist Processor using Gemini API

Features

Requirements

Installation

How does it work?

Usage

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages