SubToArticle is a tool that converts subtitles into articles. It allows you to easily transform YouTube video subtitles into well-structured articles in just a few steps.
- Fetch video IDs from YouTube playlists
- Download subtitles from YouTube videos or transcribe audio using
Whisper
. - Generate articles from video subtitles using
ChatGPT
. - Flexible operation modes for full process or individual steps
- Python 3.7 or higher
- Required Python packages:
openai
yt-dlp
CopyCraftAPI
(install from GitHub)
ffmpeg
sqlite3
-
Pull a docker image from docker hub:
docker pull benjiminhsu/subtoarticle:1.0.6
-
Run a docker image as a container
docker run -it --env OPENAI_API_KEY=<your_api_key> subtoarticle:1.0.6 bash
Generate Article for a Single Video
python main.py --video_id <VIDEO_ID>
--subtitle_source: Source of Subtitles for Article Generation
- mp3 (default): Subtitles are generated from the MP3 file using Whisper, providing more stable content.
- subtitle: Uses subtitles provided by the video, which could be either manually uploaded or automatically generated by the system. Not all videos have subtitles, and the quality of auto-generated subtitles may vary. Suitable for generating a large number of articles.
- both: Prioritizes the subtitle mode, falling back to mp3 mode if no subtitles are available.
python main.py --video_id <VIDEO_ID> --subtitle_source subtitle
--mode: Main Mode of Execution for main.py
- full_process (default): Executes the entire process including fetching video information, downloading subtitles, and generating articles.
- fetch_video_id: Fetches video IDs, used for batch downloading information.
python main.py --mode fetch_video_id --download_mode playlist --channel_url @BenHsu501
- download_subtitle: Downloads subtitles. If using --download_mode playlist, it is recommended to use --subtitle_source subtitle to avoid long Whisper processing times.
python main.py --mode download_subtitle --download_mode playlist `--subtitle_source subtitle --channel_url @BenHsu501
- generate_article: Generates an article.
python main.py --mode generate_article --download_mode video_id --video_id <VIDEO_ID> --model gpt-4o