YouTube automatic captioning is a neat feature: by generating precisely timed transcripts, it provides a condensed, static resource for exploiting the content of videos. For instance, with audiobook or audio theatre videos, it can help extract the script of a fictional work without having to run (and pay for) state-of-the-art speech-to-text (STT) software.
Captions for a given YouTube video can be downloaded with youtube-dl as WebVTT files:
youtube-dl --write-auto-sub --sub-lang fr --skip-download [URL]
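The same download can be scripted from Python. The following is a minimal sketch, not part of this module: the helper names `build_caption_command` and `download_captions` are hypothetical, and it assumes `youtube-dl` (or a compatible fork) is reachable under the given executable name.

```python
import subprocess


def build_caption_command(url, lang="fr", executable="youtube-dl"):
    """Return the argument list that downloads automatic captions only.

    Hypothetical helper: it simply mirrors the command line shown above.
    """
    return [
        executable,
        "--write-auto-sub",  # request the automatically generated captions
        "--sub-lang", lang,  # caption language code
        "--skip-download",   # do not fetch the video itself
        url,
    ]


def download_captions(url, lang="fr", executable="youtube-dl"):
    """Run the command; raises CalledProcessError if the tool exits with an error."""
    subprocess.run(build_caption_command(url, lang, executable), check=True)
```

Passing the arguments as a list (rather than a single shell string) avoids quoting issues with URLs that contain `&` or `?`.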
Yet there is a gap between the raw VTT file and the actual transcript: sentences are often duplicated, words can be duplicated too, timecodes are fuzzy around words, etc. So the first thing this module does is parse those VTT files and generate clean transcripts. It then provides tools for exploiting those transcripts in fun ways, such as creating a montage of a given word or re-creating a given script from video extracts:
Word montage | Script re-creation |
---|---|
![]() | ![]() |
Source: Monsieur Phi | Source: Thinkerview |
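To give an idea of what the VTT cleaning step involves, here is a simplified sketch. It is not this module's actual parser: the function name `clean_vtt` is hypothetical, and it assumes the file only contains a header followed by timed cues, with inline tags such as `<c>` or `<00:00:01.500>` inside the text. It strips those tags and drops lines that merely repeat the previous caption line.

```python
import re

# A cue timing line, e.g. "00:00:00.000 --> 00:00:02.000 align:start position:0%"
TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})"
)
# Inline tags such as <c>, </c> or word-level timestamps like <00:00:01.500>
TAG = re.compile(r"<[^>]+>")


def clean_vtt(text):
    """Return a list of (start, end, line) tuples without consecutive duplicates."""
    cues = []
    timing = None     # (start, end) of the cue currently being read
    last_line = None  # last caption line that was kept
    for raw in text.splitlines():
        match = TIMING.search(raw)
        if match:
            timing = match.groups()
            continue
        line = TAG.sub("", raw).strip()
        if timing is None or not line:
            continue  # header lines or blank separators
        if line == last_line:
            continue  # rolling captions repeat the previous line
        cues.append((timing[0], timing[1], line))
        last_line = line
    return cues
```

This only handles line-level repetition; deduplicating individual words and tightening the timecodes around them would need the word-level timestamps that the sketch discards.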
You need a working installation of Python, youtube-dl (any fork should work) and FFmpeg, preferably all available in your `PATH`.
Simply clone this repository:
git clone https://github.com/ychalier/ytt.git
And install the requirements (actually, only `tqdm`):
cd ytt
pip install -r requirements.txt
python ytt.py [-h] -i INPUT [-ft FILTER] [-fd FIND] [-o OUTPUT] [-x EXTRACT]
[-yd YOUTUBE_DL] [-fm FFMPEG] [-td TEMPDIR] [-lg LANG]
[-pp PADDING_PREV] [-pn PADDING_NEXT] [-la LOOKAHEAD] [-ff]
Use `-h` or `--help` to get details. For example, here's how to get the first word montage:
python ytt.py -i https://www.youtube.com/watch?v=GuTgfnkILGs -ft boule -x . -pp 1 -pn 1
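For intuition, cutting one occurrence of a word out of a video boils down to an FFmpeg call around that word's timecodes. The sketch below is an illustration only, not how `ytt.py` is implemented: `build_cut_command` is a hypothetical helper, and expressing the padding in seconds is an assumption made here (check `--help` for the actual meaning of `-pp` and `-pn`).

```python
def build_cut_command(source, start, end, output,
                      pad_before=0.0, pad_after=0.0, executable="ffmpeg"):
    """Return an FFmpeg argument list cutting [start - pad_before, end + pad_after].

    Times are in seconds. Padding in seconds is an assumption of this sketch.
    """
    clip_start = max(0.0, start - pad_before)  # never seek before the beginning
    duration = (end + pad_after) - clip_start
    return [
        executable,
        "-y",                        # overwrite the output file if it exists
        "-ss", f"{clip_start:.3f}",  # seek to the clip start in the input
        "-i", source,
        "-t", f"{duration:.3f}",     # keep only this many seconds
        output,
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)`; a montage would then concatenate the clips produced for each occurrence.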