The set of steps implemented:
- text exctraction (transcription) from the youtube video with the latest ASR model: 'openai/whisper-large-v3'
- text topic modeling with the BERTopic and HDBSCAN clustering
- video scene segmentation with PySceneDetect
Use Google Colab Notebook with GPU.
Video timeline visualization with website text: