revsystem/llamaindex-with-whisper

llamaindex-with-whisper

Extract audio from a video, transcribe it with Whisper, index the transcript with LlamaIndex, and summarize it with GPT-4.

Usage

Extract audio from the video file and transcribe

python3 ./transcriptin.py -f ./sample.mp4

The transcribed documents are written to ./data/documents
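
As a rough illustration of this step, the sketch below extracts the audio track with pydub and sends it to the Whisper API through the openai 0.27.x interface. It is not the script's actual code. The function names, the 10-minute chunk length, and the `.txt` output name are assumptions made for the example. Splitting into chunks is included because the Whisper API limits uploads to 25 MB; whether the script itself splits audio is not stated here.

```python
import tempfile
from pathlib import Path


def chunk_ranges(total_ms: int, chunk_ms: int) -> list[tuple[int, int]]:
    """Return (start, end) millisecond ranges that together cover total_ms."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return [(start, min(start + chunk_ms, total_ms))
            for start in range(0, total_ms, chunk_ms)]


def transcribe_video(video_path: str,
                     out_dir: str = "./data/documents",
                     chunk_ms: int = 10 * 60 * 1000) -> Path:
    """Hypothetical helper: extract audio, transcribe it, save the text."""
    # Third-party imports stay local so chunk_ranges works without them.
    import openai                   # reads OPENAI_API_KEY from the environment
    from pydub import AudioSegment  # requires ffmpeg to decode video files

    audio = AudioSegment.from_file(video_path)  # decodes the audio track
    texts = []
    with tempfile.TemporaryDirectory() as tmp:
        # len(audio) is the duration in milliseconds.
        for i, (start, end) in enumerate(chunk_ranges(len(audio), chunk_ms)):
            chunk_path = Path(tmp) / f"chunk_{i:03d}.mp3"
            audio[start:end].export(str(chunk_path), format="mp3")
            with open(chunk_path, "rb") as f:
                result = openai.Audio.transcribe("whisper-1", f)
            texts.append(result["text"])

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    transcript_path = out / (Path(video_path).stem + ".txt")  # assumed naming
    transcript_path.write_text("\n".join(texts), encoding="utf-8")
    return transcript_path
```

Calling `transcribe_video("./sample.mp4")` would need ffmpeg installed and a valid API key, and each chunk is billed as a separate Whisper request.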

Vectorize the transcribed data

python3 ./transcriptin.py -i

The index data is written to ./data/indexes/index.json
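
For orientation, a minimal sketch of building and persisting a vector index with the llama-index 0.8.x API follows. It is an assumption about how such a step can be written, not a copy of the script: `list_transcripts` and `build_index` are hypothetical names, the `.txt` extension is assumed, and `storage_context.persist` writes its own set of JSON files, so the layout may differ from the single index.json mentioned above.

```python
from pathlib import Path


def list_transcripts(doc_dir: str = "./data/documents") -> list[Path]:
    """Return transcript files (assumed to be *.txt) in a stable order."""
    return sorted(p for p in Path(doc_dir).glob("*.txt") if p.is_file())


def build_index(doc_dir: str = "./data/documents",
                index_dir: str = "./data/indexes"):
    """Hypothetical helper: embed the transcripts and persist the index."""
    # Local import so list_transcripts can be used without llama-index.
    from llama_index import SimpleDirectoryReader, VectorStoreIndex

    if not list_transcripts(doc_dir):
        raise FileNotFoundError(f"no transcripts found in {doc_dir}")

    documents = SimpleDirectoryReader(doc_dir).load_data()
    # Embeddings are computed here; by default this calls the OpenAI API.
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=index_dir)
    return index
```

Checking for transcripts first avoids spending an embedding request on an empty directory.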

Execute query

python3 ./transcriptin.py
Input query: <INPUT_YOUR_QUERY_ABOUT_TRANSCRIPTED_TEXT>

Response

The answer is streamed, as in ChatGPT.

==========
Query:
<QUERY_YOU_INPUTED>
Answer:
<ANSWER_FROM_AI>
==========

node.node.id_='876f8bdb-xxxx-xxxx-xxxx-xxxxxxxxxxxx', node.score=0.8484xxxxxxxxxxxxxx
----------

Cosine Similarity:
0.84xxxxxxxxxxxxxx

Reference text:
<THE_PART_AI_REFERRED_TO>
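
The score shown with each referenced node is a cosine similarity, which here presumably compares the query embedding with the embedding of the referenced text. The sketch below shows how that measure is computed for two plain vectors; the function name is chosen for this example and is not taken from the script.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)
```

Values close to 1 mean the two vectors point in nearly the same direction, so a score around 0.84 indicates a fairly close match between the query and the referenced text.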

To exit the console, enter 'exit'.

Input query: exit
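
The console behaviour described above can be sketched as a simple read loop. This is an illustration only: `run_console` and its parameters are hypothetical, the `answer` callable stands in for the real query engine, and streaming is not modelled.

```python
def run_console(answer, read=input, write=print):
    """Prompt for queries until the user enters 'exit'."""
    while True:
        query = read("Input query: ").strip()
        if query == "exit":
            break
        if not query:
            continue  # ignore empty input and prompt again
        write("==========")
        write("Query:")
        write(query)
        write("Answer:")
        write(answer(query))
        write("==========")
```

Passing `read` and `write` as parameters keeps the loop easy to exercise without a terminal, for example `run_console(lambda q: "stub answer")` for a quick manual check.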

Setup

Recommended System Requirements

  • Python 3.10 or higher.

Set up the venv environment

To create and activate a venv environment:

python3 -m venv .venv
source .venv/bin/activate

To deactivate:

deactivate

Set up the Python environment

pip3 install --upgrade pip
pip3 install -r requirements.txt

The main libraries installed are as follows:

pip freeze | grep -e "openai" -e "pydub" -e "llama-index" -e "sentence_transformers" -e "tiktoken"

llama-index==0.8.12
openai==0.27.9
pydub==0.25.1
tiktoken==0.4.0

OpenAI API key requirement

Set your API key as an environment variable, or add it to a shell dotfile such as '.zshenv':

export OPENAI_API_KEY='YOUR_OPENAI_API_KEY'
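
Before running the script, it can help to confirm that the key is visible to Python. The check below is a small sketch using only the standard library; `require_api_key` is a hypothetical helper, not part of the repository.

```python
import os


def require_api_key(env=os.environ) -> str:
    """Return OPENAI_API_KEY, or raise if it is unset or empty."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it in your shell first"
        )
    return key
```

Calling `require_api_key()` at startup fails early with a clear message instead of at the first API request.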
