Youtuber Analysis is a Python-based tool that provides detailed analytics for any YouTube channel. It generates a summary of the channel and provides recommendations of the top 5 videos among the latest 50 uploads. Additionally, the app offers basic information like the channel's display picture, name, ID, total subscribers, total videos, total views, and the location of the channel.
-
Basic Channel Info:
- Channel name, profile picture, ID, total subscribers, total views, total videos, and location.
-
Channel Summary:
- Generates a text-based summary of the YouTube channel using a language model.
-
Top 5 Video Recommendations:
- Recommends the top 5 rated videos (based on views) from the latest 50 videos uploaded by the channel.
-
Link Validation:
- Supports both types of YouTube channel links:
https://www.youtube.com/channel/UCxxxxxxxxxxxxxxxxxxxxx
https://www.youtube.com/@xxxxxxxxxx
- Supports both types of YouTube channel links:
- Description: This file handles the Graphical User Interface (GUI) of the application. It takes the YouTube channel link as input and validates it. It has a button
Fetch Channel Info
to fetch the required data. The interface is divided into three sections:- Section 1: Displays basic information about the channel, including the profile picture.
- Section 2: Provides a detailed summary of the channel.
- Section 3: Recommends the top 5 videos from the channel based on views.
- Link Validation: The application checks the format of the input link to ensure it follows one of the two accepted formats for YouTube channel URLs.
- Description: This file interacts with the YouTube Data API to retrieve details about the channel. It gathers:
- Channel name, logo (profile picture), total videos, subscribers, views, channel ID, and other key metrics.
- It also fetches the top-rated videos from the latest 50 uploads.
- Description: This file generates a summary of the YouTube channel using a large language model (LLM) called Llama. The summary is generated using a prompt and is limited to 200 tokens.
- The LLM is powered by TheBloke/Llama-2-7B-Chat-GGML, which uses GGML quantization techniques to run efficiently on the CPU.
In this project, we explored the use of quantization to run large language models (LLMs) on a standard CPU. Quantization maps higher-precision values (like 32-bit floats) to lower-precision data types (such as 4-bit integers), significantly reducing memory and computational requirements without sacrificing too much accuracy.
We used CTransformers from the langchain_community.llms
library to download and run the model locally without needing external API calls. This approach enables developers to deploy sophisticated language models on everyday hardware.
Several techniques were explored, such as GPTQ, ExLLama, NF4, bitsandbytes, and GGML. For this project, we used GGML quantization, which allowed us to efficiently run Llama-2-7B on local hardware.
- The first time you run the app, loading the summary of a YouTube channel may take 30-45 seconds due to model initialization. However, with subsequent usage, text generation becomes faster as the model warms up.
-
Clone the repository:
git clone https://github.com/dheerajkallakuri/youtuberAnalysis.git cd youtuberAnalysis
-
Install the required dependencies:
pip install -r requirements.txt
-
Run the application:
python app.py
Note: On first use, generating the summary might take some time due to the loading of the Llama model. Subsequent runs should be faster.