DavidTalevski/youtube-comment-sentiment-classifier

YouTube Comment Sentiment Classifier

This project trains a machine learning model to classify YouTube comments into three categories: Positive, Negative, or Neutral. It includes scripts for training a sentiment analysis model from scratch and for predicting the sentiment of new, unseen comments.

A pre-trained model is included in the /models directory, so you can immediately run predictions without training it yourself.

Features

  • Data Processing: Cleans and preprocesses text data using techniques like stopword removal, lemmatization, and negation handling.
  • Model Training: Trains and tunes multiple classifiers (Logistic Regression, Naive Bayes) and an ensemble Voting Classifier.
  • Model Evaluation: Generates classification reports and confusion matrices to evaluate model performance.
  • Inference: Provides a simple command-line interface that predicts the sentiment of your own sentences using the best-performing trained model.
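As a rough illustration of the data-processing step, the sketch below shows stopword removal combined with a simple form of negation handling. It is not the project's actual code: the function name `preprocess`, the tiny stopword list, and the `not_` prefix convention are all assumptions made for this example, and lemmatization is left out to keep it dependency-free.

```python
import re
from typing import List

# Illustrative stopword list only; the project's real list is not shown in this README.
STOPWORDS = {"a", "an", "the", "is", "was", "this", "that", "it", "i", "to", "of", "and"}
# Words treated as negation cues (an assumption for this sketch).
NEGATIONS = {"not", "no", "never", "cannot"}


def preprocess(comment: str) -> List[str]:
    """Lowercase, tokenize, drop stopwords, and mark the word after a negation."""
    # Words made of letters, optionally with one apostrophe part (e.g. "don't").
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", comment.lower())
    cleaned = []
    negate_next = False
    for token in tokens:
        if token in NEGATIONS or token.endswith("n't"):
            # Remember the negation and attach it to the next kept word.
            negate_next = True
            continue
        if token in STOPWORDS:
            continue
        cleaned.append(f"not_{token}" if negate_next else token)
        negate_next = False
    return cleaned


print(preprocess("This video is not good"))
print(preprocess("I don't like the ending"))
```

Prefixing the negated word keeps "good" and "not_good" as separate features, which is one common way to stop a bag-of-words model from treating both comments as positive.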

Getting Started

1. Prerequisites

  • Python 3.8+
  • Git

2. Installation

First, clone the repository to your local machine:

git clone https://github.com/DavidTalevski/youtube-comment-sentiment-classifier
cd youtube-comment-sentiment-classifier

Next, install the required Python packages using requirements.txt:

pip install -r requirements.txt

3. Data (Optional - For Training)

The model included in this repository is already trained. However, if you wish to train the model yourself, you must download the dataset.

  1. Download the dataset from Kaggle: YouTube Comments Sentiment Dataset.
  2. Place the downloaded file into the /data directory.
  3. Ensure the file is named exactly youtube_comments.csv.
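Before training, it can help to confirm that the file is where the steps above place it. The following is a minimal sketch, assuming the script is run from the repository root; the helper name `read_header` is hypothetical and not part of the project, and no particular column names are assumed.

```python
import csv
from pathlib import Path
from typing import List

# Location described in the steps above, relative to the repository root.
DATASET_PATH = Path("data") / "youtube_comments.csv"


def read_header(path: Path = DATASET_PATH) -> List[str]:
    """Return the CSV header row, or raise FileNotFoundError with a hint."""
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected the dataset at {path}; download it from Kaggle first."
        )
    with path.open(newline="", encoding="utf-8") as handle:
        # An empty file yields an empty header list rather than an error.
        return next(csv.reader(handle), [])
```

Calling `read_header()` with no arguments checks the default location and returns whatever column names the downloaded file actually contains.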

Usage

There are two main ways to use this project.

A. Predict with a Custom Sentence (Using the Pre-trained Model)

To test the included model with your own sentences, run the interactive prediction script:

python src/predict_comment.py

The script will load the saved model from the /models folder and prompt you to enter a comment. Type your sentence and press Enter to see the predicted sentiment. To exit the script, type quit or exit.
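The prompt-and-exit behavior described above can be sketched as a small loop. This is only an illustration of the interaction pattern, not the contents of `predict_comment.py`: `run_prompt_loop`, `keyboard_lines`, and the keyword-based `toy_predict` stand-in are hypothetical names, and the real script uses the saved model rather than keyword matching.

```python
from typing import Callable, Iterable, Iterator, List

# Commands that end the session, as described in the text above.
EXIT_COMMANDS = {"quit", "exit"}


def toy_predict(comment: str) -> str:
    """Placeholder classifier; the real project loads a trained model instead."""
    words = set(comment.lower().split())
    if words & {"love", "great", "good"}:
        return "Positive"
    if words & {"hate", "terrible", "bad"}:
        return "Negative"
    return "Neutral"


def run_prompt_loop(predict: Callable[[str], str], lines: Iterable[str]) -> List[str]:
    """Classify each line until an exit command is seen; return the labels."""
    labels = []
    for line in lines:
        comment = line.strip()
        if comment.lower() in EXIT_COMMANDS:
            break
        if not comment:
            continue  # ignore empty input
        label = predict(comment)
        print(f"Predicted sentiment: {label}")
        labels.append(label)
    return labels


def keyboard_lines(prompt: str = "Enter a comment: ") -> Iterator[str]:
    """Yield lines typed by the user until end-of-input."""
    while True:
        try:
            yield input(prompt)
        except EOFError:
            return
```

Running `run_prompt_loop(toy_predict, keyboard_lines())` gives an interactive session; passing a plain list of strings instead makes the loop easy to test.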

B. Train Your Own Model

If you have downloaded the dataset as described above, you can train the model from scratch. This will overwrite the existing model files in the /models directory.

python src/youtube_comment_sentiment.py

The script will preprocess the data, train the classifiers, evaluate their performance, and save the best-performing model for future use.
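To make the evaluation step more concrete, here is a dependency-free sketch of how a three-class confusion matrix and per-class precision and recall are computed. The training script presumably relies on a library for its reports; the function names and the sample labels below are assumptions for illustration and are not results from this project.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

LABELS = ("Positive", "Negative", "Neutral")


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str] = LABELS) -> List[List[int]]:
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def precision_recall(matrix: List[List[int]],
                     labels: Sequence[str] = LABELS) -> Dict[str, Tuple[float, float]]:
    """Return {label: (precision, recall)} computed from the matrix."""
    report = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        report[label] = (precision, recall)
    return report


# Made-up labels, used only to exercise the functions.
y_true = ["Positive", "Positive", "Negative", "Negative", "Neutral", "Neutral"]
y_pred = ["Positive", "Negative", "Negative", "Negative", "Neutral", "Positive"]
matrix = confusion_matrix(y_true, y_pred)
for label, (precision, recall) in precision_recall(matrix).items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

Off-diagonal cells show which classes get confused with each other, which is often more informative than a single accuracy figure when choosing the model to save.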
