YouTube Comment Sentiment Classifier

This project trains a machine learning model to classify YouTube comments into three categories: Positive, Negative, or Neutral. It includes scripts for training a sentiment analysis model from scratch and for predicting the sentiment of new, unseen comments.

A pre-trained model is included in the /models directory, so you can immediately run predictions without training it yourself.

Features

Data Processing: Cleans and preprocesses text data using techniques like stopword removal, lemmatization, and negation handling.
Model Training: Trains and tunes multiple classifiers (Logistic Regression, Naive Bayes) and an ensemble Voting Classifier.
Model Evaluation: Generates classification reports and confusion matrices to evaluate model performance.
Inference: Provides a simple command-line interface to predict the sentiment of your own sentences using the best-performing trained model.
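
To make the preprocessing step more concrete, here is a minimal sketch of stopword removal combined with negation handling, using only the standard library. It is not the code from src/; the word lists, the `NOT_` prefix convention, and the function name `preprocess` are assumptions chosen for illustration, and lemmatization is left out.

```python
import re
from typing import List

# Hypothetical, deliberately tiny word lists for illustration only.
NEGATIONS = {"not", "no", "never", "cannot"}
STOPWORDS = {"the", "a", "an", "is", "this", "it", "was", "i", "do"}
CLAUSE_BREAKS = set(".,!?;")


def preprocess(comment: str) -> List[str]:
    """Lowercase, tokenize, mark negated words, and drop stopwords."""
    # Words (with apostrophes) or single clause-ending punctuation marks.
    tokens = re.findall(r"[a-z']+|[.,!?;]", comment.lower())
    result = []
    negate = False
    for token in tokens:
        if token in CLAUSE_BREAKS:
            # Negation scope ends at the next punctuation mark.
            negate = False
            continue
        if token in NEGATIONS or token.endswith("n't"):
            negate = True
            continue
        if token in STOPWORDS:
            continue
        # Prefix words inside a negation scope so "good" and "not good"
        # become different features for the classifier.
        result.append(f"NOT_{token}" if negate else token)
    return result


print(preprocess("This is not good, but I like it."))
```

Marking negated words this way lets a bag-of-words model distinguish "good" from "not good" without needing word order.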

Getting Started

1. Prerequisites

Python 3.8+
Git

2. Installation

First, clone the repository to your local machine:

git clone https://github.com/DavidTalevski/youtube-comment-sentiment-classifier
cd youtube-comment-sentiment-classifier

Next, install the required Python packages using requirements.txt:

pip install -r requirements.txt

3. Data (Optional - For Training)

The model included in this repository is already trained. However, if you wish to train the model yourself, you must download the dataset.

Download the dataset from Kaggle: YouTube Comments Sentiment Dataset.
Place the downloaded file into the /data directory.
Ensure the file is named exactly youtube_comments.csv.
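
Before starting a training run, it can help to confirm the dataset is where the steps above say it should be. The following is a small sketch under that assumption; the helper name `find_dataset` is hypothetical and is not part of the project's scripts.

```python
from pathlib import Path


def find_dataset(project_root: str = ".") -> Path:
    """Return the expected dataset path, or raise if the file is missing."""
    path = Path(project_root) / "data" / "youtube_comments.csv"
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected the dataset at {path}. Download it from Kaggle and "
            "name it exactly youtube_comments.csv."
        )
    return path


if __name__ == "__main__":
    try:
        print(f"Dataset found: {find_dataset()}")
    except FileNotFoundError as error:
        print(error)
```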

Usage

There are two main ways to use this project.

A. Predict with a Custom Sentence (Using the Pre-trained Model)

To test the included model with your own sentences, run the interactive prediction script:

python src/predict_comment.py

The script will load the saved model from the /models folder and prompt you to enter a comment. Type your sentence and press Enter to see the predicted sentiment. To exit the script, type quit or exit.
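
The prompt loop described above can be sketched roughly as follows. This is an illustration of the behavior, not the contents of src/predict_comment.py: `run_session` and `dummy_predict` are hypothetical names, and `dummy_predict` is a stand-in for the saved model so the sketch runs without any model files.

```python
from typing import Callable, Iterable, List

EXIT_COMMANDS = {"quit", "exit"}


def run_session(lines: Iterable[str], predict: Callable[[str], str]) -> List[str]:
    """Predict a label for each line until an exit command is seen."""
    outputs = []
    for line in lines:
        text = line.strip()
        if text.lower() in EXIT_COMMANDS:
            break
        if not text:
            # Skip empty input instead of sending it to the model.
            continue
        outputs.append(predict(text))
    return outputs


def dummy_predict(comment: str) -> str:
    """Stand-in for the real model (assumption, for demonstration only)."""
    return "Positive" if "love" in comment.lower() else "Neutral"


print(run_session(["I love this video", "ok", "quit"], dummy_predict))
```

Taking the lines as an iterable keeps the loop testable; an interactive version could pass lines read from standard input instead of a list.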

B. Train Your Own Model

If you have downloaded the dataset as described above, you can train the model from scratch. Note that training overwrites the existing model files in the /models directory.

python src/youtube_comment_sentiment.py

The script will preprocess the data, train the classifiers, evaluate their performance, and save the best-performing model for future use.
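
For readers unfamiliar with the evaluation step, here is a minimal standard-library sketch of what a three-class confusion matrix and accuracy look like. The training script's own reports may be produced differently; the function names and the label order below are assumptions for illustration.

```python
from collections import Counter
from typing import List, Sequence

LABELS = ["Negative", "Neutral", "Positive"]


def confusion_matrix(
    y_true: Sequence[str], y_pred: Sequence[str], labels: Sequence[str] = LABELS
) -> List[List[int]]:
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


true_labels = ["Positive", "Negative", "Neutral", "Positive"]
predicted = ["Positive", "Neutral", "Neutral", "Negative"]
matrix = confusion_matrix(true_labels, predicted)
for label, row in zip(LABELS, matrix):
    print(f"{label:>8}: {row}")
print("accuracy:", accuracy(matrix))
```

Off-diagonal cells show which classes get confused with each other, which a single accuracy figure hides.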
