Published code on Kaggle: https://www.kaggle.com/code/siddharthmandgi/twitter-sentiment-analysis/notebook
This project explores and predicts tweet sentiment across various entities (topics) such as tech, gaming, and social media. Each tweet's sentiment is categorized as 'Positive', 'Negative', 'Neutral', or 'Irrelevant'.
The data comes from a Kaggle dataset (https://www.kaggle.com/datasets/jp797498e/twitter-entity-sentiment-analysis) with two files, training.csv and validation.csv. Features: Entity, Tweet ID, Tweet, and Sentiment.
Discovered insights such as:
- What is the most common sentiment across all tweets?
  - Most tweets are of Negative sentiment. Positive and Neutral tweets closely follow.
- What are the most commonly tweeted topics?
  - Most tweets in this dataset revolve around gaming, tech, and social media.
- What are the ten most commonly talked-about topics, and what is their overall sentiment?
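The questions above reduce to frequency counts over the Entity and Sentiment columns. A minimal sketch with pandas follows. The column names `entity` and `sentiment` and the toy rows are assumptions made for illustration; they stand in for training.csv and are not taken from the notebook.

```python
import pandas as pd

# Toy stand-in for training.csv; column names and rows are illustrative assumptions.
df = pd.DataFrame({
    "entity": ["Google", "Google", "FIFA", "FIFA", "FIFA", "FIFA", "Facebook"],
    "sentiment": ["Positive", "Positive", "Negative", "Negative",
                  "Negative", "Neutral", "Irrelevant"],
})

# Most common sentiment across all tweets.
sentiment_counts = df["sentiment"].value_counts()

# Most commonly tweeted topics (entities), keeping at most the top ten.
top_entities = df["entity"].value_counts().head(10)

# Entity-by-sentiment count table, restricted to the top entities.
top_sentiment = pd.crosstab(df["entity"], df["sentiment"]).loc[top_entities.index]

# "Overall" sentiment here means the most frequent label per entity.
top_sentiment["overall"] = top_sentiment.idxmax(axis=1)

print(sentiment_counts)
print(top_sentiment)
```

Taking the most frequent label as the overall sentiment is one possible definition; a share-based view (for example, the percentage of Negative tweets per entity) may be more informative when classes are close.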
Used a Sentence Transformer with a Random Forest classification head to classify sentiments.
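One way to read this setup is a two-stage pipeline: a sentence encoder turns each tweet into a fixed-size embedding, and a random forest is fit on those embeddings. The sketch below assumes scikit-learn and takes the encoder as an injected callable, so it does not depend on any specific model. The function names `train_sentiment_classifier` and `predict_sentiment` are hypothetical helpers, and the model name in the commented wiring is only an example, not necessarily the one used in the notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def train_sentiment_classifier(texts, labels, encode, n_estimators=200, seed=0):
    """Embed texts with `encode` and fit a random forest on the embeddings.

    `encode` is any callable mapping a list of strings to a 2-D array
    of shape (n_texts, embedding_dim).
    """
    embeddings = np.asarray(encode(list(texts)))
    clf = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
    clf.fit(embeddings, labels)
    return clf


def predict_sentiment(clf, texts, encode):
    """Embed new texts with the same encoder and predict their labels."""
    return clf.predict(np.asarray(encode(list(texts))))


# Example wiring (assumes the sentence-transformers package and an
# illustrative model name; the notebook's actual model may differ):
#   from sentence_transformers import SentenceTransformer
#   encoder = SentenceTransformer("all-MiniLM-L6-v2")
#   clf = train_sentiment_classifier(train_texts, train_labels, encoder.encode)
#   preds = predict_sentiment(clf, val_texts, encoder.encode)
```

Keeping the encoder frozen and training only the random forest makes training cheap, at the cost of not adapting the embeddings to the sentiment task.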
Used LIME, a model-agnostic explainer, to explain the word-level impact on the model's different classes.