Description: In this project, I performed sentiment analysis on the Twitter US Airline Sentiment dataset, which contains over 14,000 tweets about US airlines. The goal is to build a classifier that predicts whether a tweet's sentiment is positive, negative, or neutral based on its text.
Approach: I began by preprocessing the dataset, which involved removing irrelevant columns, cleaning the text data, and converting the sentiment labels to numerical form. Then, I split the data into training and testing sets, and vectorized the tweets using the TF-IDF method.
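The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the cleaning rules in `clean_tweet`, the `LABEL_TO_ID` mapping, the example tweets, and the textbook tf * log(N / df) weighting are all assumptions, since the description does not specify them.

```python
import math
import re
from collections import Counter

# Hypothetical mapping; the integers actually used in the project are not stated.
LABEL_TO_ID = {"negative": 0, "neutral": 1, "positive": 2}


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop @mentions such as airline handles
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return " ".join(text.split())              # collapse repeated whitespace


def tfidf(docs: list[str]) -> list[dict[str, float]]:
    """Weight each term by tf * log(N / df), the textbook TF-IDF form."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of tweets that contain each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up tweets for illustration; they are not taken from the dataset.
tweets = [
    "@united Flight delayed AGAIN, 3 hours late! http://t.co/x",
    "@JetBlue thanks for the great flight :)",
    "@united is my flight on time?",
]
sentiments = ["negative", "positive", "neutral"]

cleaned = [clean_tweet(t) for t in tweets]
encoded = [LABEL_TO_ID[s] for s in sentiments]
weights = tfidf(cleaned)
```

A term that appears in every tweet, such as "flight" in this toy sample, gets a weight of zero under this formula, which is why TF-IDF tends to emphasize the words that distinguish one tweet from another. Library implementations often use a smoothed variant, so their exact numbers would differ.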
Next, I trained four different classifiers: Support Vector Machine (SVM), Multinomial Naive Bayes, Random Forest, and Decision Tree. I fit each classifier on the training data, predicted the sentiment of the tweets in the testing data, and evaluated its performance using accuracy, precision, recall, and F1 score.
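The train-and-evaluate loop described above might look something like the sketch below. It assumes scikit-learn, which the description does not name, and everything specific in it is an assumption: the toy sentences stand in for the real tweets, and the linear SVM kernel, the 75/25 split, the default hyperparameters, and macro averaging are illustrative choices rather than the project's settings.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Made-up, already-cleaned stand-in data (0 = negative, 1 = neutral, 2 = positive).
texts = [
    "flight delayed again three hours late",
    "lost my luggage and nobody helped",
    "worst customer service ever",
    "cancelled flight with no refund",
    "what time does the flight board",
    "is there wifi on this plane",
    "how do i change my seat",
    "where is the baggage claim",
    "thanks for the great flight",
    "crew was friendly and helpful",
    "love the extra legroom",
    "smooth landing and early arrival",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]

# Split first, then fit the vectorizer on the training text only,
# so no vocabulary or IDF statistics leak in from the test set.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)
vectorizer = TfidfVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

models = {
    "SVM": SVC(kernel="linear"),  # kernel choice is an assumption
    "Multinomial Naive Bayes": MultinomialNB(),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train_vec, y_train)
    y_pred = model.predict(X_test_vec)
    # Macro averaging treats the three sentiment classes equally.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

With only twelve toy sentences the scores printed here are not meaningful. The point is the shape of the loop: one shared set of TF-IDF features, four models, and the same four metrics computed for each so the classifiers can be compared side by side.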
Conclusion: This project shows that sentiment analysis can be an effective tool for understanding customer opinions and feedback on social media platforms such as Twitter. The results also suggest that, of the four classifiers compared, SVM is a good choice for this task. Airlines can use insights like these to improve their customer service and overall customer satisfaction.