Vaccine Sentiment Classifier (VSC) is a deep learning classifier trained on real-world Twitter data.
VSC distinguishes 3 types of tweets:
- 😐 Neutral
- 😠 Anti-vax
- ☺️ Pro-vax
The main aim of the project is to showcase the range of NLP approaches, from shallow to deep learning, that can be used to extract sentiment from a given dataset.
The project is divided into 4 different implementations.
Each of them includes a .ipynb notebook and its corresponding documentation.
In a nutshell, the most important topics of each implementation are listed below.
- Data cleaning, preprocessing, visualization
- n-grams
- TF-IDF
- Count vectorizer
- Hashing vectorizer
- Softmax regressor
- Hyperparameter tuning using an evolutionary algorithm (GASearchCV)
- Learning curves & Classification report
- Dealing with imbalanced data
- Optuna hyperparameter tuner
- FFNN using Bag of Words (BoW), Term Frequency–Inverse Document Frequency (TF-IDF), Word embeddings
- GloVe
- ROC curve
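
To make two of the listed topics concrete, here is a minimal sketch of word n-gram extraction and TF-IDF weighting in plain Python. It is an illustration only and is not taken from the project's notebooks: the function names (`ngrams`, `featurize`, `tfidf`) and the example tweets are hypothetical, tokenization is a plain whitespace split, and the IDF uses the common smoothed form `log((1 + N) / (1 + df)) + 1` without any vector normalization.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous word n-grams of a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def featurize(text, max_n=2):
    """Lowercase, split on whitespace, and collect all 1..max_n-grams."""
    tokens = text.lower().split()
    features = []
    for n in range(1, max_n + 1):
        features.extend(ngrams(tokens, n))
    return features


def tfidf(docs, max_n=2):
    """Compute smoothed TF-IDF weights (raw count times IDF) per document."""
    counts = [Counter(featurize(doc, max_n)) for doc in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc_counts in counts:
        df.update(doc_counts.keys())

    # Smoothed IDF (assumed variant): rarer terms get larger weights.
    idf = {term: math.log((1 + n_docs) / (1 + freq)) + 1.0
           for term, freq in df.items()}

    return [{term: tf * idf[term] for term, tf in doc_counts.items()}
            for doc_counts in counts]


# Hypothetical example tweets, not taken from the project's dataset.
tweets = [
    "vaccines save lives",
    "vaccines are dangerous",
    "got my vaccine appointment today",
]
weights = tfidf(tweets)

# Print the three highest-weighted features of each tweet.
for tweet, doc_weights in zip(tweets, weights):
    top = sorted(doc_weights.items(), key=lambda kv: -kv[1])[:3]
    print(tweet, "->", top)
```

A term that appears in several tweets (here "vaccines") receives a lower weight than one that appears in a single tweet (such as "save"), which is the property that makes TF-IDF features more informative than raw counts for a downstream classifier.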