kmeans-with-vader

An implementation of scikit-learn's K-means clustering algorithm on TF-IDF vectors, with each resulting cluster mapped to a sentiment by the VADER sentiment analyzer.

The purpose of this project is to extract insights from an existing corpus of text. It vectorizes the text with scikit-learn's TF-IDF vectorizer, groups the vectors with the K-means clustering algorithm, and maps each cluster to a sentiment with the VADER sentiment analyzer.
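The three stages described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`cluster_texts`, `cluster_sentiment`, `make_vader_scorer`), the choice of English stop words, and the idea of averaging VADER compound scores per cluster are all assumptions made for the example. It also assumes the `scikit-learn` and `vaderSentiment` packages are installed.

```python
from collections import defaultdict

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_texts(texts, n_clusters=2, seed=0):
    """Vectorize texts with TF-IDF and return one K-means label per text."""
    # Assumption: English stop words are dropped before weighting terms.
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(texts)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(matrix).tolist()


def cluster_sentiment(texts, labels, score_fn):
    """Average a per-text sentiment score within each cluster.

    Assumption: a cluster's sentiment is the mean of its members' scores.
    """
    scores = defaultdict(list)
    for text, label in zip(texts, labels):
        scores[label].append(score_fn(text))
    return {label: sum(vals) / len(vals) for label, vals in scores.items()}


def make_vader_scorer():
    """Return a function giving VADER's compound score (-1 to 1) for a text."""
    # Imported lazily so the clustering helpers work without vaderSentiment.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return lambda text: analyzer.polarity_scores(text)["compound"]
```

With a list of posts in hand, `labels = cluster_texts(posts, n_clusters=2)` followed by `cluster_sentiment(posts, labels, make_vader_scorer())` yields a dictionary from cluster label to mean compound score. Passing the scorer in as a parameter keeps the clustering step independent of the sentiment library.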

To use a different dataset, edit the dataset.json file; the data must follow the format prescribed in that file. After the algorithm runs, results are written to the /results folder.

Thesis paper: An analysis on the insights of the anti-vaccine movement from social media posts using k-means clustering algorithm and VADER sentiment analyzer

Paper link: https://ui.adsabs.harvard.edu/abs/2019MS%26E..482a2043G/abstract