TweeToxicity is a program that analyzes user profiles or hashtags based on their recent tweets. It uses machine learning to assign Twitter users a score based on their tweets or retweets. This program is meant for educational purposes and was created with no ill intentions.
Client:
Server:
Dataset used for training: Sentiment140

| Model | Settings | Training Accuracy | Training F1 Score | Testing Accuracy | Testing F1 Score |
|---|---|---|---|---|---|
| Logistic Regression | Count Vect. & Lemmatization Used | 79.63% | 0.8008 | 78.71% | 0.7929 |
| | TF-IDF & Lemmatization Used | 80.31% | 0.8061 | 78.89% | 0.7930 |
| Multinomial Naive Bayes | Count Vect. & Lemmatization Used | 78.48% | 0.7838 | 78.71% | 0.7756 |
| | TF-IDF & Lemmatization Used | 79.81% | 0.7961 | 76.64% | 0.7664 |
| Bernoulli Naive Bayes | Count Vect. & Lemmatization Used | 78.53% | 0.7875 | 77.71% | 0.7803 |
| | TF-IDF & Lemmatization Used | 80.15% | 0.8052 | 77.68% | 0.7791 |
| Decision Tree Classifier | Count Vect. & Lemmatization Used, Decision Tree Parameters: {max_depth=50} | 72.91% | 0.7630 | 69.45% | 0.7334 |
| | TF-IDF & Lemmatization Used, Decision Tree Parameters: {max_depth=50} | 73.66% | 0.7667 | 69.13% | 0.7297 |
| Linear Support Vector Machine | Count Vect. & Lemmatization Used | 82.92% | 0.8309 | 78.38% | 0.7867 |
| | TF-IDF & Lemmatization Used | 80.36% | 0.8051 | 78.28% | 0.7733 |
| Random Forest Classifier | Count Vect. & Lemmatization Used, Random Forest Parameters: {max_depth=25} | 74.65% | 0.7615 | 74.04% | 0.7566 |
| | TF-IDF & Lemmatization Used, Random Forest Parameters: {max_depth=25} | 74.80% | 0.7619 | 74.00% | 0.7553 |
| Gradient Boosting Classifier | TF-IDF {min_df=5} & Lemmatization Used, Gradient Boosting Parameters: {lr=1.25, n=100, depth=25} | 74.80% | 0.7619 | 74.00% | 0.7553 |
| | TF-IDF {min_df=5} & Lemmatization Used, Gradient Boosting Parameters: {lr=1.25, n=100, depth=25} | 85.99% | 0.8626 | 77.49% | 0.7791 |
| XGBoost Classifier | Count Vect. & Lemmatization Used | 75.29% | 0.7661 | 75.21% | 0.7662 |
| | TF-IDF & Lemmatization Used | 75.39% | 0.7683 | 75.14% | 0.7667 |

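
For readers unfamiliar with the two metrics in the table, the sketch below shows how accuracy and F1 score can be computed from a list of true labels and predictions. It is an illustration only: the labels are hypothetical, and it assumes a binary task with F1 computed on the positive class. The README does not state which F1 variant was used for the reported numbers.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; F1 is computed for `positive`."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); treated as 0.0 when the denominator is zero
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical labels (1 = positive sentiment, 0 = negative), not project data
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2%}  f1={f1:.4f}")
```

Comparing the training and testing columns with these definitions in mind helps spot overfitting: a large gap between the two (as in the second Gradient Boosting row) suggests the model fits the training data much better than unseen data.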
Clone the project

    git clone https://github.com/pri1311/TweeToxicity

Install dependencies in server folder.

    cd server
    python -m venv env
    source env/bin/activate
    pip install -r requirements.txt

Generate environment variables and fill in the values.

    cp .env.example .env

Your `.env` is ignored by `git`, which you can see in `.gitignore`, so it's safe!

Starting Development Server

    python server.py

Install dependencies in client folder.

    cd ../client # If you are in ./server
    npm i

Starting Client

    npm start

At the end of this, you should have

- server running at http://127.0.0.1:5002/
- new_client running at http://localhost:3000/

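
If you want a quick way to confirm that both processes are up, a small check like the one below may help. It is a sketch, not part of the project: the helper name `is_listening` is hypothetical, and it only tests whether something accepts TCP connections on the two ports listed above, not whether the application behind them is healthy.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


# Hosts and ports as listed in the setup steps
for name, host, port in [("server", "127.0.0.1", 5002), ("client", "localhost", 3000)]:
    state = "up" if is_listening(host, port) else "not reachable"
    print(f"{name} at {host}:{port} is {state}")
```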
To run this project, you will need to add the following environment variables to your `.env` file:

- `API_KEY`: Twitter API/Consumer Key
- `API_KEY_SECRET`: Twitter API/Consumer Secret
- `BEARER_TOKEN`: Twitter Bearer Token
- `ACCESS_TOKEN`: Twitter Access Token
- `ACCESS_TOKEN_SECRET`: Twitter Access Secret
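
A missing credential usually surfaces as a confusing authentication error later on, so it can be useful to check for all five variables up front. The sketch below is one way to do that. The function name `missing_env_vars` is hypothetical, and it assumes the values from `.env` have already been loaded into the process environment; how `server.py` loads them is not shown in this README.

```python
import os

# Variable names as listed in this README
REQUIRED_VARS = (
    "API_KEY",
    "API_KEY_SECRET",
    "BEARER_TOKEN",
    "ACCESS_TOKEN",
    "ACCESS_TOKEN_SECRET",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Report which credentials, if any, still need to be filled in
missing = missing_env_vars()
print("Missing:", ", ".join(missing) if missing else "none")
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.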


