ToxicCommentClassifier

Discussing things you care about can be difficult. The threat of abuse and harassment online means that many people stop expressing themselves and give up on seeking different opinions. Platforms struggle to facilitate conversations effectively, leading many communities to limit or completely shut down user comments. This has driven interest in building tools that help improve online conversation. One area of focus is the study of negative online behaviors such as toxic comments (i.e., comments that are rude, disrespectful or otherwise likely to make someone leave a discussion).

Hence, the aim of this project is to build a multi-headed model that’s capable of detecting different types of toxicity like threats, obscenity, insults, and identity-based hate. This model will hopefully help online discussion become more productive and respectful. To that end, several approaches have been evaluated on the mean column-wise ROC AUC (i.e., the score is the average of the individual AUCs of each predicted column). For each id in the test set, the model predicts a probability for each of the six possible types of comment toxicity (toxic, severe_toxic, obscene, threat, insult, identity_hate).
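As a rough illustration of the metric described above, the following sketch computes a ROC AUC per label column and averages the results. It is not taken from the project's notebooks: the function names `roc_auc` and `mean_columnwise_auc` are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formulation using only the standard library.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC for one binary column via the Mann-Whitney U statistic.

    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of equal scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def mean_columnwise_auc(
    y_true: Dict[str, Sequence[int]],
    y_score: Dict[str, Sequence[float]],
) -> float:
    """Average the per-column AUCs, one column per toxicity label."""
    aucs = [roc_auc(y_true[name], y_score[name]) for name in y_true]
    return sum(aucs) / len(aucs)
```

In practice a library routine such as scikit-learn's `roc_auc_score` would normally be applied to each column instead; the hand-written version here only makes the averaging step explicit.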

Dataset

The dataset consists of a large number of comments from Wikipedia’s talk page edits, which have been labeled by human raters for toxic behavior. The types of toxicity are:

toxic
severe_toxic
obscene
threat
insult
identity_hate

The model must predict, for each comment, a probability for each type of toxicity. The dataset is provided as the following files:

train.csv - the training set, contains comments with their binary labels
test.csv - the test set, the model must predict the toxicity probabilities for these comments
test_labels.csv - labels for the test data; a value of -1 indicates that the comment was not used for scoring
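Because unscored test comments are marked with -1, they need to be dropped before any evaluation against test_labels.csv. A minimal sketch of that filtering step follows. The column layout (an `id` column followed by the six label columns) is an assumption, the helper name `scored_rows` is hypothetical, and the sample rows are invented purely for illustration.

```python
import csv
import io
from typing import Dict, Iterable, List, Tuple

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def scored_rows(csv_file: Iterable[str]) -> List[Tuple[str, Dict[str, int]]]:
    """Return (id, labels) pairs, skipping rows marked with -1 (not scored)."""
    kept = []
    for row in csv.DictReader(csv_file):
        labels = {name: int(row[name]) for name in LABELS}
        if any(value == -1 for value in labels.values()):
            continue  # this comment was not used for scoring
        kept.append((row["id"], labels))
    return kept


# Invented sample in the assumed layout of test_labels.csv.
SAMPLE = """id,toxic,severe_toxic,obscene,threat,insult,identity_hate
a1,-1,-1,-1,-1,-1,-1
b2,0,0,0,0,0,0
c3,1,0,1,0,1,0
"""

rows = scored_rows(io.StringIO(SAMPLE))
```

With the real file, the same function can be called on an open file handle, for example `scored_rows(open("test_labels.csv", newline=""))`.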

The dataset is released under CC0, with the underlying comment text governed by Wikipedia's CC BY-SA 3.0 license.

Python notebooks

The project is divided into 3 notebooks:

toxic_comment_classifier contains the core kernel, featuring data analysis and machine learning algorithms such as logistic regression and naive Bayes
lstm-tcc contains the LSTM kernel
bert-tcc contains the BERT fine-tuning kernel

alberto-paparella/ToxicCommentClassifier

Folders and files

Latest commit

History

Repository files navigation

ToxicCommentClassifier

Dataset

Python notebooks

About

Topics

Resources

Stars

Watchers

Forks

Releases 1

Languages