This challenge was completed as part of the NLP module at Imperial College, taught by Lucia Specia.
Adrian Löwenstein - Mihai Lung - Pierre-Louis Saint
The implementation proposed in this project relies on GloVe word embeddings, extensive data preprocessing, and different network architectures such as GRU, LSTM, and CNN.
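As a rough illustration of the embedding step, the sketch below builds an embedding matrix from pre-trained GloVe vectors using a Keras tokenizer; the file name, vector dimensionality, and the toy tweet list are assumptions for illustration, not the exact setup used in the notebook.

```python
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer

# Hypothetical example: build an embedding matrix from pre-trained GloVe vectors.
# File name, dimensionality and the tweet list are illustrative only.
EMBEDDING_DIM = 100
tweets = ["you are awesome", "this is offensive"]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(tweets)

# Load GloVe vectors into a dict: word -> vector
embeddings_index = {}
with open("glove.twitter.27B.100d.txt", encoding="utf-8") as f:
    for line in f:
        values = line.split()
        word, vector = values[0], np.asarray(values[1:], dtype="float32")
        embeddings_index[word] = vector

# Map each word in the vocabulary to its GloVe vector (zeros if unseen)
vocab_size = len(tokenizer.word_index) + 1
embedding_matrix = np.zeros((vocab_size, EMBEDDING_DIM))
for word, i in tokenizer.word_index.items():
    if word in embeddings_index:
        embedding_matrix[i] = embeddings_index[word]
```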
The following description was taken from CodaLab: Link
Offensive language is pervasive in social media. Individuals frequently take advantage of the perceived anonymity of computer-mediated communication, using this to engage in behavior that many of them would not consider in real life. Online communities, social media platforms, and technology companies have been investing heavily in ways to cope with offensive language to prevent abusive behavior in social media.
One of the most effective strategies for tackling this problem is to use computational methods to identify offense, aggression, and hate speech in user-generated content (e.g. posts, comments, microblogs, etc.). This topic has attracted significant attention in recent years as evidenced in recent publications (Waseem et al., 2017; Davidson et al., 2017; Malmasi and Zampieri, 2018; Kumar et al., 2018) and workshops such as ALW and TRAC.
In OffensEval we break down offensive content into three sub-tasks taking the type and target of offenses into account.
- Sub-task A - Offensive language identification
- Sub-task B - Automatic categorization of offense types
- Sub-task C - Offense target identification
The data is retrieved from social media and distributed in tab-separated format. The trial and training data are available in the "Participate" tab. Please register for the competition to download the files.
Participants are allowed to use external resources and other datasets for this task. Please indicate which resources were used when submitting your results.
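For illustration, the tab-separated files can be loaded with pandas as sketched below; the file and column names are assumptions based on the OffensEval data release and may differ from the actual files.

```python
import pandas as pd

# Hypothetical file and column names; the actual release may differ.
train = pd.read_csv("offenseval-training-v1.tsv", sep="\t")

print(train.columns.tolist())              # e.g. ['id', 'tweet', 'subtask_a', ...]
print(train["subtask_a"].value_counts())   # class balance for Sub-task A (OFF vs NOT)
```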
This repository contains the final submission of the project.
- The report explaining the main steps and the architectures of the networks used.
- The Jupyter Notebook containing all the code for implementing and training the classifiers.
- Sub-task A - Rank : 62nd - F1 Score : 0.748752209
- Sub-task B - Rank : 5th - F1 Score : 0.715679443
- Sub-task C - Rank : 11th - F1 Score : 0.580794253
- Sub-task A - Rank : 21st - F1 Score : 0.748752209
- Sub-task B - Rank : 1st - F1 Score : 0.715679443
- Sub-task C - Rank : 1st - F1 Score : 0.580794253
The classifier that performed best across the different tasks was the Convolutional Neural Network (CNN).
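For reference, the sketch below shows a minimal Keras CNN text classifier of the kind described; the vocabulary size, layer sizes, filter width, and sequence length are illustrative assumptions, not the exact architecture documented in the report.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dropout, Dense

# Hypothetical hyper-parameters; the report documents the actual architecture.
VOCAB_SIZE = 20000   # tokenizer vocabulary size
EMBEDDING_DIM = 100  # GloVe vector dimensionality
MAX_LEN = 50         # padded tweet length

model = Sequential([
    # In the project this layer would be initialised with the GloVe matrix
    # built as in the earlier sketch (weights=[embedding_matrix], trainable=False).
    Embedding(input_dim=VOCAB_SIZE, output_dim=EMBEDDING_DIM),
    Conv1D(filters=128, kernel_size=3, activation="relu"),
    GlobalMaxPooling1D(),
    Dropout(0.5),
    Dense(64, activation="relu"),
    Dense(1, activation="sigmoid"),  # binary output for Sub-task A (OFF vs NOT)
])
model.build(input_shape=(None, MAX_LEN))
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```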