Code implementation for the group mini-project of course EE-559 (4 ECTS, spring 2024). This GitHub repository presents the research of group 31 on the subject "Improving hateful memes identification with Ensemble Learning". The chosen topic studies the potential benefits of Ensemble Learning for improving the accuracy of 3 different state-of-the-art models on Hateful Memes detection (binary classification, label 0 = not hateful, label 1 = hateful). The models studied are CLIP, BLIP and ALIGN.
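As a rough illustration of the ensemble idea, the sketch below averages the class probabilities of the three fine-tuned classifiers (soft voting). The function name, equal weights and tensor shapes are illustrative assumptions, not the exact combination rule used in main.ipynb.

```python
import torch

def ensemble_predict(logits_clip, logits_blip, logits_align, weights=(1/3, 1/3, 1/3)):
    """Soft-voting ensemble: average the per-model class probabilities.

    Each `logits_*` tensor has shape (batch_size, 2); the weights are illustrative.
    """
    probs = (
        weights[0] * torch.softmax(logits_clip, dim=-1)
        + weights[1] * torch.softmax(logits_blip, dim=-1)
        + weights[2] * torch.softmax(logits_align, dim=-1)
    )
    return probs.argmax(dim=-1)  # 0 = not hateful, 1 = hateful
```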
The dataset can be found on the original "Hateful Memes Challenge" website. In the download section, you are asked to fill in your full name, email and affiliation. The affiliation should be set to 'n/a', in lower case, otherwise the download will fail. The dataset folder 'hateful_memes' should be located at the root of the repository, i.e. next to this README file. The folder should contain a subfolder called 'img' and the files 'dev_seen.jsonl', 'dev_unseen.jsonl', 'test_seen.jsonl', 'test_unseen.jsonl' and 'train.jsonl'.
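A minimal sanity check along these lines (a suggestion, not part of the repository) can confirm the dataset was unpacked in the expected place before running the notebook:

```python
from pathlib import Path

# Verify the expected dataset layout at the repository root.
dataset = Path("hateful_memes")
expected = ["img", "dev_seen.jsonl", "dev_unseen.jsonl",
            "test_seen.jsonl", "test_unseen.jsonl", "train.jsonl"]
missing = [name for name in expected if not (dataset / name).exists()]
if missing:
    raise FileNotFoundError(f"Missing items in '{dataset}': {missing}")
```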
-
The file to run is the 'main.ipynb' notebook. It contains all the cells to execute the global pipeline.
-
In the main file, the 'custom_library.py' file is imported. It contains all the reused code, such as the dataset creation class and the train and test functions.
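The actual implementation lives in custom_library.py; the sketch below only illustrates the general shape of such a dataset class, assuming a HuggingFace-style processor that jointly encodes the image and its caption (all names here are illustrative):

```python
import json
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset


class HatefulMemesDataset(Dataset):
    """Illustrative meme dataset: pairs each image with its caption and label."""

    def __init__(self, jsonl_path, img_root="hateful_memes", processor=None):
        with open(jsonl_path) as f:
            self.samples = [json.loads(line) for line in f]
        self.img_root = Path(img_root)
        self.processor = processor  # e.g. a CLIP / BLIP / ALIGN processor

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        sample = self.samples[idx]
        image = Image.open(self.img_root / sample["img"]).convert("RGB")
        if self.processor is not None:
            inputs = self.processor(text=sample["text"], images=image,
                                    return_tensors="pt", padding="max_length",
                                    truncation=True)
            inputs = {k: v.squeeze(0) for k, v in inputs.items()}
        else:
            inputs = {"image": image, "text": sample["text"]}
        inputs["label"] = sample["label"]
        return inputs
```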
-
The 'topic_list.py' file is also imported in main. It contains everything needed to handle and generate the custom .jsonl files used for training, evaluating and testing our models.
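For illustration, generating such a topic-specific subset could look like the sketch below; the keyword list and file names are hypothetical, and the real topic definitions live in topic_list.py:

```python
import json

def make_topic_subset(src_jsonl, dst_jsonl, keywords):
    """Write a subset .jsonl keeping only memes whose text mentions a keyword."""
    with open(src_jsonl) as src, open(dst_jsonl, "w") as dst:
        for line in src:
            entry = json.loads(line)
            if any(kw in entry["text"].lower() for kw in keywords):
                dst.write(json.dumps(entry) + "\n")

# Hypothetical usage for a 'Women' subset:
# make_topic_subset("hateful_memes/train.jsonl", "train_women.jsonl",
#                   ["woman", "women", "girl", "wife"])
```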
-
In the root, the 'hateful_memes' folder (not committed) contains the images and the .jsonl metadata files. For each image of the dataset, the metadata files give its name (id), its label (not hateful (0) / hateful (1)), and the text contained in the image. The original dataset mainly provides the train file (images used for training and validation) and the test files (images used for testing). In this application only the test_unseen file is used for testing.
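Since the metadata is stored as JSON Lines, it can be inspected directly with pandas, for example (assuming the folder layout described above):

```python
import pandas as pd

# Each .jsonl line holds one record: id, img (relative path), label, text.
train_df = pd.read_json("hateful_memes/train.jsonl", lines=True)

print(train_df.columns.tolist())         # e.g. ['id', 'img', 'label', 'text']
print(train_df["label"].value_counts())  # class balance: 0 = not hateful, 1 = hateful
```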
-
The root folder also contains items created during our investigations. For example, we left the job script used to run on the SCITAS clusters, and some work on VisualBERT that we finally chose not to use.
Please make sure, before running main.ipynb, that your tree structure is the same as the one presented here:
.
└── DeepLearning_HateSpeech/
├── ALIGN
├── BLIP
├── CLIP
├── hateful_memes/
│ ├── img/
│ │ └── ...
│ ├── dev_seen.jsonl
│ ├── dev_unseen.jsonl
│ ├── test_seen.jsonl
│ ├── test_unseen.jsonl
│ └── train.jsonl
├── Scitas/
│ └── job_script.sh
├── VBERT/
│ └── image_captioning_vbert.ipynb
├── custom_library.py
├── Detecting Hate Speech in Multimodal Memes.pdf
├── main.ipynb
└── README.md
(see main.ipynb and report.pdf for detailed explanations)
This project requires the training of 12 large multimodal models. This can be done locally, but sufficient storage capacity and computing power are required.
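The runtimes below refer to an Apple M2 Pro; on other machines, a device-selection snippet along these lines (an assumption, not taken from main.ipynb) lets the same code fall back from CUDA to MPS to CPU:

```python
import torch

# Pick the best available backend: CUDA GPU, Apple-silicon MPS, or CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"Training on {device}")
```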
-
CLIP (650 MB x4): Training the 3 sub-models (for the subsets 'African', 'Muslim' and 'Women'), training the base model (union of the 3 subsets) and testing on the unseen test set take ~15 min for 5 epochs on an Apple M2 Pro (12 CPU cores and 19 GPU cores) with 16 GB of RAM. All parameters are unfrozen.
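For reference, wrapping CLIP with a small binary head while leaving every backbone parameter trainable could look like the sketch below; the head design and model checkpoint are assumptions, not necessarily the ones used in this project:

```python
import torch
import torch.nn as nn
from transformers import CLIPModel


class CLIPClassifier(nn.Module):
    """CLIP backbone + a small binary head; all CLIP weights stay trainable."""

    def __init__(self, model_name="openai/clip-vit-base-patch32"):
        super().__init__()
        self.clip = CLIPModel.from_pretrained(model_name)
        dim = self.clip.config.projection_dim   # 512 for the base checkpoint
        self.head = nn.Linear(2 * dim, 2)       # image + text embeddings -> 2 classes

    def forward(self, input_ids, attention_mask, pixel_values):
        out = self.clip(input_ids=input_ids,
                        attention_mask=attention_mask,
                        pixel_values=pixel_values)
        fused = torch.cat([out.image_embeds, out.text_embeds], dim=-1)
        return self.head(fused)


model = CLIPClassifier()
# All parameters are left unfrozen, matching the setup described above.
assert all(p.requires_grad for p in model.parameters())
```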
-
BLIP (990 MB x4): Fine-tuning the classification head, the last 4 layers of the vision model and the text decoder for the 4 models for 5 epochs, plus testing on unseen data, runs in ~60 min on the configuration presented above. Global retraining of the model requires more than 29 GB of RAM and a more powerful GPU.
-
ALIGN (690 MB x4): Fine-tuning the classification head, the last 9 layers of the vision model and the last 4 layers of the text model for the 2 models for 5 epochs, plus testing on unseen data, runs in ~30 min on the configuration presented above. Global retraining of the model requires more than 20 GB of RAM and a more powerful GPU.
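The partial fine-tuning described for BLIP and ALIGN boils down to freezing everything and re-enabling the classification head plus the last few layers. A generic sketch of that pattern is shown below; the parameter-name patterns in the commented usage are hypothetical and depend on the checkpoint actually loaded:

```python
import torch.nn as nn

def freeze_all_but(model: nn.Module, trainable_patterns):
    """Freeze every parameter, then re-enable those whose name contains a pattern."""
    for name, param in model.named_parameters():
        param.requires_grad = any(p in name for p in trainable_patterns)
    n_train = sum(p.numel() for p in model.parameters() if p.requires_grad)
    n_total = sum(p.numel() for p in model.parameters())
    print(f"Trainable parameters: {n_train:,} / {n_total:,}")

# Hypothetical usage for a BLIP-style classifier: unfreeze the head and the
# last 4 vision layers (exact parameter names vary between checkpoints).
# freeze_all_but(blip_classifier,
#                ["head", "vision_model.encoder.layers.8",
#                 "vision_model.encoder.layers.9",
#                 "vision_model.encoder.layers.10",
#                 "vision_model.encoder.layers.11"])
```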
A screencast of our code running and our saved trained weights can be found here