Team Members: Matt Pauk and Rajagopal Anandan
Our project focuses on SemEval 2024 Task 8, a text classification task whose goal is to distinguish machine-generated text from human-written text. We are participating in the Monolingual track. Further details can be found at https://github.com/mbzuai-nlp/SemEval2024-task8.
The SemEval_Task_8_BERT notebook contains the code for the BERT and DeBERTa classification methods. The WordLikelihoodEncoder notebook contains the code that preprocesses the text for the Word Likelihood model, which is contained in the SemEval_Task_8_Probability_Model file.
The SemEval_Task_8_T5 notebook contains the code for the T5-based classification model, and the SemEval_Task_8_XLNet notebook contains the code for the XLNet-based classification model.
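
To give a rough idea of what a word-likelihood preprocessing step might look like, here is a minimal sketch. It is not the code from the WordLikelihoodEncoder notebook: the unigram model, the add-one smoothing, the whitespace tokenization, and the function names `fit_unigram` and `encode` are all assumptions made for illustration. It maps each word of a text to a log-probability estimated from a small reference corpus, producing a sequence that a downstream probability model could consume.

```python
import math
from collections import Counter


def fit_unigram(corpus):
    """Estimate add-one-smoothed unigram log-probabilities (illustrative assumption)."""
    # Lowercased whitespace tokenization is a simplification, not the notebook's method.
    counts = Counter(tok for text in corpus for tok in text.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen words
    log_probs = {w: math.log((c + 1) / (total + vocab)) for w, c in counts.items()}
    unk = math.log(1 / (total + vocab))  # log-probability assigned to unseen words
    return log_probs, unk


def encode(text, log_probs, unk):
    """Replace every word in `text` with its log-likelihood under the unigram model."""
    return [log_probs.get(tok, unk) for tok in text.lower().split()]


# Hypothetical toy reference corpus, used only for this example.
reference = ["the cat sat", "the dog sat"]
log_probs, unk = fit_unigram(reference)
features = encode("the bird", log_probs, unk)
```

In this sketch a frequent word such as "the" receives a higher log-likelihood than an unseen word such as "bird". A real encoder would more likely take its likelihoods from a large corpus or a pretrained language model.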