
Fine-tuned pretrained transformers to differentiate between LLM-generated and human-generated text, using BERT, DeBERTa, T5, and XLNet models.


RAnandan10/LLM-Text-Detector


Team20

Team Members: Matt Pauk and Rajagopal Anandan

Our project focuses on SemEval 2024 Task 8, a text classification task whose goal is to distinguish machine-generated text from human-generated text. We have taken up the Monolingual track. Further details can be found at https://github.com/mbzuai-nlp/SemEval2024-task8.
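At its core the task is binary text classification: given a passage, predict whether it was written by a machine (label 1) or a human (label 0). As a minimal illustration of that framing only — this sketch is not from the repo, which fine-tunes transformer models rather than using bag-of-words features — a TF-IDF plus logistic-regression baseline on toy stand-in data might look like:

```python
# Illustrative baseline sketch for machine- vs human-generated text detection.
# Assumption: scikit-learn is available; the texts and labels below are toy
# stand-ins, not data from the SemEval 2024 Task 8 dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "As an AI language model, I can provide a comprehensive overview.",
    "In conclusion, it is important to note the aforementioned factors.",
    "Furthermore, this demonstrates the significance of the topic at hand.",
    "honestly no clue why my cat keeps knocking mugs off the table lol",
    "we missed the bus again, ended up walking home in the rain",
    "grandma's recipe never measures anything, you just eyeball it",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = machine-generated, 0 = human-written

# Pipeline: word/bigram TF-IDF features into a logistic-regression classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

preds = clf.predict(texts)
```

The transformer-based notebooks in this repo replace the TF-IDF features with learned contextual representations, but the output head is the same kind of binary decision.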

The SemEval_Task_8_BERT notebook contains the code for the BERT and DeBERTa classification methods. The WordLikelihoodEncoder notebook contains the code that preprocesses the text for the Word Likelihood model, which is contained in the SemEval_Task_8_Probability_Model file.
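The idea behind a word-likelihood detector is that a language model assigns a probability to each token, and text sampled from a model tends to score differently from human text full of rarer word choices. A toy sketch of the scoring step, using a Laplace-smoothed unigram model purely for illustration (the repo's actual Word Likelihood model and its encoder may work quite differently):

```python
# Toy word-likelihood scoring sketch. Assumption: a unigram language model
# stands in for whatever model the WordLikelihoodEncoder notebook uses.
import math
from collections import Counter

# Toy "reference corpus" from which the unigram statistics are estimated.
corpus = "the cat sat on the mat the dog sat on the rug".split()
counts = Counter(corpus)
total = len(corpus)
vocab_size = len(counts)

def token_log_likelihoods(text):
    """Per-token log-likelihood under the smoothed unigram model."""
    return [
        math.log((counts[tok] + 1) / (total + vocab_size))  # Laplace smoothing
        for tok in text.lower().split()
    ]

def avg_log_likelihood(text):
    """Average per-token log-likelihood; higher means more model-like text."""
    lls = token_log_likelihoods(text)
    return sum(lls) / len(lls)

# Text matching the model's statistics scores higher (less negative) than
# text dominated by out-of-vocabulary words.
in_domain = avg_log_likelihood("the cat sat on the mat")
out_domain = avg_log_likelihood("quantum flux capacitors hum loudly")
```

A classifier can then threshold (or learn from) these per-token likelihood features to decide machine vs. human.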

The SemEval_Task_8_T5 notebook contains the code for the T5-based classification model, and the SemEval_Task_8_XLNet notebook contains the code for the XLNet-based classification model.
