Detection of AI Generated Text

What We Have Done

We studied how well text can be classified as AI-generated or human-authored, using a custom dataset of 2075 entries: 505 human-written abstracts from Springer journals and 1570 AI-generated texts from three chatbots, a human-to-AI ratio of roughly 1:3. We evaluated multiple machine learning (ML) and deep learning (DL) models on this binary classification task, with the aim of supporting academic integrity and text authenticity detection.

Procedure

Dataset Preparation: We assembled the dataset with a human-to-AI ratio of roughly 1:3 and split it into training (80%) and testing (20%) sets.
Baseline Preprocessing: Text was processed using standard techniques: lowercasing, tokenization (NLTK), stopword removal, special character removal, and TF-IDF vectorization (at most 5000 features, unigrams and bigrams).
Evaluation: We tested nine ML/DL models on the baseline features: RoBERTa, SVM, CNN, Random Forest, Decision Tree, Logistic Regression, ANN, KNN, and Naive Bayes, measuring accuracy, precision, recall, and F1-score.
RST Implementation: We introduced Rhetorical Structure Theory (RST) preprocessing, incorporating Chi-squared feature selection and an H-score method (based on Hellinger distance) to refine features and capture discourse patterns. Four top-performing models (RoBERTa, SVM, CNN, Random Forest) were re-evaluated with RST.
Testing: Performance was assessed using classification reports, confusion matrices, and ROC curves with AUC scores.
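The baseline steps above (80/20 split, TF-IDF with unigrams and bigrams, Chi-squared feature selection, a classifier) can be sketched with scikit-learn. This is a minimal illustration, not the notebooks' actual code: the toy corpus, the choice of LinearSVC, the built-in English stopword list (the project used NLTK), and the value of k are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Hypothetical toy corpus standing in for the real dataset.
# Label 0 = human-written, Label 1 = AI-generated.
human = [
    "we measured soil moisture across twelve field sites during spring",
    "the cohort was followed for five years after initial diagnosis",
    "samples were collected by hand and stored at low temperature",
    "our survey reached farmers in three remote mountain districts",
    "the specimen showed unusual wear along the outer ridge",
]
ai = [
    "this study provides a comprehensive overview of the proposed framework",
    "the results highlight the significant potential of the novel approach",
    "this paper presents a comprehensive analysis of key challenges",
    "the findings underscore the importance of a robust framework",
    "overall the proposed approach demonstrates significant improvements",
    "this work offers valuable insights into the novel methodology",
    "the analysis highlights key challenges and future directions",
    "in conclusion the framework demonstrates robust performance",
    "this study explores the potential of a comprehensive methodology",
    "the proposed methodology provides valuable insights and improvements",
]
texts = human + ai
labels = [0] * len(human) + [1] * len(ai)

# 80/20 split; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# TF-IDF over unigrams and bigrams, capped at 5000 features.
vectorizer = TfidfVectorizer(
    lowercase=True, stop_words="english", max_features=5000, ngram_range=(1, 2)
)
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

# Chi-squared selection keeps the k terms most associated with the labels.
# k = 20 is an arbitrary value for this toy corpus.
k = min(20, X_train_tfidf.shape[1])
selector = SelectKBest(chi2, k=k)
X_train_sel = selector.fit_transform(X_train_tfidf, y_train)
X_test_sel = selector.transform(X_test_tfidf)

# Any classifier can be fitted on the selected features; LinearSVC is one option.
clf = LinearSVC()
clf.fit(X_train_sel, y_train)
pred = clf.predict(X_test_sel)
acc = accuracy_score(y_test, pred)
print(f"features kept: {X_train_sel.shape[1]}, test accuracy: {acc:.2f}")
```

The vectorizer and selector are fitted on the training split only and then applied to the test split, so no information from the test texts leaks into feature selection.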

Result Analysis

The RST preprocessing improved feature extraction by analyzing text coherence and rhetorical relationships, reducing noise and enhancing class separability. Accuracy improved in all four re-evaluated models:

SVM: From 93% to 96%
CNN: From 92% to 94%
Random Forest: From 92% to 95%
RoBERTa: From 96% to 97%

The H-score method, however, reached only 65%, indicating its limitations as a standalone classifier.
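For reference, the Hellinger distance that the H-score builds on measures how far apart two discrete probability distributions are, on a scale from 0 (identical) to 1 (no overlap). The sketch below shows only that distance, applied to term distributions; how the H-score itself is derived from it is not described here, so the helper names and the toy token lists are hypothetical.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary):
    """Relative frequency of each vocabulary term in a token list."""
    counts = Counter(tokens)
    total = sum(counts[term] for term in vocabulary)
    if total == 0:
        raise ValueError("no vocabulary term occurs in the token list")
    return [counts[term] / total for term in vocabulary]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


# Hypothetical token lists for the two classes.
vocabulary = ["framework", "comprehensive", "measured", "samples"]
human_tokens = ["measured", "samples", "samples", "framework"]
ai_tokens = ["framework", "comprehensive", "comprehensive", "framework"]

p = term_distribution(human_tokens, vocabulary)
q = term_distribution(ai_tokens, vocabulary)
print(f"Hellinger distance: {hellinger(p, q):.3f}")
```

A term whose relative frequency differs strongly between the two classes contributes more to the distance, which is why the measure can be used to rank features.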

Our best result was 97% accuracy, achieved by RoBERTa with RST preprocessing. This was accomplished by:

Leveraging RoBERTa’s pre-trained transformer architecture (roberta-base, 12 layers, 768 hidden units) for contextual understanding.
Fine-tuning it on our dataset with inputs tokenized by RobertaTokenizer.
Enhancing input quality with RST, which used TF-IDF with Chi-squared selection to focus on discriminative terms, boosting RoBERTa’s ability to distinguish AI-generated text (Label 1) from human text (Label 0). For Label 0, precision, recall, and F1-score were 0.94, 0.95, and 0.95; for Label 1, all three were 0.98.
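The per-label precision, recall, and F1-score quoted above are the kind of figures a classification report derives from a confusion matrix. The sketch below shows that arithmetic, assuming the scikit-learn layout (rows are true labels, columns are predicted labels). The counts are made up for illustration and are not the project's confusion matrix.

```python
def per_class_metrics(cm, label):
    """Precision, recall, and F1 for one label of a square confusion matrix."""
    n = len(cm)
    tp = cm[label][label]
    # Predicted as `label` but actually another class.
    fp = sum(cm[row][label] for row in range(n) if row != label)
    # Actually `label` but predicted as another class.
    fn = sum(cm[label][col] for col in range(n) if col != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


def accuracy(cm):
    """Share of all samples that lie on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Hypothetical counts: row 0 = true human, row 1 = true AI-generated.
cm = [
    [8, 2],
    [1, 9],
]

for label in (0, 1):
    precision, recall, f1 = per_class_metrics(cm, label)
    print(f"Label {label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy(cm):.2f}")
```

Because the dataset has about three times as many AI-generated texts as human ones, the per-label figures are more informative than accuracy alone: a model can score well overall while doing worse on the smaller human class.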

Comparison

Traditional Baseline: RoBERTa (96%), SVM (93%), CNN/Random Forest (92%), Decision Tree (85%), Logistic Regression/ANN (83%), KNN (79%), Naive Bayes (73%).
RST-Enhanced: RoBERTa (97%), SVM (96%), Random Forest (95%), CNN (94%), H-Score (65%).
Previous Works: Sankalp Bahad et al. (86.5%), Ayat A. Najjar et al. (85%), Mudasir Ahmad Wani et al. (98.8%).
Our Model: 97% accuracy, with 0.98 precision, recall, and F1-score on the AI-generated class (Label 1). This approaches Wani et al.’s 98.8% and surpasses the other two prior results, showing competitive performance with a custom approach.

