LLM_detect_AI

In recent years, large language models (LLMs) have become increasingly sophisticated and can generate text that is difficult to distinguish from human writing. This repository develops a model that detects whether an essay was written by a student or generated by an LLM.

Hardware and Software

Kaggle default environment

Datasets

We used five datasets for training. Only three of them are included as files in the datasets folder of this repository. The other two are available on Kaggle:

- https://www.kaggle.com/datasets/kagglemini/train-00000-of-00001-f9daec1515e5c4b9 (sourced from an open dataset on Hugging Face: https://huggingface.co/datasets/dim/essayforum_writing_prompts_6k/tree/main/)
- https://www.kaggle.com/datasets/thedrcat/daigt-v2-train-dataset

Train and Test

Training and test code: https://www.kaggle.com/code/wcqyfly/fork-of-fork-of-fork-of-llm-daigt-analyse-e-db6333

The preprocessing and concatenation of the daigt-v2-train-dataset, argugpt, and train-00000-of-00001-f9daec1515e5c4b9 datasets are done in this notebook: https://www.kaggle.com/wcqyfly/notebook95c85fa3c6
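For readers who cannot open the notebook, here is a minimal sketch of what a concatenation step of this kind might look like with pandas. It is not the notebook's actual code. The column names (`text`, `label`, `essay`), the `source` column, the helper names `to_common_schema` and `concat_training_frames`, and the convention that label 0 means human-written and 1 means LLM-generated are all assumptions made for illustration. The tiny in-memory frames stand in for the real CSV or Parquet files.

```python
import pandas as pd


def to_common_schema(df, text_col, label_col=None, fixed_label=None, source=""):
    """Map one dataset onto the shared columns: text, label, source.

    Hypothetical helper: pass label_col when the dataset carries its own
    labels, or fixed_label when every row belongs to a single class.
    """
    out = pd.DataFrame({"text": df[text_col].astype(str).str.strip()})
    if label_col is not None:
        out["label"] = df[label_col].astype(int)
    else:
        out["label"] = int(fixed_label)
    out["source"] = source
    return out


def concat_training_frames(frames):
    """Stack the normalized frames, drop empty texts and exact duplicates."""
    combined = pd.concat(frames, ignore_index=True)
    combined = combined[combined["text"].str.len() > 0]
    combined = combined.drop_duplicates(subset="text", keep="first")
    return combined.reset_index(drop=True)


# Tiny in-memory stand-ins for the real files (assumed column layouts).
daigt_like = pd.DataFrame({"text": ["Essay A", "Essay B"], "label": [0, 1]})
human_like = pd.DataFrame({"essay": ["Essay C", "Essay A "]})

train = concat_training_frames([
    to_common_schema(daigt_like, "text", label_col="label", source="daigt_like"),
    to_common_schema(human_like, "essay", fixed_label=0, source="human_like"),
])
print(train)
```

Keeping a `source` column makes it easy to check later whether one dataset dominates the training set or whether the detector performs unevenly across sources.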
