Model for sentence-level AI-generated text detection. The implementation is based on the paper SeqXGPT, which in turn builds on DetectGPT.

harsha148/Sentence-Level-AI-Generated-Text-Detection


Sentence-Level-AI-Generated-Text-Detection

Our project is based on SeqXGPT, a framework for AI-generated text detection.

The implementation follows the paper SeqXGPT: Sentence-Level AI-Generated Text Detection.

Setting up the project

  1. Clone this repository and change directory to the project directory.

  2. Set up and activate a virtual environment:

python3 -m venv myenv
source myenv/bin/activate

  3. Install the dependencies:

pip install -r requirements.txt

Note: If you face issues running the scripts, use python or python3 in the commands below depending on which one your environment provides.

Methodology

Our approach is based on SeqXGPT, an advanced technique for sentence-level AI-generated text detection. The model architecture comprises the following three parts:

  1. Perplexity(Feature) Extraction
  2. Feature Encoder
  3. Linear Classification Layer
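The three parts above can be sketched as a small PyTorch module. This is our own illustrative sketch, not the repository's src/model.py: all hyperparameters (hidden size, number of heads and layers) are made up, and we assume the feature encoder is a convolution followed by self-attention, as described in the SeqXGPT paper.

```python
import torch
import torch.nn as nn

class SeqXGPTSketch(nn.Module):
    """Illustrative sketch of the three-part SeqXGPT architecture."""

    def __init__(self, n_llms=3, hidden=64, n_labels=6):
        super().__init__()
        # 1. Perplexity (feature) extraction happens outside the model: each
        #    of the n_llms LLMs contributes a per-word loss, so the input is
        #    a (batch, seq_len, n_llms) tensor of word-wise losses.
        # 2. Feature encoder: convolution over the sequence, then
        #    self-attention layers.
        self.conv = nn.Conv1d(n_llms, hidden, kernel_size=3, padding=1)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # 3. Linear classification layer: one label per word position.
        self.classifier = nn.Linear(hidden, n_labels)

    def forward(self, feats):
        # feats: (batch, seq_len, n_llms)
        x = self.conv(feats.transpose(1, 2)).transpose(1, 2)
        x = self.encoder(x)
        return self.classifier(x)  # (batch, seq_len, n_labels)
```

Word-level predictions are then aggregated into sentence-level labels.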

Datasets

Each dataset contains six files: within each dataset folder, documents are organized into separate files based on the source of their AI-generated sentences. SeqXGPT-Bench is described below.

SeqXGPT-Bench

A sentence-level AI-generated text (AIGT) detection dataset used for the study of fine-grained AIGT detection.

data format:

{
    "text": "Media playback is unsupported on your device 21 June 2013 Last updated at 12:31 BST The Market Hall Cinema in Brynmawr used to be run by the local council but when it announced its funding would stop last month, work began to find a way to keep it going. Thanks to the efforts of a group of local volunteers, the cinema has been saved and reopened under a new community initiative. The group, called \"Brynmawr Foundation\", raised enough funds to take over the lease of the building and purchase new equipment. They plan to show a mix of classic and new films, as well as host events and live performances. The Market Hall Cinema has been an important part of the town's history since it first opened in 1894, and this new initiative ensures that it will continue to be a valuable resource for the community.", 
    "prompt_len": 254, 
    "label": "gpt2"
}

text refers to an entire document.

prompt_len marks the boundary between the sentences generated by humans and those generated by AI. The first prompt_len characters of the input text, i.e., text[:prompt_len], are the sentences generated by humans, while the rest are generated by a particular language model.

label is the label for each sentence, and there are six types of labels in total: gpt2, gptneo, gptj, llama, gpt3re, human.
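As a concrete illustration of how prompt_len splits a document, the following snippet (using a made-up record, not one from the dataset) separates the human-written prefix from the AI-generated continuation:

```python
record = {
    "text": "Human-written prompt. AI-generated continuation follows here.",
    "prompt_len": 21,  # character boundary, as in the dataset
    "label": "gpt2",
}

# text[:prompt_len] is human-written; the rest is model-generated.
human_part = record["text"][:record["prompt_len"]]
ai_part = record["text"][record["prompt_len"]:]
```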

Note: For the human_lines.jsonl file, the prompt_len field is not present by default, since the text is entirely human-written and there is no human/AI boundary to mark.

To add that in the human_lines.jsonl file use the following script:

python3 -m dataset.add_prompt_len_to_human_data --input_file human_lines.jsonl --output_file output.jsonl

The above step is crucial before feature extraction and, subsequently, training, as prompt_len is used by our data handler (src/utilities/datahandler.py) when processing text segments.
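For fully human-written documents the human/AI boundary is simply the end of the text, so the script above presumably sets prompt_len to the text length. A minimal sketch of that idea (our assumption about dataset.add_prompt_len_to_human_data, not its actual code):

```python
import json

def add_prompt_len(jsonl_lines):
    """Add a prompt_len field to fully human-written jsonl records.

    Assumption: since no part of the text is AI-generated, the boundary
    is the end of the document, i.e. prompt_len == len(text).
    """
    out = []
    for line in jsonl_lines:
        record = json.loads(line)
        record["prompt_len"] = len(record["text"])
        out.append(json.dumps(record))
    return out
```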

Feature Extraction

We extract features from each input file with each of the three LLMs (GPT-2, GPT-J, GPT-Neo) using dataset/features_extractor_from_data.py.

Models used: GPT2-xl (1.5B), GPT-Neo (2.7B), GPT-J (6B)

Note: The original paper also uses LLaMA (7B) for feature extraction, but since the pretrained weights were not openly released by Meta, we were unable to use it.

We used an A100 GPU, available with Google Colab Pro, for this part, as feature extraction with these LLMs has high VRAM requirements.

To extract the features from any input data file, please use the following script:

# --model: [GPT2, GPTJ, GPTNeo]
python3 -m dataset.features_extractor_from_data --input_file input.jsonl --output_file output.jsonl --get_features --model model_name

You can store the features produced by each LLM (GPT-2, GPT-J, GPT-Neo) for each of the six input files. For example, from gpt2_lines.jsonl you get gpt2_lines_gpt2_features.jsonl, gpt2_lines_gptj_features.jsonl, and gpt2_lines_gptneo_features.jsonl, which you can store under dataset/features/gpt2_lines.

Each of these feature files stores the features in jsonl format, where each JSON line has the following structure:

{
  "wordwise_loss_list": [[0.0, 6.484006881713867, 6.484006881713867, 10.804593086242676, 10.804593086242676, 6.741743087768555, .........]],
  "text": "high - salt has been shown to play a role in the pathogenesis of autoimmune disease . in this study , we investigated the effect of high - salt on the production of inflammatory mediators by arpe-19 cells and the possible mechanisms involved . arpe-19 cells were cultured with lps in dmem to which extra nacl had been added ( 20  mm and 40  mm ) . a) The   addition of sodium chloride ( 1 mmol/ l ) did not influence their growth and (b) lps treated with salt ( 40 mm ) significantly (p<0.01) induced an IC 50 value of 11 nmol TNFa/mL. The addition of lps in dmem for the addition of the salt ( 40 mm ) resulted in a significant increase in the IC 50 in the range of 25 - 33 nmol TNFa/ml.\n\nThe mechanism by which the      salt influences arpe-19 cells immunity is not known, however, the authors suggested that the increase in the production of TNFa may result from lysosome-remodeling effect of lps. In this study, arpe-19 cells have been treated for 24 hours with lps at different concentrations which resulted to significantly different results. lps-treated arpe-19 cells grew as       compared to untreated cells in the range of 40.3 - 57.3% of the initial culture volume. This increase in proliferation indicates that lps acts via the activation of arpe-19 cells immune mechanism.\n\nLps has also been recently used for the anti-tumour effect in cancer cell line. in this study, we found that Lps can bind to EBOV-1 and -2 and promote the cell        death", 
  "prompt_len": 347, 
  "label": "gpt2",
  "label_int": 0
}
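The wordwise_loss_list is produced by aligning the LLM's token-level losses back to whitespace-separated words. The helper below is our own hypothetical sketch of that alignment step (not the repository's code), assuming the tokenizer provides character offsets for each token:

```python
def token_to_word_losses(text, token_spans, token_losses):
    """Average token losses per whitespace-separated word.

    token_spans:  (start, end) character offsets for each token
    token_losses: one loss value per token
    """
    # Character offset ranges for the whitespace-separated words of `text`.
    spans, pos = [], 0
    for word in text.split():
        start = text.index(word, pos)
        spans.append((start, start + len(word)))
        pos = start + len(word)
    # Assign each token to the word containing its start offset and average.
    word_losses = []
    for ws, we in spans:
        hits = [loss for (ts, _), loss in zip(token_spans, token_losses)
                if ws <= ts < we]
        word_losses.append(sum(hits) / len(hits) if hits else 0.0)
    return word_losses
```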

Then we merge the wordwise_loss_lists from the per-model feature files to get the complete features file for a particular input file (gpt2_lines_features.jsonl in this case) using the following script:

python3 -m dataset.combine_features --directory dataset/gpt2_lines --output_file gpt2_lines_features.jsonl

Then we save the features.jsonl file for each input file in the dataset/features directory.

A sample JSON line in this features.jsonl file after combining would look like:

{
  "wordwise_loss_list": [[0.0, 6.484006881713867, 6.484006881713867, 10.804593086242676, .........], [7.503107070922852, 7.478128910064697, 4.111625671386719, 4.111625671386719,....], [0.1008826345205307, 0.1008826345205307, 3.0469260215759277, 3.046926259994507, 0.04809051379561424,....]],
  "text": "high - salt has been shown to play a role in the pathogenesis of autoimmune disease . in this study , we investigated the effect of high - salt on the production of inflammatory mediators by arpe-19 cells and the possible mechanisms involved . arpe-19 cells were cultured with lps in dmem to which extra nacl had been added ( 20  mm and 40  mm ) . a) The   addition of sodium chloride ( 1 mmol/ l ) did not influence their growth and (b) lps treated with salt ( 40 mm ) significantly (p<0.01) induced an IC 50 value of 11 nmol TNFa/mL. The addition of lps in dmem for the addition of the salt ( 40 mm ) resulted in a significant increase in the IC 50 in the range of 25 - 33 nmol TNFa/ml.\n\nThe mechanism by which the      salt influences arpe-19 cells immunity is not known, however, the authors suggested that the increase in the production of TNFa may result from lysosome-remodeling effect of lps. In this study, arpe-19 cells have been treated for 24 hours with lps at different concentrations which resulted to significantly different results. lps-treated arpe-19 cells grew as       compared to untreated cells in the range of 40.3 - 57.3% of the initial culture volume. This increase in proliferation indicates that lps acts via the activation of arpe-19 cells immune mechanism.\n\nLps has also been recently used for the anti-tumour effect in cancer cell line. in this study, we found that Lps can bind to EBOV-1 and -2 and promote the cell        death", 
  "prompt_len": 347, 
  "label": "gpt2",
  "label_int": 0
}
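The combine step presumably performs a line-by-line merge of the per-model feature files. A minimal sketch (our assumption about dataset.combine_features, assuming every file lists the same documents in the same order):

```python
import json

def combine_feature_lines(per_model_lines):
    """Merge per-model jsonl lines (one list of lines per LLM) into combined
    records whose wordwise_loss_list holds one loss list per model."""
    combined = []
    for records in zip(*[[json.loads(line) for line in lines]
                         for lines in per_model_lines]):
        merged = dict(records[0])
        # Each per-model file stores its losses as [[...]]; take the inner list
        # so the combined record holds one list per model.
        merged["wordwise_loss_list"] = [r["wordwise_loss_list"][0]
                                        for r in records]
        combined.append(merged)
    return combined
```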

These feature files are used as the dataset for the SeqXGPT model.

Feature Encoder and Linear Classification Layer

As per the architecture discussed in the paper, the code for the model can be found in src/model.py.

Training the SeqXGPT model

# Pass the number of training epochs; otherwise it defaults to 20
python3 -m driver --data_path dataset/features --train_path dataset/train/train.jsonl --test_path dataset/test/test.jsonl --split_dataset --num_train_epochs=100 --gpu=gpu_count

Other arguments such as train_ratio, learning rate, warmup_ratio, and weight_decay can also be changed in the above script.

Evaluation

For evaluation, we have provided the trained model file (saved_model.pt), which you can use to run the test scripts directly.

We have also provided dataset/test_data.jsonl (generated and processed after the feature extraction step), which can be used with the script below. You can also generate your own test files via the process discussed above.

# Pass --document_level_eval as an arg if you want document-level evaluation; otherwise evaluation is at sentence level by default
python3 -m driver --test_path dataset/test/test_data.jsonl --gpu=gpu_count --inference 

Miscellaneous

To generate results for particular-model binary AIGT detection, the test data must contain only one pair of labels at a time:

(gpt2, human), (gpt3, human), (gptneo, human), (gptj, human), (llama, human)

To filter the test.jsonl file we had generated during the training, we use the following script:

# Pass the path to the test.jsonl file and the path of the output file, which will contain only the given pair of labels
# e.g. --labels gpt2 human, --labels gpt3 human, --labels gptneo human, --labels gptj human, --labels llama human
python3 -m dataset.filter_test_data --input_file dataset/test/test.jsonl --output_file dataset/test/output_test.jsonl --labels name_label_1 name_label_2

For evaluation, evaluator.py provides different implementations of the calculate_metrics method (multi-class vs. binary); uncomment the one matching the evaluation you want to run, as per the comments in the file.
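The filtering itself amounts to keeping only the jsonl lines whose label belongs to the requested pair. A minimal sketch (hypothetical helper mirroring what dataset.filter_test_data does, not its actual code):

```python
import json

def filter_by_labels(jsonl_lines, labels):
    """Keep only records whose label is one of the requested pair,
    e.g. labels = ("gpt2", "human")."""
    keep = set(labels)
    return [line for line in jsonl_lines
            if json.loads(line)["label"] in keep]
```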

Results

Note: All values in the tables below are F1 scores, with Macro-F1 reported to compare overall performance.

1. Sentence-level results

The sentence-level results are as follows.

For the SeqXGPT-Bench dataset:

Mixed-Model Multi-Class AIGT Detection

| Method             | GPT-2 | GPT-Neo | GPT-J | LLaMA | GPT-3 | Human | Macro-F1 |
|--------------------|-------|---------|-------|-------|-------|-------|----------|
| Sniffer            | 51.5  | 45.5    | 36.0  | 23.1  | 54.1  | 58.1  | 44.7     |
| Sent-RoBERTa       | 43.1  | 31.6    | 31.5  | 42.4  | 78.2  | 90.5  | 52.9     |
| Seq-RoBERTa        | 55.5  | 36.7    | 32.0  | 78.6  | 94.4  | 92.3  | 64.9     |
| SeqXGPT            | 98.5  | 98.7    | 97.2  | 93.2  | 93.9  | 92.9  | 95.7     |
| Our implementation | 97.6  | 97.5    | 92.3  | 79.3  | 90.4  | 92.0  | 91.6     |

Mixed-Model Binary AIGT Detection

| Method             | AI   | Human | Macro-F1 |
|--------------------|------|-------|----------|
| Sniffer            | 87.7 | 54.3  | 71.0     |
| Sent-RoBERTa       | 97.6 | 92.6  | 95.1     |
| Seq-RoBERTa        | 97.4 | 91.8  | 94.6     |
| SeqXGPT            | 97.6 | 92.9  | 95.3     |
| Our implementation | 94.9 | 92.0  | 93.49    |

Particular-Model Binary AIGT Detection

| Method             | GPT-2 | Human | Macro-F1 |
|--------------------|-------|-------|----------|
| $\log p(x)$        | 78.4  | 47.9  | 63.1     |
| DetectGPT          | 65.8  | 42.9  | 54.3     |
| Sent-RoBERTa       | 92.9  | 75.8  | 84.4     |
| SeqXGPT            | 98.6  | 95.8  | 97.2     |
| Our implementation | 97.9  | 97.3  | 97.6     |

| Method             | GPT-Neo | Human | Macro-F1 |
|--------------------|---------|-------|----------|
| $\log p(x)$        | 73.9    | 41.2  | 57.5     |
| DetectGPT          | 57.6    | 41.3  | 49.4     |
| Sent-RoBERTa       | 92.6    | 73.4  | 83.0     |
| SeqXGPT            | 98.8    | 96.4  | 97.6     |
| Our implementation | 97.8    | 97.4  | 97.6     |

| Method             | GPT-J | Human | Macro-F1 |
|--------------------|-------|-------|----------|
| $\log p(x)$        | 76.5  | 34.4  | 55.5     |
| DetectGPT          | 66.8  | 37.0  | 51.9     |
| Sent-RoBERTa       | 93.1  | 71.8  | 82.4     |
| SeqXGPT            | 97.9  | 92.9  | 95.4     |
| Our implementation | 94.4  | 96.1  | 95.3     |

| Method             | LLaMA | Human | Macro-F1 |
|--------------------|-------|-------|----------|
| $\log p(x)$        | 69.1  | 27.1  | 48.1     |
| DetectGPT          | 52.8  | 47.6  | 50.2     |
| Sent-RoBERTa       | 89.7  | 69.6  | 79.6     |
| SeqXGPT            | 96.0  | 89.9  | 92.9     |
| Our implementation | 82.1  | 92.9  | 87.5     |

For the Out-of-Distribution (OOD) dataset:

Mixed-Model Multi-Class AIGT Detection on the Out-of-Distribution (OOD) dataset

For the dataset, refer to dataset/OOD.

| Method             | GPT-2 | GPT-Neo | GPT-J | LLaMA | GPT-3 | Human | Macro-F1 |
|--------------------|-------|---------|-------|-------|-------|-------|----------|
| Sniffer            | 7.8   | 50.8    | 28.3  | 22.9  | 61.9  | 44.8  | 36.1     |
| Sent-RoBERTa       | 32.5  | 18.2    | 24.8  | 30.8  | 74.6  | 30.5  | 35.2     |
| Seq-RoBERTa        | 53.7  | 29.0    | 29.9  | 75.2  | 92.3  | 83.6  | 60.6     |
| SeqXGPT            | 98.9  | 90.7    | 95.2  | 90.3  | 93.7  | 88.2  | 92.8     |
| Our implementation | 94.4  | 90.4    | 93.4  | 75.0  | 90.9  | 84.5  | 88.1     |

2. Document-level results on the document-level detection dataset

For the dataset, refer to dataset/document-level detection dataset.

| Method             | GPT-2 | GPT-Neo | GPT-J | LLaMA | GPT-3 | Human | Macro-F1 |
|--------------------|-------|---------|-------|-------|-------|-------|----------|
| Sniffer            | 85.4  | 84.5    | 74.6  | 13.0  | 81.3  | 66.2  | 67.5     |
| Sent-RoBERTa       | 55.8  | 42.8    | 24.4  | 18.3  | 84.7  | 94.6  | 53.4     |
| Seq-RoBERTa        | 63.4  | 41.0    | 32.6  | 67.0  | 91.9  | 51.2  | 57.9     |
| SeqXGPT            | 99.5  | 99.5    | 98.0  | 93.3  | 90.4  | 84.8  | 94.2     |
| Our implementation | 98.47 | 97.8    | 92.3  | 79.1  | 90.1  | 88.9  | 91.1     |

References

@misc{wang2023seqxgpt,
      title={SeqXGPT: Sentence-Level AI-Generated Text Detection}, 
      author={Pengyu Wang and Linyang Li and Ke Ren and Botian Jiang and Dong Zhang and Xipeng Qiu},
      year={2023},
      eprint={2310.08903},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
