Our project is based on SeqXGPT, a framework for AI-Generated Text (AIGT) Detection.
The implementation is based on the paper SeqXGPT: Sentence-Level AI-Generated Text Detection.
- Clone this repository and change directory to the project directory.
- Set up the virtual environment using:

  ```
  python3 -m venv venv
  source venv/bin/activate
  ```

- Install the dependencies using:

  ```
  pip install -r requirements.txt
  ```

Note: If you face any issues running the scripts, try swapping `python` for `python3` (or vice versa) in the commands.
Our approach is based on SeqXGPT, an advanced technique for sentence-level AI-generated text detection. The model architecture comprises the following three parts:
- Perplexity (Feature) Extraction
- Feature Encoder
- Linear Classification Layer
Each dataset folder contains six files, organized by the source of the AI-generated sentences in each document. Below is SeqXGPT-Bench, a sentence-level AI-generated text (AIGT) detection dataset used for the study of fine-grained AIGT detection.
Data format:

```json
{
  "text": "Media playback is unsupported on your device 21 June 2013 Last updated at 12:31 BST The Market Hall Cinema in Brynmawr used to be run by the local council but when it announced its funding would stop last month, work began to find a way to keep it going. Thanks to the efforts of a group of local volunteers, the cinema has been saved and reopened under a new community initiative. The group, called \"Brynmawr Foundation\", raised enough funds to take over the lease of the building and purchase new equipment. They plan to show a mix of classic and new films, as well as host events and live performances. The Market Hall Cinema has been an important part of the town's history since it first opened in 1894, and this new initiative ensures that it will continue to be a valuable resource for the community.",
  "prompt_len": 254,
  "label": "gpt2"
}
```

`text` refers to an entire document.
`prompt_len` marks the boundary between the human-written and AI-generated sentences: the first `prompt_len` characters of the input text, i.e. `text[:prompt_len]`, are written by humans, while the rest are generated by a particular language model.
`label` is the label for each sentence; there are six labels in total: `gpt2`, `gptneo`, `gptj`, `llama`, `gpt3re`, and `human`.
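For example, `prompt_len` can be used to split a record into its human-written and AI-generated segments (the record below is a shortened, hypothetical example):

```python
# Hypothetical, shortened record in the SeqXGPT-Bench format.
record = {
    "text": "The cinema was saved by volunteers. It reopened last month with new equipment.",
    "prompt_len": 36,
    "label": "gpt2",
}

# text[:prompt_len] is the human-written prefix; the rest is model-generated.
human_part = record["text"][:record["prompt_len"]]
ai_part = record["text"][record["prompt_len"]:]

print(repr(human_part))  # 'The cinema was saved by volunteers. '
print(repr(ai_part))     # 'It reopened last month with new equipment.'
```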
Note: In the `human_lines.jsonl` file, the `prompt_len` field is not present by default, since those documents contain no AI-generated sentences for `prompt_len` to mark a boundary against.
To add that field to the `human_lines.jsonl` file, use the following script:

```
python3 -m dataset.add_prompt_len_to_human_data --input_file human_lines.jsonl --output_file output.jsonl
```

This step is crucial before feature extraction and, subsequently, training, as `prompt_len` is used by our data handler (`src/utilities/datahandler.py`) to process text segments.
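Conceptually, the script only needs to append the missing field. A minimal sketch is shown below, assuming the convention that an entirely human-written document gets `prompt_len = len(text)` (the function name and convention are illustrative, not necessarily the exact implementation):

```python
import json

def add_prompt_len(input_file, output_file):
    """Append prompt_len to every record of a human-only JSONL file."""
    with open(input_file) as fin, open(output_file, "w") as fout:
        for line in fin:
            record = json.loads(line)
            # All text is human-written, so the human/AI boundary is the end.
            record["prompt_len"] = len(record["text"])
            fout.write(json.dumps(record) + "\n")
```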
We extract the features for each input file with each of the three LLMs (GPT-2, GPT-J, GPT-Neo) using the `dataset/features_extractor_from_data.py` script.
Models used: GPT2-xl (1.5B), GPT-Neo (2.7B), GPT-J (6B)
Note: The original paper also uses LLaMA (7B) for feature extraction, but as its pretrained weights are not openly released by Meta, we weren't able to use it.
We used an A100 GPU, available with Google Colab Pro, for this part, as feature extraction with these LLMs has high VRAM requirements.
To extract the features from any input data file, use the following script:

```
# --model: [GPT2, GPTJ, GPTNeo]
python3 -m dataset.features_extractor_from_data --input_file input.jsonl --output_file output.jsonl --get_features --model model_name
```

You can store the features from each LLM (GPT2, GPTJ, GPTNeo) for each of the six input files. For example, for the file `gpt2_lines.jsonl` you get the files `gpt2_lines_gpt2_features.jsonl`, `gpt2_lines_gptj_features.jsonl`, and `gpt2_lines_gptneo_features.jsonl`, which you store under `dataset/features/gpt2_lines`.
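The extractor records one loss value per word (the `wordwise_loss_list` shown below). Conceptually, each word's value is derived from the per-token losses of the subword tokens that compose it. The sketch below illustrates one plausible alignment rule (mean over subword tokens) with made-up tokens and losses; the real script's rule may differ:

```python
def align_losses_to_words(tokens, token_losses, words):
    """Assign each word the mean loss of the subword tokens composing it."""
    word_losses, i = [], 0
    for word in words:
        piece, losses = "", []
        while i < len(tokens) and len(piece) < len(word):
            piece += tokens[i].lstrip("Ġ")  # strip GPT-2-style space marker
            losses.append(token_losses[i])
            i += 1
        word_losses.append(sum(losses) / len(losses))
    return word_losses

# "high salt" tokenized as ["high", "Ġsa", "lt"] with per-token losses:
print(align_losses_to_words(["high", "Ġsa", "lt"], [0.0, 6.0, 4.0], ["high", "salt"]))
# [0.0, 5.0]
```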
Each of these feature files stores its features in JSONL format, where each JSON line has the structure:

```json
{
  "wordwise_loss_list": [[[0.0, 6.484006881713867, 6.484006881713867, 10.804593086242676, 10.804593086242676, 6.741743087768555,.........]],
  "text": "high - salt has been shown to play a role in the pathogenesis of autoimmune disease . in this study , we investigated the effect of high - salt on the production of inflammatory mediators by arpe-19 cells and the possible mechanisms involved . arpe-19 cells were cultured with lps in dmem to which extra nacl had been added ( 20 mm and 40 mm ) . a) The addition of sodium chloride ( 1 mmol/ l ) did not influence their growth and (b) lps treated with salt ( 40 mm ) significantly (p<0.01) induced an IC 50 value of 11 nmol TNFa/mL. The addition of lps in dmem for the addition of the salt ( 40 mm ) resulted in a significant increase in the IC 50 in the range of 25 - 33 nmol TNFa/ml.\n\nThe mechanism by which the salt influences arpe-19 cells immunity is not known, however, the authors suggested that the increase in the production of TNFa may result from lysosome-remodeling effect of lps. In this study, arpe-19 cells have been treated for 24 hours with lps at different concentrations which resulted to significantly different results. lps-treated arpe-19 cells grew as compared to untreated cells in the range of 40.3 - 57.3% of the initial culture volume. This increase in proliferation indicates that lps acts via the activation of arpe-19 cells immune mechanism.\n\nLps has also been recently used for the anti-tumour effect in cancer cell line. in this study, we found that Lps can bind to EBOV-1 and -2 and promote the cell death",
  "prompt_len": 347,
  "label": "gpt2",
  "label_int": 0
}
```

Then we merge the `wordwise_loss_list`s of the individual feature files to get the complete features file for a particular input file (`gpt2_lines_features.jsonl` in this case) using the following script:
```
python3 -m dataset.combine_features --directory dataset/gpt2_lines --output_file gpt2_lines_features.jsonl
```

We then save the resulting features file for each input file in the directory `dataset/features`.
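The merge step can be pictured as zipping the per-model feature files record by record and stacking their loss lists. A rough sketch, assuming the files hold the same records in the same order and that each per-model record wraps its loss list in a singleton outer list (the function name is illustrative):

```python
import json

def combine_features(feature_files, output_file):
    """Merge per-model feature files into one record per document."""
    streams = [open(f) for f in feature_files]
    with open(output_file, "w") as out:
        for lines in zip(*streams):
            records = [json.loads(l) for l in lines]
            merged = dict(records[0])
            # One loss list per model, in the order the files were given.
            merged["wordwise_loss_list"] = [r["wordwise_loss_list"][0] for r in records]
            out.write(json.dumps(merged) + "\n")
    for s in streams:
        s.close()
```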
A sample JSON line in this features file, after combining, looks like:

```json
{
  "wordwise_loss_list": [[0.0, 6.484006881713867, 6.484006881713867, 10.804593086242676, .........], [7.503107070922852, 7.478128910064697, 4.111625671386719, 4.111625671386719,....], [0.1008826345205307, 0.1008826345205307, 3.0469260215759277, 3.046926259994507, 0.04809051379561424,....]],
  "text": "high - salt has been shown to play a role in the pathogenesis of autoimmune disease . in this study , we investigated the effect of high - salt on the production of inflammatory mediators by arpe-19 cells and the possible mechanisms involved . arpe-19 cells were cultured with lps in dmem to which extra nacl had been added ( 20 mm and 40 mm ) . a) The addition of sodium chloride ( 1 mmol/ l ) did not influence their growth and (b) lps treated with salt ( 40 mm ) significantly (p<0.01) induced an IC 50 value of 11 nmol TNFa/mL. The addition of lps in dmem for the addition of the salt ( 40 mm ) resulted in a significant increase in the IC 50 in the range of 25 - 33 nmol TNFa/ml.\n\nThe mechanism by which the salt influences arpe-19 cells immunity is not known, however, the authors suggested that the increase in the production of TNFa may result from lysosome-remodeling effect of lps. In this study, arpe-19 cells have been treated for 24 hours with lps at different concentrations which resulted to significantly different results. lps-treated arpe-19 cells grew as compared to untreated cells in the range of 40.3 - 57.3% of the initial culture volume. This increase in proliferation indicates that lps acts via the activation of arpe-19 cells immune mechanism.\n\nLps has also been recently used for the anti-tumour effect in cancer cell line. in this study, we found that Lps can bind to EBOV-1 and -2 and promote the cell death",
  "prompt_len": 347,
  "label": "gpt2",
  "label_int": 0
}
```

These feature files are used as the dataset for the SeqXGPT model.
As per the architecture discussed in the paper, the code for the model can be found in the `src/model.py` file.

```
# pass the number of training epochs, or it will default to 20
python3 -m driver --data_path dataset/features --train_path dataset/train/train.jsonl --test_path dataset/test/test.jsonl --split_dataset --num_train_epochs=100 --gpu=gpu_count
```

Other arguments, such as `train_ratio`, the learning rate, `warmup_ratio`, and `weight_decay`, can also be changed in the above script.
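The `--split_dataset` flag together with `train_ratio` controls how the features are divided into train and test sets. Conceptually it is a shuffled split like the sketch below (a simplification; the real driver handles files and seeding its own way):

```python
import random

def split_dataset(records, train_ratio=0.9, seed=0):
    """Shuffle and split records into train/test by train_ratio."""
    records = list(records)
    random.Random(seed).shuffle(records)
    cut = int(len(records) * train_ratio)
    return records[:cut], records[cut:]

train, test = split_dataset(range(100), train_ratio=0.9)
print(len(train), len(test))  # 90 10
```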
For evaluation, we provide the trained model file (`saved_model.pt`), which you can use to run the test scripts directly.
We also provide the file `dataset/test_data.jsonl` (generated and processed after the feature extraction step), which can be used with the script below. You can also generate your own test files through the process discussed above.
```
# Pass --document_level_eval as an arg for document-level evaluation; otherwise evaluation is sentence-level by default
python3 -m driver --test_path dataset/test/test_data.jsonl --gpu=gpu_count --inference
```

To generate results for particular-model binary AIGT detection, we need test data containing only one pair of labels at a time:
(gpt2, human), (gpt3, human), (gptneo, human), (gptj, human), (llama, human)
To filter the `test.jsonl` file generated during training, we use the following script:

```
# Pass the path to the test.jsonl file and the path of the output file, which will contain only the chosen pair of labels
# e.g. --labels gpt2 human, --labels gpt3 human, --labels gptneo human, --labels gptj human, --labels llama human
python3 -m dataset.filter_test_data --input_file dataset/test/test.jsonl --output_file dataset/test/output_test.jsonl --labels name_label_1 name_label_2
```

For evaluation, the file `evaluator.py` provides different implementations of the `calculate_metrics` method; choose the one matching the evaluation you want (multi-class vs. binary) by commenting/uncommenting them as per the comments added.
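The filtering itself is a simple pass over the JSONL file that keeps only the chosen pair of labels. A sketch (illustrative function name, assuming the `label` field shown earlier):

```python
import json

def filter_by_labels(input_file, output_file, labels):
    """Keep only records whose label is in the chosen pair, e.g. {"gpt2", "human"}."""
    keep = set(labels)
    with open(input_file) as fin, open(output_file, "w") as fout:
        for line in fin:
            if json.loads(line)["label"] in keep:
                fout.write(line)
```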
Note: All the values listed in our tables are F1 scores, with Macro-F1 scores used to compare overall performance.
The sentence-level results are as follows:
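For reference, Macro-F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of its frequency:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

print(round(macro_f1(["ai", "ai", "human", "human"],
                     ["ai", "human", "human", "human"]), 3))  # 0.733
```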
Mixed-Model multi-class AIGT Detection
| Method | GPT-2 | GPT-Neo | GPT-J | LLaMA | GPT-3 | Human | Macro-F1 |
|---|---|---|---|---|---|---|---|
| Sniffer | 51.5 | 45.5 | 36.0 | 23.1 | 54.1 | 58.1 | 44.7 |
| Sent-RoBERTa | 43.1 | 31.6 | 31.5 | 42.4 | 78.2 | 90.5 | 52.9 |
| Seq-RoBERTa | 55.5 | 36.7 | 32.0 | 78.6 | 94.4 | 92.3 | 64.9 |
| SeqXGPT | 98.5 | 98.7 | 97.2 | 93.2 | 93.9 | 92.9 | 95.7 |
| Our implementation | 97.6 | 97.5 | 92.3 | 79.3 | 90.4 | 92.0 | 91.6 |
Mixed-Model Binary AIGT Detection
| Method | AI | Human | Macro-F1 |
|---|---|---|---|
| Sniffer | 87.7 | 54.3 | 71.0 |
| Sent-RoBERTa | 97.6 | 92.6 | 95.1 |
| Seq-RoBERTa | 97.4 | 91.8 | 94.6 |
| SeqXGPT | 97.6 | 92.9 | 95.3 |
| Our implementation | 94.9 | 92.0 | 93.49 |
Particular-Model Binary AIGT Detection
| Method | GPT-2 | Human | Macro-F1 | Method | GPT-Neo | Human | Macro-F1 |
|---|---|---|---|---|---|---|---|
|  | 78.4 | 47.9 | 63.1 |  | 73.9 | 41.2 | 57.5 |
| DetectGPT | 65.8 | 42.9 | 54.3 | DetectGPT | 57.6 | 41.3 | 49.4 |
| Sent-RoBERTa | 92.9 | 75.8 | 84.4 | Sent-RoBERTa | 92.6 | 73.4 | 83.0 |
| SeqXGPT | 98.6 | 95.8 | 97.2 | SeqXGPT | 98.8 | 96.4 | 97.6 |
| Our implementation | 97.9* | 97.3 | 97.6 | Our implementation | 97.8 | 97.4 | 97.6 |
| Method | GPT-J | Human | Macro-F1 | Method | LLaMA | Human | Macro-F1 |
|---|---|---|---|---|---|---|---|
|  | 76.5 | 34.4 | 55.5 |  | 69.1 | 27.1 | 48.1 |
| DetectGPT | 66.8 | 37.0 | 51.9 | DetectGPT | 52.8 | 47.6 | 50.2 |
| Sent-RoBERTa | 93.1 | 71.8 | 82.4 | Sent-RoBERTa | 89.7 | 69.6 | 79.6 |
| SeqXGPT | 97.9 | 92.9 | 95.4 | SeqXGPT | 96.0 | 89.9 | 92.9 |
| Our implementation | 94.4 | 96.1 | 95.3 | Our implementation | 82.1 | 92.9 | 87.5 |
Mixed-Model multi-class AIGT Detection on an Out-of-Distribution (OOD) dataset
For the dataset, refer to `dataset/OOD`.
| Method | GPT-2 | GPT-Neo | GPT-J | LLaMA | GPT-3 | Human | Macro-F1 |
|---|---|---|---|---|---|---|---|
| Sniffer | 7.8 | 50.8 | 28.3 | 22.9 | 61.9 | 44.8 | 36.1 |
| Sent-RoBERTa | 32.5 | 18.2 | 24.8 | 30.8 | 74.6 | 30.5 | 35.2 |
| Seq-RoBERTa | 53.7 | 29.0 | 29.9 | 75.2 | 92.3 | 83.6 | 60.6 |
| SeqXGPT | 98.9 | 90.7 | 95.2 | 90.3 | 93.7 | 88.2 | 92.8 |
| Our implementation | 94.4 | 90.4 | 93.4 | 75.0 | 90.9 | 84.5 | 88.1 |
Document-level multi-class AIGT Detection
For the dataset, refer to `dataset/document-level detection dataset`.
| Method | GPT-2 | GPT-Neo | GPT-J | LLaMA | GPT-3 | Human | Macro-F1 |
|---|---|---|---|---|---|---|---|
| Sniffer | 85.4 | 84.5 | 74.6 | 13.0 | 81.3 | 66.2 | 67.5 |
| Sent-RoBERTa | 55.8 | 42.8 | 24.4 | 18.3 | 84.7 | 94.6 | 53.4 |
| Seq-RoBERTa | 63.4 | 41.0 | 32.6 | 67.0 | 91.9 | 51.2 | 57.9 |
| SeqXGPT | 99.5 | 99.5 | 98.0 | 93.3 | 90.4 | 84.8 | 94.2 |
| Our implementation | 98.47 | 97.8 | 92.3 | 79.1 | 90.1 | 88.9 | 91.1 |
To cite the original paper:

```
@misc{wang2023seqxgpt,
  title={SeqXGPT: Sentence-Level AI-Generated Text Detection},
  author={Pengyu Wang and Linyang Li and Ke Ren and Botian Jiang and Dong Zhang and Xipeng Qiu},
  year={2023},
  eprint={2310.08903},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
