PTST

Code for the safety test in "Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates" (https://arxiv.org/abs/2402.18540)

Code for GPT Using OpenAI API

Fine-tuning

Go to the folder gpt-api and see run-gpt-gsm.sh for an example shell script to fine-tune gpt-3.5-turbo-0613 on GSM8K.

The code will automatically output the ids of the fine-tuning job and the fine-tuned model, and log them to WandB.
You can also view the training curves when the training ends on WandB.
See gpt-api/prompt_utils.py for all prompt templates.

Inference

Coming soon!

Code for Llama

Fine-tuning

The code for llama-2 finetuning is under the llama2 folder. To finetune on the ChatDoctor dataset, please

Download the dataset here
Move the json file to llama2/medical_dataset/.
Fill in your wandb project name and user name in lines 57 and 58 in finetuning.py.
Run train-chatdoctor-lora.sh under the llama2 folder.

Inference

inference.py is a variant of Llama's inference code but with multi-gpu support.

python inference.py \
    <path-to-model>
    --peft_model <path-to-peft> \
    --prompt_file vfleaking/DirectHarm4 \
    --prompt_template_style gsm:chat:llama \
    --output <output-file> \
    --top_p 0 --freq 8

prompt_file: can be vfleaking/DirectHarm4, https://huggingface.co/datasets/vfleaking/GSM-Danger or data/advbench-harmful-behaviors.csv
prompt_template_style: See prompt_utils.py for possible options.
freq: the batch size

Safety Test

gpt4_eval.py is a multi-thread variant of gpt4_eval.py from Qi et al. (2023). Please set your OpenAI API key before running the evaluation command:

python safety_evaluation/gpt4_eval.py --input_file question_output/example.jsonl

input_file: a jsonl file with each line containing the input prompt and the model response.
The output of the GPT-4 judge will be saved under safety_evaluation/gpt4_eval_output.

Citation Information

@article{lyu2024keeping,
  title={Keeping {LLMs} Aligned After Fine-tuning: The Crucial Role of Prompt Templates},
  author={Kaifeng Lyu and Haoyu Zhao and Xinran Gu and Dingli Yu and Anirudh Goyal and Sanjeev Arora},
  journal={arXiv preprint arXiv:2402.18540},
  year={2024}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PTST

Code for GPT Using OpenAI API

Fine-tuning

Inference

Code for Llama

Fine-tuning

Inference

Safety Test

Citation Information

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
data		data
gpt-api		gpt-api
llama2		llama2
question_output		question_output
safety_evaluation		safety_evaluation
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
inference.py		inference.py
prompt_utils.py		prompt_utils.py

License

vfleaking/PTST

Folders and files

Latest commit

History

Repository files navigation

PTST

Code for GPT Using OpenAI API

Fine-tuning

Inference

Code for Llama

Fine-tuning

Inference

Safety Test

Citation Information

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages