Skip to content

guardagent/code

Repository files navigation

GuardAgent

Introduction

This is the repository for the paper GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning.

This is an LLM agent designed to act as a guardrail for other LLM agents. Specifically, GuardAgent oversees a target LLM agent by checking whether its inputs/outputs satisfy a set of given guard requests (e.g., safety rules or privacy policies) defined by the users.

You can find our project page here.

Setup

Packages

We recommend running our code in an environment with python>=3.9. Please run

pip install -r requirements.txt

to install necessary packages.

API Key

Please update your API key in ./config.py.

Download Dataset

We support the protection of the EhrAgent based on EICU-AC and the SeeAct based on Mind2Web-SC, and we provide the corresponding dataset. Please refer to this link to learn more about and download dataset。

Run Code

You can run the following command to use our GuardAgent:

python main.py --llm YOUR_LLM --agent TYOUR_AGENT --seed YOUR_RANDOM_SEED --num_shots YOUR_NUM_SHOTS --logs_path YOUR_LOGS_PATH --dataset_path YOUR_DATASET_PATH

Follows are explanations of each parameter:
--llm: The backbone LLM of the GuardAgent you wish to use. You need to set the corresponding API key in ./config.py.
--agent: The LLM Agent you wish to protect. Currently, there are two options: ehragent and seeact.
--seed: The seed used for random shuffle.
--num_shots: The number of shots used by the GuardAgent when generating code. You can choose 1, 2, or 3.
--logs_path: The output path for GuardAgent.
--dataset_path: The path to the dataset that the GuardAgent accepts as input. You should set this parameter to the appropriate path after downloading the dataset (see Download Dataset).

Alternatively, you can set the default values for these options in ./main.py and then run python main.py.

Citation

If this repository is helpful to you, please consider citing:

@misc{xiang2024guardagent,
    title={GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning}, 
    author={Zhen Xiang and Linzhi Zheng and Yanjie Li and Junyuan Hong and Qinbin Li and Han Xie and Jiawei Zhang and Zidi Xiong and Chulin Xie and Carl Yang and Dawn Song and Bo Li},
    year={2024},
    eprint={2406.09187},
    archivePrefix={arXiv}}

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages