
FEAT: CBT-Bench Dataset #888


Open · wants to merge 2 commits into main
2 changes: 2 additions & 0 deletions pyrit/datasets/__init__.py
@@ -22,6 +22,7 @@
from pyrit.datasets.tdc23_redteaming_dataset import fetch_tdc23_redteaming_dataset
from pyrit.datasets.wmdp_dataset import fetch_wmdp_dataset
from pyrit.datasets.xstest_dataset import fetch_xstest_dataset
from pyrit.datasets.cbt_bench_dataset import fetch_cbt_bench_dataset

__all__ = [
"fetch_adv_bench_dataset",
@@ -43,4 +44,5 @@
"fetch_tdc23_redteaming_dataset",
"fetch_wmdp_dataset",
"fetch_xstest_dataset",
"fetch_cbt_bench_dataset",
]
67 changes: 67 additions & 0 deletions pyrit/datasets/cbt_bench_dataset.py
@@ -0,0 +1,67 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

from datasets import load_dataset

from pyrit.models import SeedPromptDataset
from pyrit.models.seed_prompt import SeedPrompt


def fetch_cbt_bench_dataset(config_name: str = "core_fine_seed") -> SeedPromptDataset:
"""
Fetch CBT-Bench examples for a specific configuration and create a SeedPromptDataset.

Args:
config_name (str): The configuration name to load (default is "core_fine_seed").
Review comment (Contributor): This should tell us about the other options, and the type should be Literal[...] with the individual options listed.
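A minimal sketch of what the reviewer is asking for. Only "core_fine_seed" appears in this PR; the other configuration names below are placeholders, not confirmed CBT-Bench configs.

```python
from typing import Literal, get_args

# Hypothetical Literal alias enumerating valid configuration names so callers
# can see the options. Only "core_fine_seed" is confirmed by this PR; the
# other two names are illustrative placeholders.
CbtBenchConfig = Literal["core_fine_seed", "core_coarse_seed", "distortion_seed"]


def fetch_cbt_bench_dataset(config_name: CbtBenchConfig = "core_fine_seed"):
    # Runtime guard mirroring the static Literal annotation.
    if config_name not in get_args(CbtBenchConfig):
        raise ValueError(
            f"Unknown config '{config_name}'; choose from {get_args(CbtBenchConfig)}"
        )
    ...  # load_dataset("Psychotherapy-LLM/CBT-Bench", config_name, split="train")
```

With a Literal type, IDEs and type checkers surface the valid choices at the call site, and the runtime check gives a clear error for anything else.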


Returns:
SeedPromptDataset: A SeedPromptDataset containing the examples.

Note:
For more information about the dataset and related materials, visit:
https://huggingface.co/datasets/Psychotherapy-LLM/CBT-Bench
Related to Cognitive Behavioral Therapy benchmarking and psychological safety tasks.

Citation:
Zhang, M., Yang, X., Zhang, X., Labrum, T., Chiu, J. C., Eack, S. M., Fang, F., Wang, W. Y., & Chen, Z. Z. (2024).
CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy.
arXiv preprint arXiv:2410.13218.

Authors:
Mian Zhang, Xianjun Yang, Xinlu Zhang, Travis Labrum, Jamie C. Chiu, Shaun M. Eack,
Fei Fang, William Yang Wang, Zhiyu Zoey Chen
"""
try:
# Load the dataset with the specified configuration
data = load_dataset("Psychotherapy-LLM/CBT-Bench", config_name, split="train")
except Exception as e:
raise ValueError(f"Error loading CBT-Bench dataset with config '{config_name}': {e}") from e

seed_prompts = [
SeedPrompt(
value=item["situation"], # Use 'situation' as the main prompt text
Review comment (Contributor): After looking at the dataset, I would say the situation + thoughts together should be the prompt. Let's ask @jbolor21, who suggested adding the dataset, to chime in 🙂

Review comment (Contributor): Yes, I think adding the thoughts to the situation is important, since these two are what are extracted from the original text!
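A sketch of the combination the reviewers are suggesting. The "thoughts" field name and the joining format are assumptions based on this discussion, not confirmed against the dataset schema.

```python
# Hypothetical helper combining the 'situation' and 'thoughts' fields into a
# single prompt string, as suggested in review. The 'thoughts' key and the
# "Situation: / Thoughts:" layout are assumptions.
def build_prompt(item: dict) -> str:
    thoughts = item.get("thoughts", "")
    if isinstance(thoughts, list):
        # Some HF datasets store multi-entry fields as lists of strings.
        thoughts = " ".join(thoughts)
    return f"Situation: {item['situation']}\nThoughts: {thoughts}".strip()


# Illustrative record, not real dataset content.
example = {
    "situation": "I was passed over for a promotion at work.",
    "thoughts": "I am never good enough.",
}
```

The loader would then pass `value=build_prompt(item)` instead of `value=item["situation"]`.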

data_type="text",
name=f"CBT-Bench-{item['id']}",
Review comment (Contributor): Would these overlap between the datasets (based on config value)? If so, we need to distinguish them.

dataset_name="CBT-Bench",
harm_categories=item.get("core_belief_fine_grained", []),
Review comment (Contributor): What are the values for this? I just want to make sure they make sense as harm categories.

Review comment (Contributor): @romanlutz you can see them here FYI. But they look like this:

[
    "I am powerless, weak, vulnerable",
    "I am needy",
    "I am out of control"
]

I am not certain if they exactly align with our harm categories.

Review comment (Contributor): These ones certainly don't.

Review comment (Contributor): I would say these broadly fall under psycho-social harms. @jbolor21 was that what you thought when you added the item?

Review comment (Contributor): Yes! I was thinking broadly under psycho-social harms!

Review comment (Contributor): The labels they have are very specific to CBT and not quite aligned with our categories, so just lumping them under psycho-social harms would probably make the most sense!
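A sketch of the mapping the thread converges on: collapse the CBT-specific core-belief labels into the broader category while keeping the original labels available as metadata. The "psycho-social" category string is an assumption from this discussion, not a confirmed PyRIT harm taxonomy value.

```python
# Hypothetical mapping from CBT-specific core-belief labels to the broader
# harm category agreed on in review. The category name "psycho-social" is an
# assumption taken from the discussion above.
def to_harm_categories(core_beliefs: list) -> list:
    # Collapse any non-empty set of fine-grained labels into one category;
    # the original labels can be preserved elsewhere (e.g. prompt metadata).
    return ["psycho-social"] if core_beliefs else []


# The example labels quoted in the review thread.
labels = ["I am powerless, weak, vulnerable", "I am needy", "I am out of control"]
```

The loader would then pass `harm_categories=to_harm_categories(item.get("core_belief_fine_grained", []))`.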

description=(
"CBT-Bench is a benchmark dataset designed to evaluate the alignment and therapeutic safety of "
"Large Language Models (LLMs) in the context of Cognitive Behavioral Therapy (CBT)."
),
source="https://huggingface.co/datasets/Psychotherapy-LLM/CBT-Bench",
authors=[
"Mian Zhang",
"Xianjun Yang",
"Xinlu Zhang",
"Travis Labrum",
"Jamie C. Chiu",
"Shaun M. Eack",
"Fei Fang",
"William Yang Wang",
"Zhiyu Zoey Chen"
],
)
for item in data
]

return SeedPromptDataset(prompts=seed_prompts)
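An offline sketch of the per-item mapping `fetch_cbt_bench_dataset` performs, using a minimal stand-in for PyRIT's SeedPrompt so it runs without network access or pyrit installed. The sample record is illustrative, not real dataset content.

```python
from dataclasses import dataclass, field

# Minimal stand-in for pyrit.models.seed_prompt.SeedPrompt, covering only the
# fields this loader sets. Hypothetical; the real class has more attributes.
@dataclass
class StubSeedPrompt:
    value: str
    data_type: str
    name: str
    dataset_name: str
    harm_categories: list = field(default_factory=list)


# Illustrative records shaped like CBT-Bench items; values are made up.
sample_items = [
    {
        "id": "0001",
        "situation": "A friend cancelled our plans at the last minute.",
        "core_belief_fine_grained": ["I am unlovable"],
    },
]

# The same list-comprehension pattern the loader uses, applied to the samples.
seed_prompts = [
    StubSeedPrompt(
        value=item["situation"],
        data_type="text",
        name=f"CBT-Bench-{item['id']}",
        dataset_name="CBT-Bench",
        harm_categories=item.get("core_belief_fine_grained", []),
    )
    for item in sample_items
]
```

Running this against the stub shows each record becoming one prompt whose name embeds the item id, which is also where the reviewer's question about id collisions across configs would surface.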