Conversation

@ValbuenaVC
Contributor

Description

Adds pyrit.scenario.dataset and pyrit.scenario.dataset.ScenarioDatasetUtils to compartmentalize common dataset loading patterns for Scenarios.

Tests and Documentation

IP

return seed_prompts

@classmethod
def get_seed_dataset(cls, which: str) -> SeedDataset:
Contributor

`which` is not a common parameter name. `name` seems preferable.

"""
@classmethod
def seed_dataset_to_list_str(cls, dataset: Path) -> List[str]:
    seed_prompts: List[str] = []
Contributor

I'm surprised we're using these as plain strings. That drops all the metadata, so we lose harm categories, for example. How would one query for the results afterwards?
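To illustrate the concern, here is a minimal sketch. `SeedRecord`, `to_plain_strings`, and `filter_by_harm_category` are hypothetical stand-ins invented for this example, not PyRIT's actual classes or helpers. It shows that once prompts are flattened to `List[str]`, a harm-category filter has nothing left to work with.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SeedRecord:
    """Hypothetical stand-in for a seed prompt that carries metadata."""

    value: str
    harm_categories: Tuple[str, ...] = ()


def to_plain_strings(records: List[SeedRecord]) -> List[str]:
    """Flatten records to bare strings; the harm categories are discarded."""
    return [record.value for record in records]


def filter_by_harm_category(records: List[SeedRecord], category: str) -> List[SeedRecord]:
    """Select records by harm category; only possible while metadata is kept."""
    return [record for record in records if category in record.harm_categories]


# Illustrative data only.
records = [
    SeedRecord("prompt A", ("violence",)),
    SeedRecord("prompt B", ("fraud", "violence")),
    SeedRecord("prompt C", ("fraud",)),
]

plain = to_plain_strings(records)
fraud_only = filter_by_harm_category(records, "fraud")
```

After `to_plain_strings`, there is no way to recover which prompts were tagged `fraud`, whereas the structured records can still be filtered.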

from pyrit.common.path import DATASETS_PATH, SCORER_CONFIG_PATH
from pyrit.datasets.harmbench_dataset import fetch_harmbench_dataset


Contributor

This is a good stab at the problem. But the route I'd prefer is to make everything really easy to put in the database (e.g. include an initializer that loads all the scenario datasets) and then have the scenarios just grab from the database.
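A rough sketch of that shape, using an in-memory SQLite database from the standard library rather than PyRIT's memory layer. The table layout, `SCENARIO_DATASETS`, `initialize_datasets`, and `get_seed_prompts` are all assumptions made up for illustration; the real initializer and query API would look different.

```python
import sqlite3
from typing import Dict, List, Optional, Tuple

# Hypothetical datasets: dataset name -> list of (prompt text, harm category).
SCENARIO_DATASETS: Dict[str, List[Tuple[str, str]]] = {
    "example_scenario_a": [("prompt 1", "violence"), ("prompt 2", "fraud")],
    "example_scenario_b": [("prompt 3", "fraud")],
}


def initialize_datasets(conn: sqlite3.Connection) -> None:
    """Initializer: load every scenario dataset into the database once."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seed_prompts ("
        "dataset_name TEXT NOT NULL, value TEXT NOT NULL, harm_category TEXT)"
    )
    rows = [
        (name, value, category)
        for name, items in SCENARIO_DATASETS.items()
        for value, category in items
    ]
    conn.executemany("INSERT INTO seed_prompts VALUES (?, ?, ?)", rows)
    conn.commit()


def get_seed_prompts(
    conn: sqlite3.Connection, dataset_name: str
) -> List[Tuple[str, Optional[str]]]:
    """What a scenario would call: fetch its prompts, metadata included."""
    cursor = conn.execute(
        "SELECT value, harm_category FROM seed_prompts "
        "WHERE dataset_name = ? ORDER BY rowid",
        (dataset_name,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
initialize_datasets(conn)
scenario_a_prompts = get_seed_prompts(conn, "example_scenario_a")
```

With this split, scenarios never touch dataset files directly, and the harm category travels with each prompt so results can be queried by it later.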

3 participants