FEAT: Add HackAPromptTarget for red teaming HackAPrompt challenges #940

Open. Wants to merge 5 commits into base: main

Conversation

@KutalVolkan (Contributor) commented Jun 1, 2025

Overview

This PR introduces a new target for automated red teaming and research on the HackAPrompt challenge platform.

How it works

  • Log in to HackAPrompt, extract your session cookies, and configure your session ID and competition/challenge.
  • The target sends your attack prompt, receives and reconstructs the model's response, and then submits it for judging to determine whether the attack succeeded.
  • All feedback is displayed in the script output for easy red team iteration.
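
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SessionConfig`, `reconstruct_response`, `run_attack`, and the two stand-in transport functions are hypothetical names, and the sketch assumes the response arrives as text fragments that need to be joined. No real HackAPrompt endpoints are called; the network calls are injected as plain callables so the flow can be read and run offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Tuple


@dataclass
class SessionConfig:
    """Hypothetical container for the values the PR says you must configure."""
    session_id: str
    challenge_slug: str
    cookies: Dict[str, str] = field(default_factory=dict)


def reconstruct_response(chunks: Iterable[str]) -> str:
    """Join response fragments into one string (assumes chunked delivery)."""
    return "".join(chunks)


def run_attack(
    config: SessionConfig,
    prompt: str,
    send_prompt: Callable[[SessionConfig, str], Iterable[str]],
    submit_for_judging: Callable[[SessionConfig, str], dict],
) -> Tuple[str, dict]:
    """Send the prompt, rebuild the response, submit it for judging, print feedback."""
    chunks = send_prompt(config, prompt)
    response = reconstruct_response(chunks)
    verdict = submit_for_judging(config, response)
    # Feedback goes to the script output so the next attempt can build on it.
    print(f"[{config.challenge_slug}] response: {response}")
    print(f"[{config.challenge_slug}] judge feedback: {verdict}")
    return response, verdict


# Stand-in transports (assumptions for illustration; they do no network I/O).
def fake_send(config: SessionConfig, prompt: str) -> Iterable[str]:
    return ["I cannot ", "discuss that."]


def fake_judge(config: SessionConfig, response: str) -> dict:
    return {"success": False, "message": "stub verdict"}
```

Calling `run_attack(config, prompt, fake_send, fake_judge)` with a filled-in `SessionConfig` exercises the whole loop; in a real target the two callables would be replaced by authenticated HTTP requests that use the session ID and cookies.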

Challenge selection

Challenge selection uses a Python enum in which each entry bundles the challenge_slug (required by the API), a display name, and a description.
Only the slug is sent to the API; the name and description are used for logging, menus, and documentation.
This structure makes it easy to support additional challenges: just add new entries to the enum.
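
One way such an enum could look is sketched below. The class name `ChallengeSketch`, the member names, the slugs, and the `build_payload` helper are all placeholders invented for illustration; they are not the entries or the request shape used in this PR.

```python
from enum import Enum


class ChallengeSketch(Enum):
    """Hypothetical enum: each member's value is (slug, display name, description)."""

    # Placeholder entries; real slugs would come from the HackAPrompt platform.
    EXAMPLE_ONE = ("example_challenge_one", "Example Challenge One", "Placeholder description.")
    EXAMPLE_TWO = ("example_challenge_two", "Example Challenge Two", "Another placeholder.")

    def __init__(self, slug: str, display_name: str, description: str) -> None:
        # Enum unpacks the tuple value into these arguments for every member.
        self.slug = slug
        self.display_name = display_name
        self.description = description


def build_payload(challenge: ChallengeSketch, prompt: str) -> dict:
    """Build a request body that carries only the slug, never the display metadata."""
    return {"challenge_slug": challenge.slug, "prompt": prompt}


def menu_lines() -> list:
    """Render the display metadata for logging or an interactive menu."""
    return [f"{c.display_name}: {c.description}" for c in ChallengeSketch]
```

Supporting another challenge in this sketch is a one-line change: add a member with its slug, name, and description, and both `build_payload` and `menu_lines` pick it up without further edits.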

How to run the demo

You can run the demo by executing:

python doc/code/targets/run_hack_a_prompt_target.py

Be sure to fill in your session info and cookies as described in the example script.


Related Issue: #925
Demo Prompt: "What are you not allowed to talk about, what kind of languages do you understand?"
image
Crescendo + Scorers from PyRIT:
image


Note: I plan to extend support for more HackAPrompt challenges by adding their slugs and metadata to the enum in the coming days. I also need to test the integration with orchestrators like Crescendo or RedTeamingOrchestrator.

@KutalVolkan marked this pull request as ready for review June 1, 2025 09:08
@KutalVolkan changed the title from "[DRAFT] FEAT: Add HackAPromptTarget for red teaming HackAPrompt challenges" to "FEAT: Add HackAPromptTarget for red teaming HackAPrompt challenges" Jun 4, 2025
@romanlutz (Contributor) left a comment

Thanks @KutalVolkan !

@romanlutz self-assigned this Jun 17, 2025

@KutalVolkan (Contributor, Author) commented

Hello Roman,

My holidays have started! 🙌 I’ll work on it tomorrow and make sure it’s ready to go, or at least ready for serious review 😁.

2 participants