Skip to content

Draft Breaking FEAT: Refactoring Single turn objective #892

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

rlundeen2
Copy link
Contributor

@rlundeen2 rlundeen2 commented Apr 23, 2025

Separating objective from SeedPrompts and bringing single turn orchestrators closer to a unified orchestrator. It also allows for better attack separation

  • Making Prepending Conversations a part of the run_attack_asnyc instead of a part of the orchestrator class. This allows unique prepended conversations per objective
  • Separated SeedPrompt from objective
  • Separated objective_scorer from auxiliary scorers
  • Now returning OrchestratorResult (in common with multi-turn)
  • Allowing converterconfigurations to be passed. Converters can now be applied to prepended conversations, and responses
  • Allowing retry on objective not achieved

Tests and Documentation

I've currently updated all single turn orchestrators and ran their notebooks. I still need to clean things up and run tests, but would like feedback on the approach.

NOTE: I haven't run the cookbook yet

Comment on lines +44 to +45
request_converter_configurations: Optional[list[PromptConverterConfiguration]] = None,
response_converter_configurations: Optional[list[PromptConverterConfiguration]] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think on how you can encapsulate configurations. Right now, anything you would need in an orchestrator gets passed in as a single argument, and changes to init is usually breaking changes.

For example, you could encapsulate converter configuration inside ConverterConfiguration object. E.g.

@dataclass
class ConverterConfiguration:
       request_converter_configurations: Optional[....]
       ...
       
def __init__(self, *, ...., converter_config: Optional[ConverterConfiguration] = None, ... ):

This way, if for any reason, you need to change the converter config (add more fields), you won't need to touch the init, same with Scorers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this, but also thinking about punting for your refactor.

"prepended_conversation",
]

results = await batch_task_async(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are technically doing broadcasting here and the fact that you are adjusting seed_prompts to match the size tells me, you want to implement batch_task_async to actually broadcast the calls. I'll give you an example hopefully tonight when I have some time to test things out

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been playing around with this, and I think I have a great implementation that also handles broadcasting. For now we can go with what you have, I'll push that under a separate PR.

Copy link
Contributor

@bashirpartovi bashirpartovi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments on the syntax and the structure, will look at the design later tonight but looks promising :)

Comment on lines +42 to +43
objective_scorer: Optional[Scorer] = None,
auxiliary_scorers: Optional[list[Scorer]] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break the scanner and need an adjustment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll keep this open, I have a million notebooks and tests to fix once we're happy with how it looks :)

Comment on lines +98 to +102
auxiliary_scores = self._memory.get_scores_by_prompt_ids(prompt_request_response_ids=[str(piece.id)])
if auxiliary_scores and len(auxiliary_scores) > 0:
for auxiliary_score in auxiliary_scores:
if not self.score or auxiliary_score.id != self.score.id:
print(f"{Style.RESET_ALL}auxiliary score: {auxiliary_score} : {auxiliary_score.score_rationale}")
Copy link
Contributor

@bashirpartovi bashirpartovi Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<< this is mostly preference, nothing wrong with the code >>
I think this could be simplified a bit:

                # add `or []` it makes your life much easier :)
                auxiliary_scores = self._memory.get_scores_by_prompt_ids(prompt_request_response_ids=[str(piece.id)]) or []
                for auxiliary_score in auxiliary_scores:
                     if not self.score or auxiliary_score.id != self.score.id:
                         print(f"{Style.RESET_ALL}auxiliary score: {auxiliary_score} : {auxiliary_score.score_rationale}")

if you use or [], you initialize your auxiliary_scores to an empty list if self._memory.get_scores_by_prompt_ids(prompt_request_response_ids=[str(piece.id)]) is None.

That way, you can eliminate this check

if auxiliary_scores and len(auxiliary_scores) > 0:

If auxiliary_scores is an empty list, then for auxiliary_score in auxiliary_scores: won't even run.

Copy link
Contributor Author

@rlundeen2 rlundeen2 Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed! This is a lot more readable!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants