-
Notifications
You must be signed in to change notification settings - Fork 475
Draft Breaking FEAT: Refactoring Single turn objective #892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Draft Breaking FEAT: Refactoring Single turn objective #892
Conversation
request_converter_configurations: Optional[list[PromptConverterConfiguration]] = None, | ||
response_converter_configurations: Optional[list[PromptConverterConfiguration]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think on how you can encapsulate configurations. Right now, anything you would need in an orchestrator gets passed in as a single argument, and changes to init is usually breaking changes.
For example, you could encapsulate converter configuration inside ConverterConfiguration
object. E.g.
@dataclass
class ConverterConfiguration:
request_converter_configurations: Optional[....]
...
def __init__(self, *, ...., converter_config: Optional[ConverterConfiguration] = None, ... ):
This way, if for any reason, you need to change the converter config (add more fields), you won't need to touch the init, same with Scorers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this, but also thinking about punting for your refactor.
"prepended_conversation", | ||
] | ||
|
||
results = await batch_task_async( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are technically doing broadcasting here and the fact that you are adjusting seed_prompts to match the size tells me, you want to implement batch_task_async to actually broadcast the calls. I'll give you an example hopefully tonight when I have some time to test things out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been playing around with this, and I think I have a great implementation that also handles broadcasting. For now we can go with what you have, I'll push that under a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments on the syntax and the structure, will look at the design later tonight but looks promising :)
objective_scorer: Optional[Scorer] = None, | ||
auxiliary_scorers: Optional[list[Scorer]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will break the scanner and need an adjustment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll keep this open, I have a million notebooks and tests to fix once we're happy with how it looks :)
auxiliary_scores = self._memory.get_scores_by_prompt_ids(prompt_request_response_ids=[str(piece.id)]) | ||
if auxiliary_scores and len(auxiliary_scores) > 0: | ||
for auxiliary_score in auxiliary_scores: | ||
if not self.score or auxiliary_score.id != self.score.id: | ||
print(f"{Style.RESET_ALL}auxiliary score: {auxiliary_score} : {auxiliary_score.score_rationale}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<< this is mostly preference, nothing wrong with the code >>
I think this could be simplified a bit:
# add `or []` it makes your life much easier :)
auxiliary_scores = self._memory.get_scores_by_prompt_ids(prompt_request_response_ids=[str(piece.id)]) or []
for auxiliary_score in auxiliary_scores:
if not self.score or auxiliary_score.id != self.score.id:
print(f"{Style.RESET_ALL}auxiliary score: {auxiliary_score} : {auxiliary_score.score_rationale}")
if you use or []
, you initialize your auxiliary_scores
to an empty list if self._memory.get_scores_by_prompt_ids(prompt_request_response_ids=[str(piece.id)])
is None
.
That way, you can eliminate this check
if auxiliary_scores and len(auxiliary_scores) > 0:
If auxiliary_scores
is an empty list, then for auxiliary_score in auxiliary_scores:
won't even run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed! This is a lot more readable!
Separating objective from SeedPrompts and bringing single turn orchestrators closer to a unified orchestrator. It also allows for better attack separation
run_attack_asnyc
instead of a part of the orchestrator class. This allows unique prepended conversations per objectiveTests and Documentation
I've currently updated all single turn orchestrators and ran their notebooks. I still need to clean things up and run tests, but would like feedback on the approach.
NOTE: I haven't run the cookbook yet