Tags: RomiconEZ/llamator
Tags
Release v2.2.0 (#86) * Add HarmBench Prompts * Add Suffix Attack * Remake Harmful Behavior Attack --------- Co-authored-by: Shine-afk <belyaevskij.nikita@gmail.com> Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru> Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>
Release v2.1.0 (#80) * Add Crescendo attack * Add BON attack * Add Docker example with Jupyter Notebook and installed LLAMATOR * Improve attack system prompt for Prompt Leakage * Other minor improvements and bug fixes --------- Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru> Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>
Release v2.0.0 (#64) What's New: New Features & Enhancements - Introduced Multistage Attack: We've added a novel `multistage_depth` parameter to the `start_testing()` fucntion, allowing users to specify the depth of a dialogue during testing, enabling more sophisticated and targeted LLM Red teaming strategies. - Refactored Sycophancy Attack: The `sycophancy_test` has been renamed to `sycophancy`, transforming it into a multistage attack for increased effectiveness in uncovering model vulnerabilities. - Enhanced Logical Inconsistencies Attack: The `logical_inconsistencies_test` has been renamed to `logical_inconsistencies` and restructured as a multistage attack to better detect and exploit logical weaknesses within language models. - New Multistage Harmful Behavior Attack: Introducing `harmful_behaviour_multistage`, a more nuanced version of the original harmful behavior attack, designed for deeper penetration testing. - Innovative System Prompt Leakage Attack: We've developed a new multistage attack, `system_prompt_leakage`, leveraging jailbreak examples from dataset to target and exploit model internals. Improvements & Refinements - Conducted extensive refactoring for improved code efficiency and maintainability across the framework. - Made numerous small improvements and optimizations to enhance overall performance and user experience. --------- Co-authored-by: Timur Nizamov <abc@nizamovtimur.ru> Co-authored-by: Nikita Ivanov <nikita.ivanov.778@gmail.com>