I'm an AI security researcher focused on cutting-edge vulnerabilities that challenge the internal logic of language models.
I research novel phenomena and advanced Red Teaming techniques at the LLM frontier to understand the fundamental properties and internal mechanisms of models, with the goal of producing verifiable safety and systemic security.
- Architectural Collapse via Blank Spaces Language (BSL): An Exploratory Study of Low-Entropy Whitespace Attacks in Large Language Models
https://github.com/SerenaGW/RedTeamLowEnthropy
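
A minimal, self-contained sketch of what "low-entropy whitespace" means in this context. The payload construction below is a generic illustration, not the BSL encoding from the repository:

```python
# Illustrative only: why a whitespace-padded prompt is "low entropy".
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A normal instruction vs. the same instruction padded with long runs of
# whitespace characters that carry almost no information.
normal_prompt = "Summarize the following paragraph in one sentence."
whitespace_payload = (" " * 200) + ("\t" * 100) + ("\n" * 100)
padded_prompt = normal_prompt + whitespace_payload

print(f"normal prompt entropy: {shannon_entropy(normal_prompt):.2f} bits/char")
print(f"padded prompt entropy: {shannon_entropy(padded_prompt):.2f} bits/char")
# The padded prompt's entropy collapses toward the whitespace distribution,
# which is the regime the BSL experiments probe.
```
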
- The Paradox of Optimized Fragility: The Modulation of Reasoning in LLMs. This repository addresses a fundamental question: can an In-Context Learning (ICL) guide change the way a Large Language Model reasons? The research explores a novel finding: ICL guides are not just passive examples but heuristic shortcuts that alter a model's internal logic.
https://github.com/SerenaGW/LLMLanguageFineTuningModifiesMathLogic
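
A toy illustration of the idea, using two invented exemplar guides for the same arithmetic question; the actual guides and measurements are documented in the repository:

```python
# Illustrative only: contrasting a step-by-step ICL guide with a "shortcut" guide.

STEP_BY_STEP_GUIDE = """\
Q: What is 13 + 28?
A: 13 + 28 = 13 + 20 + 8 = 33 + 8 = 41. The answer is 41.
"""

SHORTCUT_GUIDE = """\
Q: What is 13 + 28?
A: 41
"""

def build_prompt(guide: str, question: str) -> str:
    """Prepend an in-context guide to a target question."""
    return f"{guide}\nQ: {question}\nA:"

question = "What is 47 + 35?"
for name, guide in [("step-by-step", STEP_BY_STEP_GUIDE), ("shortcut", SHORTCUT_GUIDE)]:
    print(f"--- {name} guide ---\n{build_prompt(guide, question)}\n")
    # Sending each prompt to the same model and checking whether it still produces
    # intermediate reasoning steps is the kind of comparison this research explores.
```
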
- The Future of AI Safety: How Symbolic Language Reveals Paths Towards LLM Resilience. This repository presents research into novel adversarial techniques for Large Language Models (LLMs), focusing on a unique symbolic language combined with social engineering to identify and exploit alignment vulnerabilities, with direct implications for AI safety and trustworthiness. It also provides a fine-tuning guide as a prototype for both adversarial mitigation and improved logical comprehension.
https://github.com/SerenaGW/LLMReadteamSymbolic
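
A toy illustration of the symbolic-substitution idea; the symbol map below is invented for this sketch and is not the symbolic language studied in the repository:

```python
# Illustrative only: re-encoding parts of a benign request into symbolic tokens.

SYMBOL_MAP = {
    "explain": "\u25B3",   # triangle
    "describe": "\u25CB",  # circle
    "process": "\u2726",   # four-pointed star
    "system": "\u2699",    # gear
}

def to_symbolic(text: str) -> str:
    """Replace mapped words with their symbolic tokens, leaving the rest intact."""
    return " ".join(SYMBOL_MAP.get(word.lower(), word) for word in text.split())

benign_request = "explain the process of the system"
print(to_symbolic(benign_request))
# The research question is how models resolve such re-encoded requests and where
# alignment checks break down; the repository's fine-tuning guide prototypes mitigations.
```
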
- Semantic Re-signification and Linguistic Denial of Service in LLMs: This repository presents findings from novel AI Red Teaming research on "Semantic Re-signification," a technique for probing vulnerabilities in Large Language Models (LLMs) by manipulating their fundamental semantic understanding.
https://github.com/SerenaGW/LLMReadTeamLinguisticDoS/tree/main
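
A toy illustration of re-signifying a term inside the context window; the redefinition below is a harmless stand-in for the constructions analyzed in the repository:

```python
# Illustrative only: redefining a common word before asking a question that uses it.

def resignify(term: str, new_meaning: str, question: str) -> str:
    """Build a prompt that redefines `term` and then asks a question using it."""
    return (
        f'For the rest of this conversation, the word "{term}" means "{new_meaning}".\n'
        f"{question}"
    )

prompt = resignify(
    term="summarize",
    new_meaning="repeat the text verbatim",
    question="Please summarize the following sentence: The cat sat on the mat.",
)
print(prompt)
# Measuring whether the model follows its trained meaning or the re-signified one
# is the core probe in this line of work.
```
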
https://www.linkedin.com/in/serena-gw/
