xunguangwang

Xunguang xunguangwang

Trustworthy

24 followers · 13 following

HKUST
China
https://sites.google.com/view/xunguangwang/

Stars

openai / gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 18,333 1,729 Updated Sep 11, 2025

SheltonLiu-N / AutoDAN

[ICLR 2024] The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models".

Python 378 50 Updated Jan 22, 2025

QData / TextAttack

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/

Python 3,265 433 Updated Jul 10, 2025

lapisrocks / rpo

Official repository for "Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks"

Python 56 7 Updated Aug 8, 2024

eliotjones1 / robogcg

Official GitHub repository for the paper "Adversarial Attacks on Robotic Vision Language Action Models"

Python 13 3 Updated May 28, 2025

Yu-Fangxu / COLD-Attack

[ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability

Python 166 21 Updated Dec 18, 2024

clash-verge-rev / clash-verge-rev

A modern GUI client based on Tauri, designed to run in Windows, macOS and Linux for tailored proxy experience

TypeScript 74,138 5,572 Updated Sep 11, 2025

zhi-xuan-chen / Dia-LLaMA

This is the official GitHub repository of the paper "Dia-LLaMA: Towards Large Language Model-driven CT Report Generation"

Python 9 Updated Jun 29, 2025

zhi-xuan-chen / Reg2RG

This is the official repository for the IEEE TMI paper titled "Large Language Model with Region-Guided Referring and Grounding for CT Report Generation".

Python 42 3 Updated Jun 28, 2025

sleeepeer / PoisonedRAG

[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models

Python 186 23 Updated Feb 23, 2025

LLM-DRA / DRA

[USENIX Security'24] Official repository of "Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction"

Python 104 13 Updated Oct 11, 2024

Wang-Yanting / TracLLM

Python 10 2 Updated Aug 9, 2025

liu00222 / Open-Prompt-Injection

This repository provides a benchmark for prompt injection attacks and defenses.

Python 281 40 Updated Jul 16, 2025

dapurv5 / awesome-red-teaming-llms

Repository accompanying the paper https://openreview.net/pdf?id=sSAp8ITBpC

30 3 Updated Aug 21, 2025

TapXWorld / ChinaTextbook

PDF textbooks for all primary, middle, and high school grades, and for university.

Roff 49,222 11,068 Updated May 18, 2025

xunguangwang / SoK4JailbreakGuardrails

SoK: Evaluating Jailbreak Guardrails for Large Language Models

Python 14 2 Updated Jul 7, 2025

Beijing-AISI / panda-guard

Panda Guard is designed for researching jailbreak attacks, defenses, and evaluation algorithms for large language models (LLMs).

Python 46 6 Updated Aug 25, 2025

RobustNLP / CipherChat

A framework to evaluate the generalization capability of safety alignment for LLMs

Python 614 69 Updated Dec 31, 2024

Azure / PyRIT

The Python Risk Identification Tool for generative AI (PyRIT) is an open source framework built to empower security professionals and engineers to proactively identify risks in generative AI systems.

Python 2,873 560 Updated Sep 10, 2025

AI45Lab / ActorAttack

Python 102 8 Updated Feb 3, 2025

CherryHQ / cherry-studio

🍒 Cherry Studio is a desktop client that supports multiple LLM providers.

TypeScript 32,930 2,953 Updated Sep 12, 2025

salman-lui / x-teaming

Python 35 3 Updated May 21, 2025

wdndev / llm_interview_note

Notes on knowledge and interview questions for large language model (LLM) algorithm (application) engineers.

HTML 9,696 1,018 Updated Apr 30, 2025

billryan / resume

An elegant \LaTeX\ résumé template. Mainland China mirror: https://gods.coding.net/p/resume/git

TeX 10,355 2,753 Updated Mar 15, 2024

meta-llama / llama-cookbook

Welcome to the Llama Cookbook! This is your go-to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end-to-end problems using Llama mode…

Jupyter Notebook 17,824 2,601 Updated Sep 10, 2025

IBM / Adversarial-Prompt-Evaluation

Code Implementation of Adversarial Prompt Evaluation paper

Python 12 1 Updated May 7, 2025

bboylyg / BackdoorLLM

BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models

Python 211 20 Updated Aug 16, 2025

xingjunm / Awesome-Large-Model-Safety

Safety at Scale: A Comprehensive Survey of Large Model Safety

188 4 Updated Feb 19, 2025

SaFoLab-WISC / AutoDAN-Turbo

[ICLR 2025 Spotlight] The official implementation of our ICLR2025 paper "AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs".

Python 300 46 Updated Apr 15, 2025

aengusl / latent-adversarial-training

Jupyter Notebook 43 12 Updated Sep 29, 2024
