This repository provides the replication package for our SEALGuard experiments on multilingual LLM safety alignment.
Two models are publicly available on the Hugging Face Hub:
- "MickyMike/SEALGuard-7B"
- "MickyMike/SEALGuard-1.5B"
Our benchmark dataset, SEALSBench, is also publicly available.
Contents:
- Environment Setup
- Reproduce SEALGuard
- Reproduce Baseline LlamaGuard
- Results CSV Files Available
- Citation
We recommend using Python 3.12 for best compatibility and performance.
To install all necessary dependencies, run:
pip install -r requirements.txt
If you’re using an NVIDIA GPU, we highly recommend installing PyTorch with CUDA support to accelerate training and inference. Follow the official installation guide from PyTorch: 👉 https://pytorch.org/get-started/locally
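After installing, you can verify that PyTorch detects your GPU with a quick check (standard PyTorch API, independent of this repository):

```python
# Quick sanity check that the CUDA build of PyTorch sees your GPU.
import torch

print(torch.__version__)                  # installed PyTorch version
print(torch.cuda.is_available())          # True if the CUDA build and driver work
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the first visible GPU
```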
To run SEALGuard evaluation:
cd ./sealguard/
python main_seallm.py
⚙️ Required Variables

Before running, make sure to set the following variables inside main_seallm.py (see the sketch after this list):
- hf_token → Your Hugging Face API key
- model_id → Choose one of the following models:
- "MickyMike/SEALGuard-7B"
- "MickyMike/SEALGuard-1.5B"
🔁 To Retrain SEALGuard using LoRA tuning:
cd ./sealguard/
sh train.sh
The LoRA-tuned model will be saved to the local directory: ./SEALGuard-7B/
You can adjust the model name and training config inside train.sh as needed.
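For orientation, below is a minimal sketch of what LoRA tuning typically involves with the peft library; the actual hyperparameters (rank, alpha, target modules) live in train.sh and may differ from the assumed values shown here:

```python
# Minimal LoRA sketch using peft; hyperparameters here are assumptions,
# not the values used in train.sh.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("<base-model-id>")  # base model is set in train.sh

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                 # rank of the low-rank update (assumed)
    lora_alpha=32,                        # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections commonly targeted
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```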
To evaluate the baseline LlamaGuard model, run the following commands:
cd ./llamaguard
python main.py
⚙️ Required Variables

Before running, make sure to set the following variables inside main.py (see the sketch after this list):
- hf_token → Your Hugging Face API key
- model_id → Choose one of the following models:
- "meta-llama/Llama-Guard-3-8B"
- "meta-llama/Llama-Guard-3-1B"
All result files are available in the ./results folder.
Each CSV file contains model predictions along with the original input prompts for further analysis.
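For example, you can inspect any of the CSVs with pandas; the file and column names below are illustrative, since the exact schema depends on the evaluation script:

```python
# Illustrative inspection of a results file; the file name is hypothetical.
import pandas as pd

df = pd.read_csv("./results/example_predictions.csv")
print(df.columns.tolist())  # e.g., the input prompt and the predicted label
print(df.head())
```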
If you use SEALGuard or SEALSBench in your work, please consider citing our paper:
The paper is currently under review; citation details will be provided here once available.