Sentry

An end-to-end GPU framework for authenticating machine learning artifacts.

Overview
Projects
- Model Signing
- SLSA for ML
Status
Contributing

Overview

This work is described in an accepted paper to be published soon. Stay tuned for more.
Before setting up, please refer to this link to ensure that your driver, compiler, and runtime API versions are compatible.

Setup

Docker

Install the following prerequisites:
- Docker
- Docker Compose
- NVIDIA Container Toolkit

mkdir -p ./signatures

Native run

python3 -m venv .venv
source .venv/bin/activate

export TORCH_HOME=./torch

pip install -r requirements.txt

openssl ecparam -name prime256v1 -genkey -noout -out private.pem
openssl ec -in private.pem -pubout -out public.pem
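As a quick sanity check on the generated key files, the PEM labels can be inspected from Python. This is a minimal sketch using only the standard library; the helper names `pem_labels` and `check_key_file` are hypothetical, and the labels in the usage comment ("EC PRIVATE KEY", "PUBLIC KEY") are what these openssl commands typically emit, which may vary between OpenSSL versions.

```python
import re
from pathlib import Path

# Matches the "-----BEGIN <LABEL>-----" line that opens a PEM block.
_PEM_BEGIN = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----\s*$", re.MULTILINE)


def pem_labels(text: str) -> list[str]:
    """Return the labels of all PEM blocks found in text, in order."""
    return _PEM_BEGIN.findall(text)


def check_key_file(path: str, expected_label: str) -> bool:
    """Return True if the file contains a PEM block with the expected label."""
    return expected_label in pem_labels(Path(path).read_text(encoding="ascii"))


# Usage, assuming the two openssl commands above have been run:
#   check_key_file("private.pem", "EC PRIVATE KEY")
#   check_key_file("public.pem", "PUBLIC KEY")
```

The check only confirms that each file holds the kind of PEM block expected; it does not validate the key material itself.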

export CUFILE_ENV_PATH_JSON=cufile.json

nvcc -Xcompiler '-fPIC' -o ./RapidEC/gsv.so -shared ./RapidEC/gsv.cu

mkdir -p ./signatures

Huggingface login

Some ML models on Hugging Face require the user to be logged in; otherwise, downloads may fail with timeout errors.
Create a Hugging Face account and follow the guide to obtain your access token.
Then, save the access token in a file named 'hf_access_token'.
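For reference, a script could load the token from that file as sketched below. The helper `read_hf_token` is hypothetical and is not taken from the Sentry code base; how the agents actually consume the file is not shown here.

```python
from pathlib import Path


def read_hf_token(path: str = "hf_access_token") -> str:
    """Read the access token from the given file, dropping surrounding whitespace."""
    token = Path(path).read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"{path} is empty; paste your access token into it")
    return token


# Usage (assumption: huggingface_hub is installed):
#   from huggingface_hub import login
#   login(token=read_hf_token())
```

Stripping whitespace guards against the trailing newline that most editors add when the token is pasted into the file.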

Run

Docker

docker compose up --build sentry_dataset
docker compose up --build sentry_trainer
docker compose up --build sentry_inferencer

Native run

python agent_dataset.py uoft-cs/cifar10 16 1 dataset/cifar10
python agent_trainer.py --sig_out ./signatures --model_path ./torch private-key --private_key private.pem
python agent_inferencer.py --sig_out ./signatures --model_path ./torch private-key --private_key private.pem

Example

Before Sentry:

import torchvision.models as models
from torch.utils.data import DataLoader

model = models.vgg19(weights=models.VGG19_Weights.DEFAULT)
# testing_data is assumed to be a dataset that yields (image, label) pairs
dataloader = DataLoader(testing_data, batch_size=128, shuffle=True)

for x, y in dataloader:
    pred = model(x)

After Sentry:

import torchvision.models as models
from common import get_image_dataloader
import sentry

model = models.vgg19(weights=models.VGG19_Weights.DEFAULT)
# get Sentry's custom DALI-based dataloader which supports GPUDirect and dataset hashing
dataloader, hasher = get_image_dataloader(
    path='./dataset/cifar10', batch=128, device='gpu', gds=True,
)

# verify model
sentry.verify_model(model)

for data in dataloader:
    x, y = data[0]['data'], data[0]['label']
    pred = model(x)

# verify dataset
sentry.verify_dataset(hasher.compute())
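In the example above, the dataset digest is only computed after the loop, which suggests that the hasher accumulates state while batches are read. The following CPU-only sketch illustrates that pattern with hashlib. `StreamingHasher`, its `update` method, and `hashed_batches` are hypothetical stand-ins; the hasher returned by `get_image_dataloader` runs on the GPU and may work differently.

```python
import hashlib
from typing import Iterable, Iterator


class StreamingHasher:
    """Hypothetical stand-in for the hasher returned by get_image_dataloader."""

    def __init__(self) -> None:
        self._state = hashlib.sha256()

    def update(self, batch: bytes) -> None:
        # Fold one batch into the running digest.
        self._state.update(batch)

    def compute(self) -> str:
        # Finalize and return the digest of everything seen so far.
        return self._state.hexdigest()


def hashed_batches(batches: Iterable[bytes], hasher: StreamingHasher) -> Iterator[bytes]:
    """Yield each batch unchanged while feeding it to the hasher."""
    for batch in batches:
        hasher.update(batch)
        yield batch


hasher = StreamingHasher()
for batch in hashed_batches([b"batch-0", b"batch-1"], hasher):
    pass  # inference on the batch would happen here
digest = hasher.compute()
```

Because hashing happens as a side effect of iteration, the dataset is read once and the digest is ready as soon as the loop ends.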

Configuration

While the default setting uses a Merkle Tree with SHA256 to hash models, the signer may configure the hashing protocol with a combination of the settings below.

Settings	Supported Options	Restrictions
Topology	Merkle, Lattice	Lattice must use BLAKE2XB
HashAlgo	SHA256, Blake2B, SHA3, BLAKE2XB
Workflow	Coalesced, Layered, Inplace
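The table can be read as a small set of validation rules, and the default topology as a standard Merkle tree. The sketch below shows both on the CPU with hashlib. The function names, the chunk size, and the choice to duplicate the last node on odd levels are assumptions for illustration; Sentry's GPU implementation may chunk and combine hashes differently.

```python
import hashlib

# Option names are copied from the settings table; the functions are illustrative.
SUPPORTED = {
    "Topology": {"Merkle", "Lattice"},
    "HashAlgo": {"SHA256", "Blake2B", "SHA3", "BLAKE2XB"},
    "Workflow": {"Coalesced", "Layered", "Inplace"},
}


def validate_settings(topology: str, hash_algo: str, workflow: str) -> None:
    """Raise ValueError if the combination violates the table above."""
    chosen = (("Topology", topology), ("HashAlgo", hash_algo), ("Workflow", workflow))
    for name, value in chosen:
        if value not in SUPPORTED[name]:
            raise ValueError(f"unsupported {name}: {value!r}")
    if topology == "Lattice" and hash_algo != "BLAKE2XB":
        raise ValueError("Lattice must use BLAKE2XB")


def merkle_root(data: bytes, chunk_size: int = 1 << 20) -> bytes:
    """Compute a SHA256 Merkle root over fixed-size chunks of data."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    level = [hashlib.sha256(chunk).digest() for chunk in chunks]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

The leaf hashes are independent of one another, which is what makes a tree topology a natural fit for parallel hashing on a GPU.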

Evaluation

This project demonstrates how to protect the integrity of a model by signing it with Sigstore, a tool for making code signatures transparent without requiring management of cryptographic key material.

When users download a given version of a signed model, they can check that the signature comes from a known or trusted identity and thus that the model has not been tampered with after training.

Sentry hashes large models quickly; for example, the 8.6G gpt2-xl model is hashed in under 300 ms, as the following table shows:

Model	Size	Hash Time
microsoft/resnet-152	270M	5.5 ms
google-bert/bert-base-uncased	538M	11.4 ms
pytorch/vision/vgg19	1.1G	29.1 ms
openai-community/gpt2	1.1G	32.3 ms
openai-community/gpt2-xl	8.6G	296.6 ms
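To compare rows, the hash times can be converted to throughput. The sketch below does this under the assumption that "M" and "G" denote decimal megabytes and gigabytes, which the table does not state.

```python
# Sizes and hash times are copied from the table above.
# Assumption: "M" means 10**6 bytes and "G" means 10**9 bytes.
ROWS = [
    ("microsoft/resnet-152", 270e6, 5.5),
    ("google-bert/bert-base-uncased", 538e6, 11.4),
    ("pytorch/vision/vgg19", 1.1e9, 29.1),
    ("openai-community/gpt2", 1.1e9, 32.3),
    ("openai-community/gpt2-xl", 8.6e9, 296.6),
]


def throughput_gb_per_s(size_bytes: float, time_ms: float) -> float:
    """Return hashing throughput in GB/s (10**9 bytes per second)."""
    return (size_bytes / 1e9) / (time_ms / 1e3)


for name, size_bytes, time_ms in ROWS:
    print(f"{name:32s} {throughput_gb_per_s(size_bytes, time_ms):6.1f} GB/s")
```

Under that assumption, throughput ranges from roughly 49 GB/s for the smallest model to roughly 29 GB/s for the largest.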

Contributions

Adding a new CUDA kernel for hashing

The relevant code lives in two places:
- sentry/model_signing/cuda
- sentry/model_signing/hashing/topology.py


Andrew-Gan/Sentry
