Saev is a framework for training sparse autoencoders (SAEs) on vision transformers (ViTs) in PyTorch.
It is the framework used for our preprint "Sparse Autoencoders for Scientifically Rigorous Interpretation of Vision Models".
See the docs for an overview.
You can ask questions about this repo using the llms.txt file.
Example (macOS):
curl https://osu-nlp-group.github.io/saev/api/llms.txt | pbcopy
Then paste the clipboard contents into Claude or any LLM interface of your choice.
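
`pbcopy` only exists on macOS. On other systems, something like the following Python sketch may work. It assumes `xclip` is installed on Linux and that `clip` is available on Windows; the helper names `clipboard_command` and `copy_llms_txt` are hypothetical and are not part of saev.

```python
import platform
import subprocess
import urllib.request

LLMS_TXT_URL = "https://osu-nlp-group.github.io/saev/api/llms.txt"


def clipboard_command(system: str) -> list[str]:
    """Map a platform.system() value to a clipboard command (assumed tools)."""
    commands = {
        "Darwin": ["pbcopy"],
        "Linux": ["xclip", "-selection", "clipboard"],  # assumes xclip is installed
        "Windows": ["clip"],
    }
    if system not in commands:
        raise ValueError(f"no known clipboard command for {system!r}")
    return commands[system]


def copy_llms_txt(url: str = LLMS_TXT_URL) -> int:
    """Download llms.txt, pipe it to the clipboard, and return the byte count."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    # Feed the raw bytes to the clipboard tool on stdin, like `curl ... | pbcopy`.
    subprocess.run(clipboard_command(platform.system()), input=data, check=True)
    return len(data)
```

Calling `copy_llms_txt()` from a Python shell should leave the file contents on your clipboard, ready to paste. On Windows, `clip` may not interpret UTF-8 input correctly, so treat that branch as a starting point.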