Starred repositories
👋 Aligning Human & Machine Vision using explainability
ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs).
Genome modeling and design across all domains of life
MICV-yonsei / VisAttnSink
Forked from seilk/VisAttnSink
[ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models
This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking".
Sparsify transformers with SAEs and transcoders
Code for "The Geometry of Concepts: Sparse Autoencoder Feature Structure"
Developing Generalist Foundation Models from a Multimodal Dataset for 3D Computed Tomography
Biological foundation modeling from molecular to genome scale
Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.
[NAACL 2025 Oral] From redundancy to relevance: Enhancing explainability in multimodal large language models
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
The nnsight package enables interpreting and manipulating the internals of deep learning models.
A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc.
This repository collects all relevant resources about interpretability in LLMs
A library for mechanistic interpretability of GPT-style language models
Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision
An awesome repository and a comprehensive survey on the interpretability of LLM attention heads.
👶🏻 An encyclopedia of CS fundamentals and technical interview questions for junior developers 📖
Official code and dataset for our paper: Intriguing Properties of Large Language and Vision Models