🧠 Starter templates for doing interpretability research
Updated Jul 16, 2023
🦠 DeepDecipher: An open source API to MLP neurons
Mechanistic interpretability tutorials, results, and a research log, written as I learn from publicly available research and my own experimentation. Evolving, open-ended work with slow updates; much of it is incomplete.
This Alignment Jam Hackathon project explores whether the "logit lens" technique applies to the encoder and decoder layers of Whisper, an end-to-end speech recognition model.