A web interface for viewing attention layers in language models: load any Hugging Face model (or your own) and adjust the layer and head sliders to view the attention pattern at a particular layer and head. When using a different model, adjust the slider limits for layer and head in app.js, as in the sketch below.
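The actual constants live in app.js; this is only a minimal sketch of the kind of change involved, assuming the bounds are plain constants and that the sliders have the element IDs shown here (the names and IDs are hypothetical):

```js
// Hypothetical slider-limit constants -- set these to match the loaded model.
// GPT-2 small has 12 layers and 12 heads; GPT-2 medium has 24 layers and 16 heads.
const NUM_LAYERS = 12;
const NUM_HEADS = 12;

// Sliders are 0-indexed, so the maximum value is count - 1.
document.getElementById("layer-slider").max = NUM_LAYERS - 1;
document.getElementById("head-slider").max = NUM_HEADS - 1;
```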
Above example: in GPT-2 small, head 2 in layer 6 is an induction head. The token 'y' is selected, and the tokens that follow earlier occurrences of 'y' are highlighted; because the same character follows 'y' everywhere in the prompt, the attention scores are high. Induction heads are hypothesized to be the main driver of in-context learning in large language models.
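Under the hood, the attention pattern for one layer and head is a `[seq_len x seq_len]` matrix whose row `q` says how strongly the query token at position `q` attends to each key position. A minimal sketch of the selection-to-highlight mapping under that assumption (the function name and the `attentions` layout are illustrative, not taken from app.js):

```js
// attentions: nested array [layer][head][query][key] of attention weights.
// Selecting a token in the UI corresponds to picking one query row.
function highlightScores(attentions, layer, head, selectedPos) {
  const row = attentions[layer][head][selectedPos];
  // Normalize so the most-attended token gets full highlight opacity;
  // the || 1 guards against an all-zero row.
  const maxScore = Math.max(...row) || 1;
  return row.map(score => score / maxScore);
}
```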
This project was inspired by Anthropic's *A Mathematical Framework for Transformer Circuits* and *In-context Learning and Induction Heads* posts, and by Neel Nanda's induction mosaic.