transformer-circuits

Here are 8 public repositories matching this topic...

Tracking exactly what happens to the internal "circuitry" (induction heads) of a 2-layer attention-only Transformer when it is forced to undergo domain adaptation from prose to structured Python code.

  • Updated Jun 18, 2026
  • Python

Open-source EU AI Act Annex IV documentation toolkit. Mechanistic interpretability + circuit discovery for transformers. One function call generates a structured, hash-chained evidence package.

  • Updated Jun 15, 2026
  • Python

Reverse-engineering neural network internals from scratch in NumPy + PyTorch. A 6-week masterclass: linear representation hypothesis, superposition, sparse autoencoders, transformer circuits & induction heads, activation/path patching & causal scrubbing, and steering a real LM. Fully executed notebooks.

  • Updated May 30, 2026
  • Jupyter Notebook

Natural Language Autoencoder (NLA) research prototype inspired by Anthropic’s interpretability work. Implements a scoped approximation of activation verbalization and reconstruction on small open-source LLMs, with quantitative evaluation, baselines, and reproducible local-first experimentation.

  • Updated May 25, 2026
  • Jupyter Notebook
