
Add Colab notebooks for poison detection and style ablation#215

Open
davidoj wants to merge 2 commits into main from colab-notebooks-v2

davidoj commented Mar 27, 2026

Summary

Test plan

  • Verify CI passes
  • Confirm Colab badge links work after merge

🤖 Generated with Claude Code

davidoj and others added 2 commits March 27, 2026 01:35
Two interactive notebooks demonstrating bergson's capabilities:

- poison_detection.ipynb: Injects fictional poison documents into Pile
  training data, fine-tunes Pythia-160M, and uses multi-probe attribution
  to trace the false fact back to poison sources
- style_ablation.ipynb: Demonstrates style vs semantic attribution with
  preconditioner strategies and PCA ablation on Qwen3-0.6B

Both notebooks run on Colab Free (T4, 15GB VRAM).
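
For reviewers unfamiliar with the idea, here is a rough sketch of what "multi-probe attribution" could look like in its simplest form. This is not bergson's API and not the notebook's code. It assumes each training document and each probe query has already been reduced to a gradient vector, and it uses average cosine similarity as the score. The names `multi_probe_scores` and `top_k`, and the toy vectors, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def multi_probe_scores(train_grads, probe_grads):
    """Average each training document's similarity over all probe gradients."""
    return [
        sum(cosine(g, p) for p in probe_grads) / len(probe_grads)
        for g in train_grads
    ]


def top_k(scores, k):
    """Indices of the k highest-scoring training documents."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Toy gradients (made up): documents 1 and 2 play the role of poison documents.
train_grads = [
    [1.0, 0.0, 0.0],  # clean
    [0.0, 1.0, 0.1],  # poison-like
    [0.0, 0.9, 0.2],  # poison-like
    [0.0, 0.0, 1.0],  # clean
]
# Two probes that ask about the same false fact in different ways.
probe_grads = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.3]]

scores = multi_probe_scores(train_grads, probe_grads)
suspects = top_k(scores, 2)
print(sorted(suspects))
```

Averaging over several probes is one plausible way to reduce sensitivity to the wording of any single query; the notebook may aggregate differently.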

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Notebooks naturally have longer lines that aren't worth wrapping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>