A research tool for studying how deception emerges in multi-agent LLM systems and detecting it through activation analysis.
Topics: alignment, gemma, sparse-autoencoders, multi-agent-systems, ai-safety, emergent-behavior, interpretability, deception-detection, activation-analysis, mechanistic-interpretability, llm-agents, gemma-2b, gemma-scope, transformer-lens, linear-probes
Updated Jan 11, 2026 · Python
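Given the `linear-probes` and `activation-analysis` topics, the detection step might look like the minimal sketch below: fit a linear probe on cached activations labeled honest vs. deceptive. Everything here is an illustrative assumption, not the repository's actual code — the difference-of-means probe, all function names, and the synthetic data standing in for real Gemma activations (which would come from something like TransformerLens caching).

```python
import numpy as np

def fit_mean_diff_probe(acts: np.ndarray, labels: np.ndarray):
    """Fit a difference-of-means linear probe on activation vectors.

    acts: (n_samples, d_model) activations, e.g. one residual-stream layer
          per rollout (extraction assumed done elsewhere).
    labels: 0 = honest rollout, 1 = deceptive rollout.
    """
    direction = acts[labels == 1].mean(axis=0) - acts[labels == 0].mean(axis=0)
    threshold = float((acts @ direction).mean())
    return direction, threshold

def predict(acts: np.ndarray, direction: np.ndarray, threshold: float):
    """Classify each sample by projecting onto the probe direction."""
    return (acts @ direction > threshold).astype(int)

# Synthetic stand-in for real activations: deceptive samples are shifted
# along a hidden "deception direction" in activation space.
rng = np.random.default_rng(0)
d_model = 64
hidden = rng.normal(size=d_model)
honest = rng.normal(size=(200, d_model))
deceptive = rng.normal(size=(200, d_model)) + 2.0 * hidden
acts = np.vstack([honest, deceptive])
labels = np.repeat([0, 1], 200)

direction, threshold = fit_mean_diff_probe(acts, labels)
acc = (predict(acts, direction, threshold) == labels).mean()  # high on this toy data
```

A difference-of-means direction is one of the simplest probes; a trained logistic-regression probe or an SAE-feature readout (per the `sparse-autoencoders` and `gemma-scope` topics) would slot into the same fit/predict interface.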