Responsible AI Toolbox is a suite of user interfaces and libraries for exploring and assessing models and data, enabling a better understanding of AI systems. These tools help developers and stakeholders of AI systems develop and monitor AI more responsibly and take better data-driven actions.
This repository represents the transition from behavioral safety to Neural Forensics. It provides the infrastructure to detect, audit, and mitigate high-order AI risks—such as Latent Deception, Sycophancy-Masking, and Synthetic Intimacy—directly at the mechanistic activation layer.
An experiment in monitoring what LLMs are thinking. The current implementation reads activation-level thoughts via Anthropic's Natural Language Autoencoder release.
An interpretive lens over a collective alignment signal. Claude-powered Next.js app that synthesizes a written Constitution from a corpus of voted visions, identifies tensions, audits proposed model behaviors, and answers researcher queries. Companion to beacn.space.
CORE/Aether — epistemic governance for AI agents. Belief substrate with trust math, contradiction detection, and a measured belief/speech gap. The model is the mouth; the substrate is the self.
Background tracker for the dakshjain-1616 repo portfolio. Monitors stars, forks, traffic, and README freshness across all repos, then surfaces a dashboard of which repositories need attention. Backed by the GitHub REST API and a SQLite store.