A comprehensive implementation of state-of-the-art methods for detecting hallucinations in Large Language Model outputs. This toolkit reproduces and extends detection methodologies from research and industry, providing multiple approaches to identify factual errors, contradictions, and unsupported claims in AI-generated content.
This toolkit provides robust detection methods that work with any LLM-generated content, regardless of the underlying system architecture.
Hallucination in LLMs refers to the generation of content that appears plausible but is factually incorrect, unsupported by the input context, or entirely fabricated. This toolkit addresses the critical need for reliable hallucination detection across diverse applications and use cases.
- Trust & Safety: Ensure AI systems provide reliable, grounded information
- Quality Assurance: Maintain high standards in AI-powered applications
- Risk Mitigation: Prevent propagation of misinformation
- User Experience: Build confidence in AI-generated content
- Research & Development: Enable systematic evaluation of model improvements
- Compliance: Meet regulatory requirements for AI transparency
| Method | Description | Use Case | Accuracy | Speed | Status |
|---|---|---|---|---|---|
| LLM Judge | Uses judge LLMs for sentence-level grounding evaluation | General purpose, content verification | High | Medium | Complete |
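The judge pattern above splits a response into sentences and asks a second LLM whether each one is supported by the context. A minimal sketch of that loop, assuming a hypothetical `call_judge` callable standing in for any chat-completion API (the prompt wording and the naive sentence splitter are illustrative, not the toolkit's actual implementation):

```python
from typing import Callable, Dict, List

# Illustrative judge prompt; the real toolkit's prompt may differ.
JUDGE_PROMPT = """You are a strict fact-checker.
Context:
{context}

Sentence:
{sentence}

Is the sentence fully supported by the context? Answer GROUNDED or HALLUCINATED."""


def split_sentences(text: str) -> List[str]:
    # Naive splitter on terminal punctuation; production code might use spaCy or nltk.
    normalized = text.replace("?", ".").replace("!", ".")
    return [s.strip() for s in normalized.split(".") if s.strip()]


def judge_response(context: str, response: str,
                   call_judge: Callable[[str], str]) -> List[Dict]:
    """Score each sentence of `response` against `context` with a judge LLM."""
    results = []
    for sentence in split_sentences(response):
        verdict = call_judge(JUDGE_PROMPT.format(context=context, sentence=sentence))
        results.append({
            "sentence": sentence,
            "grounded": verdict.strip().upper().startswith("GROUNDED"),
        })
    return results
```

Sentence-level scoring is what lets the method localize the hallucinated span instead of rejecting the whole response.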
Based on the AWS methodology and additional research:
| Method | Description | Source | Status | Expected Release |
|---|---|---|---|---|
| Embedding Similarity | Semantic similarity between context and response | AWS Blog | Planning | 2025 |
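The planned embedding-similarity method scores how semantically close a response is to its source context. A minimal sketch of the scoring step, assuming embeddings have already been produced by some encoder; the `0.7` threshold is an illustrative placeholder that would need tuning on labeled data:

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def flag_ungrounded(context_emb: Sequence[float],
                    response_emb: Sequence[float],
                    threshold: float = 0.7) -> Dict:
    """Flag a response whose embedding drifts far from the context embedding.

    `threshold` is illustrative only; in practice it is tuned on labeled data.
    """
    score = cosine_similarity(context_emb, response_emb)
    return {"similarity": score, "suspect": score < threshold}
```

Low similarity does not prove hallucination on its own, which is why this signal is typically combined with other detectors in an ensemble.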
Choose the detection method that best fits your use case:
```bash
cd llm_judge
pip install -r requirements.txt
python main.py --input your_data.jsonl --provider openai --model gpt-4o-mini
```

All methods in this toolkit use a standardized input format for consistency:
```json
{"id": "unique_id", "question": "original_question", "context": "source_context", "response": "llm_response"}
```

This enables easy comparison and ensemble approaches across different detection methods.
- Python 3.8+
- Virtual environment (recommended)
```bash
git clone https://github.com/thatechmaestro/hallucination-detector.git
cd hallucination-detector

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```
```bash
# Install a specific method
cd llm_judge
pip install -r requirements.txt
```

Building Trust in AI through Rigorous Hallucination Detection