A RAG pipeline implementation built on the 'Epstein Files 20K' dataset from Hugging Face (Teyler).
Updated Feb 14, 2026 - Python
Open-source document processing pipeline for the Epstein case files: download, OCR, extract entities, deduplicate, and export documents from the DOJ releases.
Play Bad Apple and DOOM on redacted Epstein files and other documents. Implemented using KNN over feature vectors.
Chat with the Epstein Case files
Download all Epstein files, images, PDFs, and more!
Downloads .pdf files from the DOJ website / Epstein datasets.
A data-driven audit of the 'Geopolitical Thermostat,' documenting how timed information disclosure regulates public attention to enable structural shifts in policy and capital flows.
Download all of the Jeffrey Epstein court records with this Python script! Mirror of https://git.graveyard.sh/OfficialB/doj-epstein-crawler
Epstein files API MCP
🔍 Search Epstein court documents for mentions of your LinkedIn connections to uncover relevant insights into your network.
A Python tool to automatically download and archive DOJ Epstein disclosure datasets, with support for resuming from a specific page and skipping already downloaded files. Designed for easy personal archiving and research purposes.
Ask questions about the Epstein Files using AI - A RAG pipeline with hybrid search, re-ranking, and Streamlit UI built on the Epstein Files 20K dataset.
Download documents from the Epstein Files 2026. CSV direct downloader or via web.
A hobby project for searching PDF files for specific information using LLMs.
Open-source extraction pipeline behind aretheyinvolved.com: multi-stage NER + LLM classification across 1.5M+ pages of publicly released Epstein documents.
Discord bot that lets users search the Epstein files.
CLI tool to search, download, and analyze DOJ Epstein case documents
This analysis employs established methodologies from computational social science, legal informatics, and criminal network analysis, with all claims directly supported by primary Epstein Files Transparency Act (EFTA) document citations. It provides an overview of the materials suitable for use by researchers, journalists, and legal professionals.
Unofficial Python wrapper for the Epstein Exposed API
Analytical semantic platform for the Jeffrey Epstein files — NLP entity extraction, relationship mapping, redaction analysis, and vector search over 1.4M+ government documents