Auto Distill

Auto Distill is an AI-powered agentic pipeline designed to generate high-quality, "Distill-style" interactive articles. Whether starting from a simple user query or an uploaded PDF, Auto Distill orchestrates a team of AI agents to research, plan, write, and visualize complex scientific and technical concepts.

DEMO VIDEO

LIVE DEMO

Features

Agentic Workflow: Powered by LangGraph, the system employs specialized agents:
- Know-It-All: Researches topics using Arxiv and Wikipedia.
- Planner: Creates a structured "Story Arc" for the article.
- Miner: Extracts specific data and insights for each chapter.
- Coder: Generates interactive visualizations using D3.js or Three.js. It references a local vector database of D3.js and Three.js documentation to ensure accurate code generation.
- Critic: Reviews and validates the generated code.
- Video Agent: Finds or generates relevant videos using MCP tools.
- Writer: Drafts engaging, educational content in HTML.
Interactive Visualizations: Automatically generates custom D3.js or Three.js visualizations to explain concepts.
PDF Ingestion: Upload your own research papers or documents. The system uses Docling with hybrid chunking to intelligently parse and index the content.
MCP Integration: Utilizes the Model Context Protocol (MCP) to connect with external tools like video generators.
Gradio UI: A user-friendly web interface to interact with the system and view results.

Installation

Clone the repository:

git clone <repository-url>
cd auto_distill

Install dependencies:

It is recommended to use a virtual environment.
```
pip install -r requirements.txt
```
Set up Environment Variables:

You need to set your Google Gemini API key. You can do this by setting an environment variable GEMINI_KEY.
```
# Linux/macOS
export GEMINI_KEY="your_gemini_api_key"

# Windows (PowerShell)
$env:GEMINI_KEY="your_gemini_api_key"
```

Usage

Run the application:
```
python app.py
```
Access the UI:

Open your web browser and navigate to the URL provided in the terminal (usually http://127.0.0.1:7860).
Generate Articles:
- Run from Query: Go to the "Run from Query" tab, enter a topic (e.g., "Graph Neural Networks"), and click "Run Agent".
- Run from PDF: Go to the "Run from PDF" tab, upload a PDF file, and click "Ingest + Generate".
View Results:

The generated HTML articles will be saved in the outputs/ directory and can be previewed directly in the "Browse Outputs" tab.

Project Structure

app.py: The main entry point for the application, containing the Gradio UI logic.
src/agent_pipeline.py: Defines the LangGraph workflow, agent nodes, and state management.
tools/: Contains custom tools and MCP client configurations.
- mcp_tools.py: Configuration for the Multi-Server MCP Client.
- custom_tools.py: Custom tools for search, vector DB queries, etc.
utils.py: Utility functions for file handling and vector store operations.
requirements.txt: List of Python dependencies.
outputs/: Directory where generated HTML files and assets are stored.
chroma_db_native/ & data/: Directories for local data storage and vector databases.

Technologies Used

LangChain & LangGraph: For building the agentic workflow.
Google Gemini: As the primary LLM for reasoning and content generation.
Gradio: For the web interface.
ChromaDB: For vector storage and retrieval.
Docling: For document parsing and ingestion.
Model Context Protocol (MCP): For extensible tool integration.

License

MIT License

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Auto Distill

Features

Installation

Usage

Project Structure

Technologies Used

License

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
chroma_db_native		chroma_db_native
outputs		outputs
src		src
tools		tools
.gitignore		.gitignore
README.md		README.md
app.py		app.py
auto_distill.gif		auto_distill.gif
requirements.txt		requirements.txt
utils.py		utils.py

ya0002/auto_distill

Folders and files

Latest commit

History

Repository files navigation

Auto Distill

Features

Installation

Usage

Project Structure

Technologies Used

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages