Date: 2025-09-03
This project is a local Insights Assistant. You ingest PDF/CSV files into a vector database (Chroma) via a FastAPI server. A client (CLI or Streamlit) searches those embeddings and then calls an LLM summarizer (Ollama or OpenAI) to produce a concise, cited answer grounded in retrieved snippets.
- Server (mcp_server.py): exposes /tools/ingest_pdf, /tools/ingest_csv, /tools/search_docs. Uses LangChain loaders → splitter → embeddings → Chroma (persistent in ./db).
- Client library (mcp_host.py): tiny HTTP wrapper with retries/backoff and friendly exceptions (see the sketch after this list).
- Clients:
  - CLI (client_app.py): ingest-* and ask.
  - Streamlit (streamlit_app.py): tabs for Ingest and Ask + a preview panel.
- Summarizer (summarizer.py): calls Ollama (local) or OpenAI to turn the top-k snippets into a clean, cited answer. (Required in this setup.)
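For orientation, here is a minimal sketch of what a host wrapper like mcp_host.py could look like. The function name, error class, and retry policy are illustrative assumptions, not the actual module contents:

```python
# Illustrative sketch of an HTTP wrapper in the spirit of mcp_host.py (names are assumptions).
import time
import requests

SERVER_URL = "http://127.0.0.1:8799"

class MCPHostError(RuntimeError):
    """Friendly error raised once all retries are exhausted."""

def call_tool(tool: str, payload: dict, retries: int = 3, backoff: float = 1.0) -> dict:
    """POST to /tools/<tool> with simple exponential backoff between attempts."""
    last_exc = None
    for attempt in range(retries):
        try:
            resp = requests.post(f"{SERVER_URL}/tools/{tool}", json=payload, timeout=60)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as exc:
            last_exc = exc
            time.sleep(backoff * (2 ** attempt))  # wait 1s, 2s, 4s, ...
    raise MCPHostError(f"Cannot reach server for tool '{tool}': {last_exc}")
```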
ASCII = r"""
+------+          +--------------------+      +-------------------------+
| User |--uses--->|    Streamlit UI    |      |        CLI Client       |
+------+          | (streamlit_app.py) |      |     (client_app.py)     |
                  +----------+---------+      +-----------+-------------+
                              \                          /
                               \   via MCP Host (HTTP client)
                                v                        v
                  +--------+-------------------------------+
                  |         MCP Host (mcp_host.py)         |
                  |  Retries • Backoff • Friendly errors   |
                  +-------------------+--------------------+
                                      |
                                      |  POST /tools/*
                                      v
                         +------------+-------------+
                         |      FastAPI Server      |
                         |      (mcp_server.py)     |
                         | ingest_pdf / ingest_csv  |
                         |        search_docs       |
                         +------------+-------------+
                                      |
      +-------------------------------+------------------------------+
      |             LangChain RAG pipeline (server-side)             |
      |      Loaders -> Splitter -> Embeddings -> Chroma (./db)      |
      +-----------+---------------+---------------+------------------+
                  |               |               |
                  | docs          | chunks        | vectors
                  v               v               v
              [files]         [chunks]     [persisted index]

REQUIRED (client-side):
+---------------------------------------------------------------+
|            summarizer.py  →  Ollama/OpenAI (REST)              |
|   Produces the ONLY answer shown to users (cited, grounded).   |
+---------------------------------------------------------------+
""".strip("\n")
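The server-side path (Loaders -> Splitter -> Embeddings -> Chroma) can be sketched with standard LangChain community APIs as below. This is not the actual mcp_server.py; the chunk sizes and request payload shape are assumptions:

```python
# Sketch of the ingest path only; names, chunk sizes, and payload shape are assumptions.
import os
from fastapi import FastAPI, HTTPException
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

app = FastAPI()
DB_DIR = os.getenv("DB_DIR", "./db")
EMBED_MODEL = os.getenv("EMBED_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
embeddings = HuggingFaceEmbeddings(model_name=EMBED_MODEL)

@app.post("/tools/ingest_pdf")
def ingest_pdf(payload: dict) -> dict:
    path = payload.get("path", "")
    if not os.path.exists(path):
        raise HTTPException(status_code=400, detail=f"File not found: {path}")
    docs = PyPDFLoader(path).load()                          # Loader: one Document per page
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
    chunks = splitter.split_documents(docs)                  # Splitter
    store = Chroma(persist_directory=DB_DIR, embedding_function=embeddings)
    store.add_documents(chunks)                              # Embeddings -> Chroma (./db)
    # Older chromadb/langchain combinations may also need store.persist() here.
    return {"chunks": len(chunks)}
```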
- Python: 3.11+ (3.12 works; on 3.13, remove any from __future__ lines or place them at the very top of the file)
- Pip: fastapi, uvicorn[standard], langchain, langchain-community, chromadb, pypdf, sentence-transformers, python-dotenv, requests, streamlit, pandas
- External: Ollama (ollama serve; ollama pull llama3.1:8b)
- Ports: Server 8799 • Streamlit 8501 • Ollama 11434
DB_DIR=./db
EMBED_MODEL=sentence-transformers/all-MiniLM-L6-v2
SUMMARIZER_PROVIDER=ollama
OLLAMA_URL=http://127.0.0.1:11434
OLLAMA_MODEL=llama3.1:8b
If using OpenAI instead:
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o-mini
Load in Python near the top:
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())
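With these variables loaded, the Ollama branch of a summarizer can be sketched as follows. This illustrates the Ollama /api/generate REST call only; it is not the actual summarizer.py, and the prompt wording is an assumption:

```python
# Hedged sketch of an Ollama-backed summarizer (not the actual summarizer.py).
import os
import requests

OLLAMA_URL = os.getenv("OLLAMA_URL", "http://127.0.0.1:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3.1:8b")

def summarize(question: str, snippets: list[str]) -> str:
    """Turn top-k retrieved snippets into a concise, cited answer."""
    context = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    prompt = (
        "Answer the question using ONLY the numbered snippets below and cite them like [1], [2].\n\n"
        f"Snippets:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": OLLAMA_MODEL, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]  # Ollama returns the full completion under "response"
```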
## Run Order
1) Start Ollama
ollama serve
ollama pull llama3.1:8b
2) Start the FastAPI server (in your .venv)
.\.venv\Scripts\Activate.ps1
python -m uvicorn mcp_server:app --host 127.0.0.1 --port 8799 --reload
3) Start a client
- Streamlit UI:
.\.venv\Scripts\Activate.ps1
python -m streamlit run streamlit_app.py # http://localhost:8501
- CLI:
.\.venv\Scripts\Activate.ps1
python client_app.py ingest-pdf .\data\doc.pdf
python client_app.py ask "Paste a phrase from your PDF"
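Under the hood, ask roughly chains a search call against the server with the summarizer. A sketch of that flow, reusing the helper names from the sketches above (the search_docs payload and response shapes are also assumptions):

```python
# Rough shape of the "ask" flow (illustrative; not the actual client_app.py).
from mcp_host import call_tool      # assumed helper name, see the host sketch above
from summarizer import summarize    # assumed signature: summarize(question, snippets)

def ask(question: str, k: int = 4) -> str:
    hits = call_tool("search_docs", {"query": question, "k": k})   # payload shape assumed
    snippets = [h["text"] for h in hits.get("results", [])]        # response shape assumed
    return summarize(question, snippets)

if __name__ == "__main__":
    print(ask("Paste a phrase from your PDF"))
```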
## Sanity Checks
- where python; python -V
- curl http://127.0.0.1:8799/
- curl http://127.0.0.1:11434/api/tags
- python client_app.py ingest-pdf .\data\doc.pdf
- python client_app.py ask "your phrase"
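If curl is unavailable, the same endpoint checks can be scripted in Python:

```python
# Mirrors the curl checks above: server root and Ollama tag list.
import requests

def check(name: str, url: str) -> None:
    try:
        r = requests.get(url, timeout=5)
        print(f"[OK]   {name}: HTTP {r.status_code}")
    except requests.RequestException as exc:
        print(f"[FAIL] {name}: {exc}")

check("FastAPI server", "http://127.0.0.1:8799/")
check("Ollama", "http://127.0.0.1:11434/api/tags")
```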
## Common Issues (and fixes)
Cannot reach server → start uvicorn; test /; port 8799; firewall/port conflict (netstat -ano | findstr :8799)
Wrong Python (global vs .venv) → activate .venv or run .\.venv\Scripts\python -m ...
from __future__ SyntaxError → remove or move to file top (3.11+ doesn’t need it)
Few/zero chunks → the PDF is short or scanned; pre-OCR the file or use an OCR-capable loader, and/or reduce the chunk size to 500/100 (see the splitter sketch below)
LLM summary missing (required) → ollama serve, ollama pull llama3.1:8b, set OLLAMA_* envs
File not found (400) → pass a correct absolute/relative path
Proxy interferes with localhost → NO_PROXY=127.0.0.1,localhost
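For the few/zero-chunks fix, the 500/100 values refer to the text splitter's chunk size and overlap. Assuming the server uses LangChain's RecursiveCharacterTextSplitter, the change looks like this:

```python
# Smaller chunks help short or sparse PDFs (assumes RecursiveCharacterTextSplitter server-side).
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
chunks = splitter.split_documents(docs)  # `docs` comes from the PDF/CSV loader
```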
---