AI-powered life insurance assistant using LangGraph, RAG, and OpenAI. Provides intelligent responses about policies, coverage, premiums, eligibility, and claims through a CLI and a REST API.
This system uses a multi-stage LangGraph agent with Retrieval-Augmented Generation to answer life insurance queries with domain-specific knowledge. The agent classifies intent, retrieves relevant context from a vector store, executes specialized tools when needed, and generates accurate responses while maintaining conversational context.
Agent Pipeline (LangGraph StateGraph):
- Intent Classification - LLM-based categorization with configurable temperature
- Knowledge Retrieval - RAG search with multi-turn conversation context
- Tool Selection - LLM-based tool routing (keyword matching used only as a fallback)
- Tool Execution - Validated specialized tools with input constraints
- Response Generation - Context-aware generation with conversation history
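
Concretely, the pipeline can be pictured as a small StateGraph. The sketch below is illustrative only: the state schema, node bodies, and routing predicate are simplified stand-ins for the real implementation in `app/agents/graph.py`.

```python
from typing import List, Optional, TypedDict

from langgraph.graph import END, StateGraph

class AgentState(TypedDict):
    query: str
    intent: str
    context: List[str]
    tool_result: Optional[str]
    answer: str

def analyze_intent(state: AgentState) -> dict:
    # LLM call that maps the query to one of six intent categories.
    return {"intent": "PREMIUMS"}  # placeholder result

def retrieve_information(state: AgentState) -> dict:
    # RAG search against the vector store, scoped by conversation context.
    return {"context": ["...retrieved chunks..."]}

def use_tools(state: AgentState) -> dict:
    # Run the selected tool (premium calc, eligibility check, comparison).
    return {"tool_result": "..."}

def generate_answer(state: AgentState) -> dict:
    # Final LLM call over retrieved context, tool results, and history.
    return {"answer": "..."}

def should_use_tools(state: AgentState) -> str:
    # Routing predicate: take the tool branch for tool-worthy intents.
    return "tools" if state["intent"] in {"PREMIUMS", "ELIGIBILITY"} else "generate"

graph = StateGraph(AgentState)
graph.add_node("analyze_intent", analyze_intent)
graph.add_node("retrieve", retrieve_information)
graph.add_node("use_tools", use_tools)
graph.add_node("generate", generate_answer)

graph.set_entry_point("analyze_intent")
graph.add_edge("analyze_intent", "retrieve")
graph.add_conditional_edges(
    "retrieve", should_use_tools, {"tools": "use_tools", "generate": "generate"}
)
graph.add_edge("use_tools", "generate")
graph.add_edge("generate", END)

agent = graph.compile()
# agent.invoke({"query": "Calculate premium for a 35 year old ..."})
```
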
Specialized Services:
- `IntentAnalyzer` - Classifies user intent into 6 categories
- `ContextRetriever` - Retrieves relevant knowledge with conversation context
- `ToolSelector` - LLM-powered tool selection with fallback logic
- `ToolExecutor` - Executes the premium calculator, eligibility checker, and policy comparator
- `ResponseGenerator` - Generates final answers with full context
Production Features:
- Caching Layer - In-memory cache for RAG and LLM calls (60-70% cost reduction)
- Rate Limiting - Token bucket algorithm, 100 req/min per client (configurable); a minimal sketch follows this list
- Monitoring - Real-time metrics tracking (requests, tokens, costs, errors)
- Hot Reload - Update knowledge base without restart via API endpoint
- Connection Pooling - Optimized database connections for high concurrency
- Input Validation - Comprehensive validation for age, coverage, and term parameters
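
As one concrete example from this list, here is a minimal sketch of the token-bucket algorithm behind the rate limiter (class and method names are illustrative, not the actual `app/middleware/rate_limit.py` API):

```python
import time

class TokenBucket:
    """Minimal token bucket: `capacity` requests per `period` seconds."""

    def __init__(self, capacity: int = 100, period: float = 60.0):
        self.capacity = capacity
        self.refill_rate = capacity / period  # tokens regained per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A middleware built on this would keep one bucket per client identifier and answer with HTTP 429, plus the `X-RateLimit-*` headers shown later, once `allow()` returns False.
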
Tech Stack:
- LangGraph + LangChain for agent orchestration
- OpenAI GPT-4o-mini for reasoning (abstracted, swappable)
- ChromaDB with text-embedding-3-small for vector search
- FastAPI for REST API with rate limiting middleware
- SQLAlchemy with SQLite/PostgreSQL for session persistence
- Rich for CLI interface
- In-memory caching (extensible to Redis)
Prerequisites:
- Python 3.10+
- OpenAI API key

Installation:
# Clone and setup
git clone git@github.com:FardinHash/lisa.git
cd lisa
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
# Initialize knowledge base
python scripts/init_knowledge_base.py
# Verify installation
pytest

Run with Docker:

cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
docker-compose up -d

API available at: http://localhost:8000
Documentation: http://localhost:8000/docs
Start interactive chat:
python cli/chat.py

Available commands:
- `help` - Display command reference
- `clear` - Reset conversation history
- `history` - Show full session transcript
- `new` - Start a new session
- `quit` / `exit` - Exit the application
Example queries:
What types of life insurance are available?
Calculate premium for 35 year old, $500k coverage, 20 year term
Can I qualify for insurance if I have diabetes?
Compare term life and whole life insurance
What documents do I need to file a claim?
Start API server:
uvicorn app.main:app --reload

Create Session

curl -X POST http://localhost:8000/api/v1/chat/session \
-H "Content-Type: application/json" \
-d '{"user_id": "optional_user_id"}'Response:
{
"session_id": "uuid-string",
"created_at": "2025-01-01T00:00:00",
"message_count": 0
}

Send Message
curl -X POST http://localhost:8000/api/v1/chat/message \
-H "Content-Type: application/json" \
-d '{
"session_id": "uuid-from-above",
"message": "What is term life insurance?"
}'

Response:
{
"session_id": "uuid-string",
"message": "Term life insurance is...",
"sources": ["policy_types.txt"],
"agent_reasoning": "Intent: POLICY_TYPES | Tools Used: None",
"timestamp": "2025-01-01T00:00:00"
}
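
The same two calls from Python, assuming nothing beyond the endpoints documented above (uses the `requests` library):

```python
import requests

BASE = "http://localhost:8000/api/v1"

# 1. Create a session.
session = requests.post(f"{BASE}/chat/session", json={"user_id": "demo"}).json()

# 2. Send a message within that session.
reply = requests.post(
    f"{BASE}/chat/message",
    json={
        "session_id": session["session_id"],
        "message": "What is term life insurance?",
    },
).json()

print(reply["message"])
print(reply["sources"], reply["agent_reasoning"])
```
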
Get Session History

curl http://localhost:8000/api/v1/chat/session/{session_id}

Delete Session

curl -X DELETE http://localhost:8000/api/v1/chat/session/{session_id}

Health Check

curl http://localhost:8000/health

Get System Metrics

curl http://localhost:8000/metrics

Response:
{
"uptime_seconds": 3600,
"uptime_formatted": "1h 0m 0s",
"timestamp": "2025-11-15T12:00:00",
"metrics": {
"/api/v1/chat/message": {
"count": 150,
"total_time": 225.5,
"avg_time": 1.503,
"errors": 2
},
"llm_gpt-4o-mini": {
"count": 180,
"total_time": 180.2,
"avg_time": 1.001,
"total_tokens": 45000,
"total_cost": 0.23
},
"rag_search": {
"count": 165,
"total_time": 8.3,
"avg_time": 0.050,
"total_results": 495
}
}
}

Reload Knowledge Base (Admin)

curl -X POST http://localhost:8000/admin/reload-knowledge-base

Response:
{
"success": true,
"message": "Knowledge base reloaded successfully with 125 chunks",
"chunks": 125,
"timestamp": "2025-11-15T12:00:00"
}

Rate Limit Headers
All API responses include rate limiting information:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1700050800
All settings via environment variables (.env file):
Core and LLM settings:

| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key (required) | - |
| `ENVIRONMENT` | Deployment environment | local |
| `LOG_LEVEL` | Logging level | INFO |
| `LLM_MODEL` | OpenAI model | gpt-4o-mini |
| `LLM_TEMPERATURE` | Response creativity | 0.7 |
| `LLM_MAX_TOKENS` | Max response length (tokens) | 800 |
| `EMBEDDING_MODEL` | Embedding model | text-embedding-3-small |
Database settings:

| Variable | Description | Default |
|---|---|---|
| `DATABASE_URL` | Session database | sqlite:///./data/conversations.db |
| `DB_POOL_SIZE` | Connection pool size | 10 |
| `DB_MAX_OVERFLOW` | Max overflow connections | 20 |
| `DB_POOL_TIMEOUT` | Connection timeout (sec) | 30 |
| `DB_POOL_RECYCLE` | Connection recycle time (sec) | 3600 |
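
For reference, these pool settings map directly onto SQLAlchemy's `create_engine`; a sketch (the hard-coded values stand in for the environment variables above):

```python
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:pass@host:5432/db",  # DATABASE_URL
    pool_size=10,       # DB_POOL_SIZE: persistent connections kept open
    max_overflow=20,    # DB_MAX_OVERFLOW: extra connections under burst load
    pool_timeout=30,    # DB_POOL_TIMEOUT: seconds to wait for a free connection
    pool_recycle=3600,  # DB_POOL_RECYCLE: recycle connections after one hour
)
```
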
RAG and vector store settings:

| Variable | Description | Default |
|---|---|---|
| `CHROMA_PERSIST_DIR` | Vector store path | ./data/chroma_db |
| `KNOWLEDGE_BASE_DIR` | Source documents | ./knowledge_base |
| `RAG_CHUNK_SIZE` | Document chunk size | 1000 |
| `RAG_CHUNK_OVERLAP` | Chunk overlap | 200 |
| `RAG_SEARCH_K` | Retrieved documents | 3 |
| `RAG_SCORE_THRESHOLD` | Min relevance score | 0.5 |
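
The chunking variables drive the document splitter. A sketch of the typical LangChain wiring (the import path varies by LangChain version, and the actual code in `app/services/rag.py` may differ):

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

# RAG_CHUNK_SIZE / RAG_CHUNK_OVERLAP control how source documents are split
# before embedding: larger chunks carry more context per hit, while overlap
# keeps facts from being cut in half at chunk boundaries.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(open("knowledge_base/policy_types.txt").read())
```
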
Agent and memory settings:

| Variable | Description | Default |
|---|---|---|
| `INTENT_CLASSIFICATION_TEMPERATURE` | Intent classification temperature | 0.3 |
| `TOOL_SELECTION_TEMPERATURE` | Tool selection temperature | 0.2 |
| `MEMORY_MAX_HISTORY` | Max messages stored per session | 10 |
| `MEMORY_CONTEXT_MESSAGES` | Context messages for retrieval | 4 |
Tool input validation settings:

| Variable | Description | Default |
|---|---|---|
| `TOOL_AGE_MIN` | Minimum age | 18 |
| `TOOL_AGE_MAX` | Maximum age | 85 |
| `TOOL_COVERAGE_MIN` | Min coverage amount ($) | 10000 |
| `TOOL_COVERAGE_MAX` | Max coverage amount ($) | 10000000 |
| `TOOL_TERM_MIN` | Min term length (years) | 5 |
| `TOOL_TERM_MAX` | Max term length (years) | 40 |
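
These bounds are enforced before any tool executes. A minimal sketch with Pydantic (the model and field names are hypothetical; the bounds mirror the defaults above):

```python
from pydantic import BaseModel, Field

class PremiumCalculatorInput(BaseModel):
    # Bounds mirror the TOOL_AGE_*, TOOL_COVERAGE_*, TOOL_TERM_* defaults.
    age: int = Field(ge=18, le=85)
    coverage: int = Field(ge=10_000, le=10_000_000)
    term_years: int = Field(ge=5, le=40)

# PremiumCalculatorInput(age=16, coverage=500_000, term_years=20)
# -> raises pydantic.ValidationError: age is below TOOL_AGE_MIN.
```
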
Caching and rate limiting settings:

| Variable | Description | Default |
|---|---|---|
| `CACHE_ENABLED` | Enable caching | true |
| `CACHE_TTL` | Cache TTL (seconds) | 3600 |
| `CACHE_MAX_SIZE` | Max cache entries | 1000 |
| `RATE_LIMIT_ENABLED` | Enable rate limiting | true |
| `RATE_LIMIT_CALLS` | Calls per period | 100 |
| `RATE_LIMIT_PERIOD` | Period in seconds | 60 |
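
The idea behind the in-memory cache, sketched below: responses are keyed by a hash of the request and expire after `CACHE_TTL`. Names are illustrative, not the actual `app/services/cache.py` API.

```python
import hashlib
import time

class TTLCache:
    """Tiny in-memory cache with per-entry expiry (illustrative only)."""

    def __init__(self, ttl: float = 3600.0, max_size: int = 1000):
        self.ttl, self.max_size, self._store = ttl, max_size, {}

    @staticmethod
    def key(model: str, prompt: str) -> str:
        # Deterministic key: identical model+prompt pairs hit the same entry.
        return hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()

    def get(self, key: str):
        entry = self._store.get(key)
        if entry and time.monotonic() - entry[1] < self.ttl:
            return entry[0]
        self._store.pop(key, None)  # drop expired / missing entries
        return None

    def set(self, key: str, value) -> None:
        if len(self._store) >= self.max_size:
            self._store.pop(next(iter(self._store)))  # evict oldest insertion
        self._store[key] = (value, time.monotonic())
```
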
API server settings:

| Variable | Description | Default |
|---|---|---|
| `API_HOST` | API bind address | 0.0.0.0 |
| `API_PORT` | API port | 8000 |
| `API_RELOAD` | Auto-reload on changes | true |
For production:
- Set `ENVIRONMENT=production`
- Use PostgreSQL: `DATABASE_URL=postgresql://user:pass@host:5432/db`
- Reduce logging: `LOG_LEVEL=WARNING`
- Adjust rate limits based on load
- Consider Redis for distributed caching
- Set `API_RELOAD=false`
lisa/
├── app/
│ ├── main.py # FastAPI application with middleware
│ ├── config.py # Comprehensive Pydantic settings
│ ├── models.py # Request/response models
│ ├── database.py # SQLAlchemy ORM with connection pooling
│ ├── api/
│ │ └── chat.py # REST endpoints
│ ├── agents/
│ │ ├── graph.py # LangGraph agent workflow (refactored)
│ │ ├── services.py # Separated agent service classes
│ │ ├── tools.py # Validated specialized tools
│ │ └── prompts.py # Prompt templates
│ ├── middleware/
│ │ ├── __init__.py
│ │ └── rate_limit.py # Rate limiting middleware
│ └── services/
│ ├── llm.py # LLM service (uses llm_provider)
│ ├── llm_provider.py # Abstracted LLM provider layer
│ ├── rag.py # ChromaDB with caching & hot reload
│ ├── memory.py # Session management (refactored)
│ ├── cache.py # Caching service
│ └── monitoring.py # Metrics and monitoring
├── cli/
│ └── chat.py # Rich CLI interface
├── knowledge_base/ # Life insurance documents
│ ├── policy_types.txt
│ ├── eligibility_underwriting.txt
│ ├── claims_process.txt
│ ├── risk_assessment_criteria.txt
│ └── faq.txt
├── scripts/
│ └── init_knowledge_base.py # Vector store initialization
├── tests/
│ ├── unit/ # Unit tests
│ └── integration/ # Integration tests
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
└── .env.example
Run test suite:
# All tests
pytest
# With coverage report
pytest --cov=app --cov-report=html
# Specific test categories
pytest tests/unit/ # Unit tests only
pytest tests/integration/ # Integration tests only
# Single test file
pytest tests/unit/test_rag_service.py
# View coverage
open htmlcov/index.html

Code formatting:
black app/ tests/ cli/
isort app/ tests/ cli/

Current coverage: 85%+
State Machine Flow:
User Query
↓
[Analyze Intent]
↓
[Retrieve Information] → RAG Search (ChromaDB)
↓
Should Use Tools? ← Intent + Keywords
├─ Yes → [Use Tools] → Premium Calc / Eligibility Check / Policy Compare
└─ No →
↓
[Generate Answer] ← Context + Tool Results + History
↓
Response + Sources + Reasoning
Intent Categories:
- `POLICY_TYPES` - Policy types, features, comparisons
- `ELIGIBILITY` - Qualification criteria, underwriting
- `PREMIUMS` - Cost calculations, pricing factors
- `CLAIMS` - Filing process, beneficiaries, documentation
- `COVERAGE` - Coverage amounts, limitations, riders
- `GENERAL` - General inquiries, greetings
Tool Activation Logic:
- Premium calculator: Keywords like "calculate", "estimate", "cost", "how much" OR intent `PREMIUMS`
- Eligibility checker: Keywords like "eligible", "qualify", "approved" OR intent `ELIGIBILITY`
- Policy comparator: Keywords like "compare", "versus", "difference"
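
Per the `ToolSelector`'s fallback logic noted earlier, the keyword routing could look like this sketch (the table and function are illustrative, not the actual `app/agents/services.py` code):

```python
# Hypothetical keyword fallback, used when LLM tool selection is inconclusive.
FALLBACK_KEYWORDS = {
    "premium_calculator": ("calculate", "estimate", "cost", "how much"),
    "eligibility_checker": ("eligible", "qualify", "approved"),
    "policy_comparator": ("compare", "versus", "difference"),
}

def fallback_tool(query: str, intent: str) -> str | None:
    q = query.lower()
    for tool, words in FALLBACK_KEYWORDS.items():
        if any(w in q for w in words):
            return tool
    # Intent alone can also trigger the first two tools.
    return {"PREMIUMS": "premium_calculator",
            "ELIGIBILITY": "eligibility_checker"}.get(intent)
```
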
The system includes 5 curated documents covering:
- Life insurance policy types (term, whole, universal, variable)
- Eligibility and underwriting criteria
- Claims filing process and documentation
- Risk assessment and rating factors
- Frequently asked questions
To update the knowledge base:

1. Add or modify `.txt` files in `knowledge_base/`
2. Run `python scripts/init_knowledge_base.py`
3. Restart the application, or hot-reload via the `/admin/reload-knowledge-base` endpoint (no restart needed)
Production checklist:

- Set `ENVIRONMENT=production` in `.env`
- Configure a secure PostgreSQL `DATABASE_URL` with strong credentials
- Use a production OpenAI API key with rate limits
- Configure logging level (`LOG_LEVEL=WARNING`)
- Enable HTTPS/TLS for the API
- Set up monitoring and health checks
- Configure backups for the database and vector store
- Implement rate limiting on API endpoints
- Review and adjust resource limits in `docker-compose.yml`
# Build image
docker-compose build
# Start services
docker-compose up -d
# View logs
docker-compose logs -f api
# Health check
curl http://localhost:8000/health
# Stop services
docker-compose down

SQLite (local) to PostgreSQL (production):
# Export sessions from SQLite
sqlite3 data/conversations.db .dump > backup.sql
# Update .env with PostgreSQL URL
DATABASE_URL=postgresql://user:pass@host:5432/dbname
# Application auto-creates tables on startup

Troubleshooting:

Vector store not initialized

python scripts/init_knowledge_base.py

OpenAI API errors
- Verify `OPENAI_API_KEY` in `.env`
- Check API key permissions and quota
- Confirm network connectivity
Import/dependency errors
pip install --upgrade -r requirements.txt

Database locked (SQLite)
- Ensure only one process is accessing the database
- Consider switching to PostgreSQL for production
Poor response quality
- Adjust `LLM_TEMPERATURE` (0.3-0.9)
- Increase `RAG_SEARCH_K` for more context
- Review and improve knowledge base documents
Memory leaks in long sessions
- Sessions auto-limit to `MEMORY_MAX_HISTORY` messages
- Use the DELETE endpoint `/api/v1/chat/session/{id}` to clear old sessions
MIT License - See LICENSE file for details
For issues or questions:
- Check troubleshooting section above
- Review logs with `LOG_LEVEL=DEBUG`
- Create a GitHub issue with reproduction steps