🚗 LLM-powered structured data extraction from messy, informal car accident descriptions.
Transform chaotic user input into clean JSON, a task that is impractical with regex or SQL alone.
Insurance claim descriptions are messy:
```
"had an accident on av libertador yesterday a ford fiesta scratched my honda civic need a tow"
```
An LLM extracts structured data:
```json
{
  "date": "2024-03-18",
  "location": "Av. Libertador",
  "insured_vehicle": "Honda Civic",
  "third_party_vehicle": "Ford Fiesta",
  "liability": "third_party"
}
```

The pipeline:

Fuzzer → Synthetic Claims → LLM (Llama 3.2) → Structured JSON → Validator
- Fuzzing generates noisy test data (typos, slang, missing punctuation)
- LLM Processing extracts and normalizes entities via Ollama
- Validation measures accuracy against ground truth
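The fuzzing stage can be sketched as a small noise injector. The function below is illustrative (the repo's `generate_claims.py` may differ) and shows the kinds of corruption applied: lowercasing, stripped punctuation, and adjacent-character swaps as typos.

```python
import random

def fuzz_claim(text: str, seed: int = 0) -> str:
    """Inject typos and strip punctuation/casing to mimic messy user input."""
    rng = random.Random(seed)  # deterministic noise for reproducible test data
    text = text.lower().replace(",", "").replace(".", "")
    chars = list(text)
    if len(chars) < 2:
        return text
    # swap a few adjacent characters to simulate typos
    for _ in range(max(1, len(chars) // 20)):
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)
```

Seeding the RNG keeps the generated dataset reproducible across runs, so accuracy numbers can be compared between model versions.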
| Field | Accuracy |
|---|---|
| Location | 100% |
| Vehicles | 98% |
| Liability | 98% |
| Date | 76%* |
*Date errors due to relative references ("yesterday") — fixable with context injection.
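One way to implement the context-injection fix is to resolve relative expressions against the claim's submission date before (or after) the LLM call. A minimal sketch, with an illustrative lookup table that is not part of the repo:

```python
from datetime import date, timedelta

# illustrative map of relative expressions (extendable with slang/locale variants)
RELATIVE = {"today": 0, "yesterday": 1, "ayer": 1}

def resolve_relative_date(expr: str, reference: date) -> str:
    """Map a relative expression like 'yesterday' to an ISO date, given the claim's timestamp."""
    offset = RELATIVE.get(expr.strip().lower())
    if offset is None:
        return expr  # leave absolute dates untouched
    return (reference - timedelta(days=offset)).isoformat()
```

For a claim submitted on 2024-03-19, `resolve_relative_date("yesterday", date(2024, 3, 19))` yields `"2024-03-18"`.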
```sh
# Prerequisites: Python 3, Ollama running with Llama 3.2
ollama pull llama3.2

# Create custom model with system prompt
ollama create claims-extractor -f Modelfile

# Generate test data
python3 fuzzing/generate_claims.py

# Run extraction + validation
python3 src/process_claims.py
```

Try it on a single claim (Spanish input: "yesterday I crashed on av libertador, a ford fiesta hit me from behind, I have a honda civic"):

```sh
echo 'ayer choque en av libertador un ford fiesta me pego atras tengo un honda civic' | ollama run claims-extractor
```

- Python 3 — Core language
- Ollama + Llama 3.2 — Local LLM inference
- JSONL — Data format
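JSONL keeps one claim per line, so large datasets can be streamed record by record. A minimal read/write sketch using only the standard library (function names are illustrative, not the repo's API):

```python
import json

def write_jsonl(path: str, records: list[dict]) -> None:
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")

def read_jsonl(path: str) -> list[dict]:
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

`ensure_ascii=False` preserves accented characters such as those in street names.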
```
.
├── fuzzing/generate_claims.py   # Synthetic data generator
├── src/process_claims.py        # LLM extraction pipeline
├── src/validate_results.py      # Accuracy metrics
└── data/                        # Input/output datasets
```
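The per-field accuracies in the results table come from comparing extracted records against ground truth. A hedged sketch of that comparison (the repo's `validate_results.py` may implement it differently, e.g. with fuzzy matching):

```python
FIELDS = ("date", "location", "insured_vehicle", "third_party_vehicle", "liability")

def field_accuracy(predictions: list[dict], ground_truth: list[dict],
                   fields: tuple = FIELDS) -> dict:
    """Return per-field accuracy as the fraction of exact (case-insensitive) matches."""
    scores = {}
    for field in fields:
        hits = sum(
            1 for pred, gold in zip(predictions, ground_truth)
            if str(pred.get(field, "")).strip().lower() == str(gold.get(field, "")).strip().lower()
        )
        scores[field] = hits / len(ground_truth) if ground_truth else 0.0
    return scores
```

Case-insensitive exact matching is a deliberately strict baseline; normalized forms ("Av. Libertador" vs "av libertador") would need an extra canonicalization step.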
Built for an AI course, demonstrating NLP and Transformer concepts in a practical application.