Skip to content

Ramana-Raja/AI_Financial_Assistant

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

10 Commits
Β 
Β 
Β 
Β 

Repository files navigation

AI Financial Assistant πŸ€–

A bot designed to read and analyze financial PDF reports. Users can ask questions in plain English, and it will find and deliver relevant information, eliminating the need to manually search through lengthy documents.

Features

  • Conversational Q&A: Interact with financial PDFs using natural language to get clear, concise answers
  • Advanced RAG Pipeline: Utilizes Retrieval-Augmented Generation to find and synthesize information
  • Conversational Memory: Retains context over the last six turns for complex follow-up questions
  • High-Performance Models: Powered by google/flan-t5-base for generation and all-MiniLM-L6-v2 for semantic retrieval
  • Extendable: Fine-tune models on your own data to improve performance for specific domains

How It Works

The application uses a Retrieval-Augmented Generation (RAG) architecture:

  1. PDF Ingestion: Extracts all text from the source PDF
  2. Chunking: Segments text into smaller, overlapping chunks to maintain context
  3. Embedding: Converts text chunks into numerical vectors (embeddings)
  4. Retrieval: Finds most relevant text chunks when a question is asked
  5. Synthesis: Generates a final, human-like answer using the FLAN-T5 model

Tech Stack

  • Core: Python 3.9+
  • AI Frameworks: PyTorch, Hugging Face (transformers, sentence-transformers)
  • PDF Processing: PyMuPDF (fitz)
  • Text Utilities: NLTK

πŸš€ Usage

1. Set your PDF path

pdf_path = "your_document.pdf"

2. Run The File

python main.py

πŸ“ Project Structure

Azure
β”œβ”€β”€ modules_download
β”‚   └── nlt.py #to download punkt
β”‚
β”œβ”€β”€ testing #contains some testings(for personal test)
β”‚
β”œβ”€β”€ training_model
β”‚   β”œβ”€β”€ inference.py #testing with a single prompt
β”‚   β”œβ”€β”€ train_model_answers.py #for training FLAN-T5
β”‚   └── train_new_knowledge.py #for training all-MiniLM-L6-v2
β”‚     
└── main.py #main file which requires pdf

Road Map

Web interface development

Additional document format support

Advanced agent capabilities

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages