extrawest/fullstack_llamaindex_demo

Document Indexing and Query Application

πŸ“ Overview

This application is a full-stack document indexing and retrieval system that allows users to upload documents, index their content, and perform natural language queries against the indexed documents. It utilizes LlamaIndex, OpenAI embeddings, and a modern React frontend to provide an interactive experience for semantic search and document retrieval.

🚀 Features

  • Document Upload: Upload text files to be indexed and stored
  • Document Management: View a list of all uploaded documents
  • Semantic Search: Ask questions in natural language and get AI-generated answers
  • Source References: View the source documents and passages used to generate answers
  • Real-time Responses: Asynchronous processing with live feedback

πŸ—οΈ Architecture

The application follows a multi-service architecture:

Backend Components:

  1. Index Server (index_server.py)

    • Core document processing and indexing logic
    • Uses LlamaIndex and OpenAI embeddings for semantic understanding
    • Maintains a vector store index for document retrieval
    • Exposes services via a BaseManager server on port 5602
  2. API Server (flask_demo.py)

    • Flask-based REST API
    • Handles document upload, query requests, and document listing
    • Communicates with the index server using BaseManager client
    • Exposes endpoints on port 5601
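
The BaseManager link between the two backend processes can be sketched as follows. This is a minimal illustration, not the code in index_server.py or flask_demo.py: the names `answer_query`, `start_index_server`, and `query_via_manager`, the registered method name `query_index`, and the `AUTHKEY` value are all assumptions, and the stub returns a canned string instead of calling LlamaIndex. For the sketch the server runs in a background thread on an ephemeral port; the real application runs it as a separate process on port 5602.

```python
import threading
from multiprocessing.managers import BaseManager

AUTHKEY = b"change-me"  # placeholder secret, not the project's real password


def answer_query(text: str) -> str:
    """Stand-in for the index lookup (hypothetical; no LlamaIndex call here)."""
    return f"stub answer for: {text}"


class IndexManager(BaseManager):
    """Server-side manager that exposes the index functions."""


class IndexClient(BaseManager):
    """Client-side manager, as the API server process would use it."""


def start_index_server(port: int = 0):
    """Serve answer_query in a daemon thread and return the bound address."""
    IndexManager.register("query_index", callable=answer_query)
    manager = IndexManager(address=("127.0.0.1", port), authkey=AUTHKEY)
    server = manager.get_server()  # the listening socket is created here
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.address


def query_via_manager(address, text: str) -> str:
    """Connect to the manager and call the registered function remotely."""
    IndexClient.register("query_index")  # name only: the callable lives on the server
    client = IndexClient(address=address, authkey=AUTHKEY)
    client.connect()
    # The call returns a proxy object; _getvalue() copies the result locally.
    return client.query_index(text)._getvalue()


if __name__ == "__main__":
    addr = start_index_server()
    print(query_via_manager(addr, "What is in my documents?"))
```

Passing `port=5602` to `start_index_server` would match the port named above; port 0 lets the operating system pick a free one so the sketch can run next to a live instance.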

Frontend Components:

  1. React Application (react_frontend/)
    • TypeScript React application
    • Responsive UI for document management and querying
    • Components for uploading, viewing documents, and querying the index

🔧 Technologies Used

Backend

  • Python 3.11
  • Flask: Web framework for API endpoints
  • LlamaIndex: Document indexing and retrieval library
  • OpenAI API: For embeddings and LLM capabilities
  • multiprocessing (BaseManager): Inter-process communication between the API server and the index server

Frontend

  • React 18: UI framework
  • TypeScript: Type-safe JavaScript
  • SCSS: Styling
  • React Spinners: Loading indicators
  • ClassNames: Conditional class application

📋 Prerequisites

  • Python 3.11+
  • Node.js 16+
  • OpenAI API key

πŸ› οΈ Installation

Manual Installation

  1. Clone the repository

  2. Set up Python environment

    python -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
  3. Set up React frontend

    cd react_frontend
    npm install
  4. Set OpenAI API key

    export OPENAI_API_KEY="your-api-key-here"
  5. Start the application

    ./launch_app.sh

Docker Installation

  1. Build the Docker image

    docker build -t document-indexer .
  2. Run the container

    docker run -p 5601:5601 -p 3000:3000 -e OPENAI_API_KEY="your-api-key-here" document-indexer

💻 Usage

  1. Access the application

    • Open a web browser and navigate to http://localhost:3000
  2. Upload documents

    • Use the upload area to select and upload text files
    • Check the document list to verify successful uploads
  3. Query the index

    • Type a natural language question in the query box
    • Press Enter to submit the query
    • View the AI-generated answer and source references

📚 API Endpoints

  • GET /query?text=<query_text>

    • Submit a query to the index
    • Returns the answer text and source references
  • POST /uploadFile

    • Upload a document for indexing
    • Form data: file (document file), filename_as_doc_id (optional)
  • GET /getDocuments

    • Retrieve the list of indexed documents
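
The two GET endpoints above can be called from a small standard-library client. This is a rough sketch that assumes the API server is reachable at http://localhost:5601 (the port listed under Architecture); the helper names `build_query_url` and `get_text` are illustrative. Because the response format is not specified here, the sketch returns the raw response body as text rather than parsing it.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://localhost:5601"  # assumed local API server address


def build_query_url(text: str, base: str = API_BASE) -> str:
    """Build the GET /query URL with the query text safely URL-encoded."""
    return f"{base}/query?{urlencode({'text': text})}"


def get_text(url: str, timeout: float = 30.0) -> str:
    """Perform a GET request and return the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Requires the index server and the API server to be running.
    print(get_text(f"{API_BASE}/getDocuments"))
    print(get_text(build_query_url("What is LlamaIndex?")))
```

Encoding the query with `urlencode` keeps spaces, question marks, and other reserved characters from breaking the request URL.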

🧪 Development

Structure

.
├── documents/                 # Document storage directory
├── react_frontend/            # React frontend application
│   ├── src/
│   │   ├── apis/              # API client code
│   │   ├── components/        # React components
│   │   └── ...
│   └── ...
├── saved_index/               # Persisted vector index storage
├── flask_demo.py              # Flask API server
├── flask_simple_demo.py       # Simplified Flask demo
├── index_server.py            # LlamaIndex processing server
├── launch_app.sh              # Application startup script
├── requirements.txt           # Python dependencies
└── Dockerfile                 # Docker configuration

Running in development mode

  1. Start the index server:

    python index_server.py
  2. Start the Flask API server:

    python flask_demo.py
  3. Start the React development server:

    cd react_frontend
    npm start
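
After starting the three processes, a quick check that each one is accepting connections can save some debugging. The helper below is a hypothetical convenience script, not part of the repository. It assumes the ports mentioned in this README (5602 for the index server, 5601 for the Flask API, 3000 for the React development server) and that all services run on the local machine.

```python
import socket

# Ports as stated in this README; adjust if your setup differs.
SERVICES = {
    "index server": 5602,
    "Flask API": 5601,
    "React dev server": 3000,
}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_listening(port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

A successful TCP connection only shows that something is listening on the port; it does not confirm that the index has finished loading.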

🔒 Security Considerations

  • The application uses a hardcoded password for the BaseManager server
  • No user authentication is implemented in this version
  • Data is stored locally in the file system
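
One possible way to address the hardcoded password is to read the BaseManager authkey from the environment in both the index server and the API server. The sketch below is an assumption about how that could look, not the project's current behavior; the variable name `INDEX_SERVER_AUTHKEY` and the helper `load_authkey` are hypothetical.

```python
import os
import secrets


def load_authkey(env_var: str = "INDEX_SERVER_AUTHKEY") -> bytes:
    """Read the shared secret from the environment; fail loudly if it is unset."""
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(
            f"{env_var} is not set; export a random secret before starting the servers"
        )
    # BaseManager expects the authkey as bytes.
    return value.encode("utf-8")


if __name__ == "__main__":
    # Print a fresh random secret that could be exported as INDEX_SERVER_AUTHKEY.
    print(secrets.token_hex(32))
```

Both processes would then pass `authkey=load_authkey()` when constructing their BaseManager, so the secret never appears in source control.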
