akv2011/Real-Time-Rag-Search-and-OCR-plot-of-maritime-Reports-in-Radar-map

Real-Time RAG Search and OCR Plot of Maritime Reports in Radar Map

Overview

This project is a maritime surveillance system. It processes maritime reports (potentially applying OCR to image-based reports), uses a Retrieval-Augmented Generation (RAG) model to search for contextual information, and plots relevant maritime contacts on a map in real time. A radar-style chart summarizes selected characteristics of the detected contacts.

Features

  • Real-Time Updates: Maritime contacts are pushed to the map over WebSockets and displayed in real time.
  • OCR Capabilities: (Inferred) Image-based reports can be processed with Tesseract OCR (Optical Character Recognition).
  • RAG Search: (Inferred) A Retrieval-Augmented Generation model (potentially built on Langchain and Transformers) searches a knowledge base of naval data and supplies context.
  • Interactive Map Display: Uses Leaflet.js to plot maritime contacts with details available on popups.
  • Radar Chart: Displays a summary or characteristics of contacts using Chart.js.
  • Backend API: Built with FastAPI, serving data and handling WebSocket connections.
  • Structured Data Ingestion: Processes data from Markdown files containing naval exercise areas, operational zones, and maritime reports.

Tech Stack

  • Backend:
    • Python
    • FastAPI (for API and WebSockets)
    • Uvicorn (ASGI server)
    • Pytesseract & Pillow (for OCR)
    • Transformers & Langchain (for RAG and NLP - inferred from initial-config.txt)
    • SQLite (as per code/backend/requirements.txt, though initial-config.txt suggests PostgreSQL and Redis might be intended for a fuller setup)
    • WebSockets
  • Frontend:
    • HTML5
    • CSS3
    • JavaScript
    • Leaflet.js (for interactive maps)
    • Chart.js (for radar chart)
  • Database & Cache (Potential/Full Setup based on initial-config.txt):
    • PostgreSQL
    • Redis

Project Structure

Real-Time-Rag-Search-and-OCR-plot-of-maritime-Reports-in-Radar-map/
├── code/
│   ├── backend/
│   │   ├── main.py         # FastAPI application
│   │   └── requirements.txt  # Backend Python dependencies
│   ├── frontend/
│   │   ├── index.html      # Main HTML page
│   │   └── static/
│   │       ├── app.js      # Frontend JavaScript logic
│   │       ├── style.css   # Main styles
│   │       └── popup.css   # Styles for map popups
│   ├── rag/
│   │   └── naval_data/     # Knowledge base for RAG (Markdown files)
│   └── reports/
│       └── maritime-dataset-v1.md # Example maritime reports
├── initial-config.txt      # Contains initial dependency list and env vars
└── README.md               # This file

Setup and Installation

Prerequisites

  • Python 3.8+
  • Tesseract OCR:
    • Install Tesseract OCR engine on your system.
      • Linux (Debian/Ubuntu): sudo apt-get install tesseract-ocr
      • macOS: brew install tesseract
      • Windows: Download installer from the official Tesseract GitHub page.
    • Ensure tesseract command is in your system's PATH.
  • (Optional, for full setup as per initial-config.txt) PostgreSQL server, Redis server.
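The PATH requirement above can be checked from Python before starting the backend. The following is a minimal sketch using only the standard library; the helper names `find_tesseract` and `tesseract_version` are hypothetical and are not part of this repository.

```python
import shutil
import subprocess
from typing import Optional


def find_tesseract() -> Optional[str]:
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def tesseract_version(executable: str) -> str:
    """Return the first line printed by `tesseract --version`."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some Tesseract builds print version information to stderr instead of stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else "unknown"
```

Calling `find_tesseract()` and getting `None` back suggests the engine is either not installed or not on PATH, in which case pytesseract is unlikely to work either.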

Backend Setup

  1. Clone the repository (if you haven't already):

    git clone <your-repository-url>
    cd Real-Time-Rag-Search-and-OCR-plot-of-maritime-Reports-in-Radar-map
  2. Navigate to the backend directory:

    cd code/backend
  3. Create and activate a virtual environment:

    python3 -m venv venv
    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  4. Install Python dependencies: The code/backend/requirements.txt lists core dependencies. For a more comprehensive setup including database and advanced NLP features, refer to initial-config.txt and ensure those dependencies are installed.

    pip install -r requirements.txt
    # Potentially install additional dependencies from initial-config.txt if needed
    # pip install langchain transformers sqlalchemy psycopg2-binary redis python-jose passlib pydantic pydantic-settings
  5. Environment Variables (Recommended): Create a .env file in the code/backend directory based on the initial-config.txt if you plan to use PostgreSQL, Redis, or specific model configurations. Example .env content:

    # DATABASE_URL=postgresql://user:password@localhost:5432/maritime
    # REDIS_URL=redis://localhost:6379/0
    # SECRET_KEY=your-super-secret-key
    # ALGORITHM=HS256
    # ACCESS_TOKEN_EXPIRE_MINUTES=30
    # Example embedding model
    MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2

    The application might need to be configured to read these (e.g., using pydantic-settings).
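As an illustration of how these variables could be read, here is a minimal sketch that uses only the standard library rather than pydantic-settings. The `Settings` class, the `load_settings` function, and the default values are assumptions made for this example; the actual backend may read its configuration differently or not at all. The sketch reads the process environment, so a `.env` file would still need to be loaded by a separate tool or exported in the shell.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the example .env file."""
    model_name: str
    database_url: Optional[str]
    redis_url: Optional[str]
    access_token_expire_minutes: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables, with fallbacks."""
    return Settings(
        # Assumed default: the example model named in the .env snippet.
        model_name=env.get("MODEL_NAME", "sentence-transformers/all-MiniLM-L6-v2"),
        # Optional services stay None when their URLs are not configured.
        database_url=env.get("DATABASE_URL"),
        redis_url=env.get("REDIS_URL"),
        access_token_expire_minutes=int(env.get("ACCESS_TOKEN_EXPIRE_MINUTES", "30")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"MODEL_NAME": "my-model"})`.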

Frontend Setup

No specific build steps are required for the frontend as it consists of static files.

Running the Application

  1. Start the Backend Server: Navigate to the code/backend directory (if not already there) and ensure your virtual environment is activated.

    cd /path/to/Real-Time-Rag-Search-and-OCR-plot-of-maritime-Reports-in-Radar-map/code/backend
    source venv/bin/activate # Or venv\Scripts\activate on Windows
    uvicorn main:app --reload --host 0.0.0.0 --port 8000

    The backend server should now be running on http://localhost:8000.

  2. Access the Frontend: Open the code/frontend/index.html file in your web browser.

    • You can typically do this by navigating to the file in your file explorer and double-clicking it, or by using a live server extension in your IDE.
    • Example URL if served locally: file:///path/to/Real-Time-Rag-Search-and-OCR-plot-of-maritime-Reports-in-Radar-map/code/frontend/index.html or http://localhost:<port_if_using_live_server>/code/frontend/index.html.

    The frontend will attempt to connect to the WebSocket server at ws://localhost:8000/ws.
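The server side of that WebSocket connection typically keeps a list of connected clients and broadcasts each contact update to all of them. The sketch below shows that pattern in a framework-independent way. `ConnectionManager` and the contact fields are hypothetical, and the code in `main.py` may be organized differently. The only assumption about the socket object is that it has an async `send_text` method, which FastAPI's `WebSocket` class provides.

```python
import asyncio
import json
from typing import Any, Dict, List, Protocol


class TextSocket(Protocol):
    """Anything with an async send_text method, such as a FastAPI WebSocket."""

    async def send_text(self, data: str) -> None: ...


class ConnectionManager:
    """Tracks connected clients and broadcasts contact updates to them."""

    def __init__(self) -> None:
        self._clients: List[TextSocket] = []

    def connect(self, socket: TextSocket) -> None:
        self._clients.append(socket)

    def disconnect(self, socket: TextSocket) -> None:
        if socket in self._clients:
            self._clients.remove(socket)

    async def broadcast(self, contact: Dict[str, Any]) -> int:
        """Send one contact as JSON to every client; return how many received it."""
        message = json.dumps(contact)
        delivered = 0
        # Iterate over a copy so that failed clients can be removed safely.
        for socket in list(self._clients):
            try:
                await socket.send_text(message)
                delivered += 1
            except Exception:
                # A failed send is treated as a dropped connection.
                self.disconnect(socket)
        return delivered
```

In a FastAPI endpoint registered at `/ws`, the handler would accept the connection, call `connect`, and call `disconnect` when the client goes away. Dropping clients on the first failed send keeps the broadcast loop simple at the cost of not retrying.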

How It Works (High-Level)

  1. Data Source: The system uses maritime reports (e.g., from code/reports/maritime-dataset-v1.md) and a knowledge base of naval information (e.g., in code/rag/naval_data/).
  2. OCR Processing: If reports are image-based, Tesseract OCR is used to extract text.
  3. Information Extraction & RAG: Extracted information and queries are processed. A RAG model, likely using sentence transformers, searches the naval data knowledge base to augment understanding and provide context.
  4. Backend Logic: The FastAPI backend processes incoming data, performs RAG searches, and manages maritime contacts.
  5. Real-Time Communication: New or updated contact information is sent to connected clients via WebSockets.
  6. Frontend Display:
    • app.js receives data through WebSockets.
    • Leaflet.js is used to plot contacts on an interactive map, showing details like type, speed, significance, and description in popups.
    • Chart.js is used to render a radar chart, possibly showing the distribution or significance levels of different contact types.
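The retrieval part of step 3 can be sketched as follows. This is a deliberately simplified stand-in: it ranks documents by bag-of-words cosine similarity instead of sentence-transformer embeddings, so it only illustrates the "score every document against the query, keep the top k" shape of the search. The function names and the sample document names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: Dict[str, str], top_k: int = 2) -> List[Tuple[str, float]]:
    """Return the top_k (name, score) pairs, highest similarity first."""
    query_vector = tokenize(query)
    scored = [(name, cosine(query_vector, tokenize(text))) for name, text in documents.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical knowledge-base snippets keyed by document name.
docs = {
    "zone_a": "naval exercise area alpha restricted zone",
    "zone_b": "fishing harbour weather report",
}
best = retrieve("exercise area restricted", docs, top_k=1)
```

In a real RAG setup the `tokenize` and `cosine` pair would be replaced by an embedding model and a vector index, and the retrieved text would be passed to a generator as context.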

Data

  • Naval Knowledge Base (code/rag/naval_data/): Contains Markdown files with JSON objects describing various naval zones, exercise areas, and operational details. This data is likely used by the RAG system.
    • Example files: example-data.md, indian-navy-operation-zones_01-15.md, etc.
  • Maritime Reports (code/reports/): Contains example maritime reports, such as maritime-dataset-v1.md, which includes reconnaissance notes and communication messages with geographical coordinates.
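Because the knowledge base stores JSON objects inside Markdown files, a loader has to find those objects within the surrounding text. The sketch below scans a string for top-level JSON objects with the standard library's `json.JSONDecoder.raw_decode`. It makes no assumption about how the objects are delimited in the files, and the field names in the sample are invented for illustration.

```python
import json
from typing import Any, Dict, List


def extract_json_objects(markdown: str) -> List[Dict[str, Any]]:
    """Return every top-level JSON object found in the text, in order of appearance."""
    decoder = json.JSONDecoder()
    objects: List[Dict[str, Any]] = []
    position = 0
    while True:
        start = markdown.find("{", position)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns it together with the index just past its end.
            value, end = decoder.raw_decode(markdown, start)
        except json.JSONDecodeError:
            # Not valid JSON at this brace; keep scanning after it.
            position = start + 1
            continue
        if isinstance(value, dict):
            objects.append(value)
        # Resume after the parsed object so nested objects are not repeated.
        position = end
    return objects


# Hypothetical file content mixing prose and JSON.
sample = (
    "Intro text\n"
    '{"name": "Zone 1", "lat": 12.5}\n'
    "more {not json} text\n"
    '{"name": "Zone 2", "nested": {"a": 1}}\n'
)
zones = extract_json_objects(sample)
```

Text that merely contains braces is skipped, and nested objects are returned as part of their parent rather than as separate entries.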

Future Enhancements (Suggestions)

  • User authentication and authorization.
  • Persistent storage for reports and processed data (e.g., using the PostgreSQL setup from initial-config.txt).
  • More sophisticated OCR error handling and correction.
  • Advanced RAG query capabilities.
  • User interface for uploading new reports or images for OCR.
  • Filtering and searching capabilities on the map.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.
