A containerized AI-powered web assistant. This solution leverages a microservices architecture to crawl university data, process it via NLP pipelines, and generate context-aware responses using local Large Language Models (LLMs).
The system is composed of the following Docker services:
- `webapp`: Node.js/Express/TypeScript frontend (MVC) with Bootstrap 5. Handles user interaction, request concurrency locking, and dynamic dashboard rendering.
- `ai-service`: Python/FastAPI backend. Orchestrates the NLP pipeline using LangChain and integrates with the Google Gemini API. It handles:
  - Classification: a Scikit-learn Logistic Regression model to filter relevant queries.
  - RAG (Retrieval-Augmented Generation): summarization and QA chains powered by Qwen models via Ollama, with context retrieved from PostgreSQL with `pgvector`.
  - Text-to-SQL: generates SQL queries for data analytics based on user input.
  - Data-to-Visualization: generates EJS/Chart.js snippets for dynamic dashboards.
- `db`: PostgreSQL 16 with the `pgvector` extension. Stores RAG documents, database schema information for Text-to-SQL, and dashboard history.
- `crawler`: Python/FastAPI service using `crawl4ai` (Playwright) to fetch live content from UnivPM and populate the `rag_documents` table in PostgreSQL.
- `ollama`: (optional) containerized LLM inference server. Can be replaced by a local instance for better performance on Apple Silicon/GPU.
- `ngrok`: exposes the application to the public internet.
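The classification step that gates incoming queries can be pictured as a small Scikit-learn pipeline. The following is a minimal, hypothetical sketch (toy data, and the feature setup may differ from the project's real `train_pipeline.py`):

```python
# Sketch of a relevance gate: TF-IDF features + Logistic Regression.
# The training examples below are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# 1 = relevant (university-related), 0 = irrelevant
questions = [
    "When does the enrollment period open?",
    "Where is the engineering department located?",
    "How do I pay tuition fees?",
    "What is the best pizza topping?",
    "Tell me a joke about cats",
    "Who won the football match yesterday?",
]
labels = [1, 1, 1, 0, 0, 0]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(questions, labels)

def is_relevant(query: str) -> bool:
    """Gate a query before it reaches the RAG/Text-to-SQL chains."""
    return bool(clf.predict([query])[0])
```

Only queries that pass this gate would be forwarded to the heavier LLM-backed chains.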
- Docker Desktop
- Ollama (Recommended for local execution)
- Python 3.10+ (Optional, for local training)
Copy the example environment file and configure it:
```
cp .env.example .env
```

Edit `.env` to set your configuration. If using local Ollama (recommended for Mac M1/M2/M3), ensure:

```
OLLAMA_URL=http://host.docker.internal:11434
```

To utilize your host's GPU and share models with the project, you must configure your local Ollama instance to listen on all interfaces and store models in the project directory.
Stop any running Ollama instance (e.g., from the menu bar), then run:
```
# Run this from the project root
export OLLAMA_HOST=0.0.0.0
export OLLAMA_MODELS=$(pwd)/models/models

# Start the server
ollama serve
```

Note: keep this terminal open. The `OLLAMA_MODELS` path ensures that models pulled by the project are stored within the repo structure.
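On the container side, the `OLLAMA_URL` value from `.env` can be read with a host-gateway fallback. A minimal sketch (the helper name is hypothetical, not taken from the project's code):

```python
import os

def get_ollama_url() -> str:
    # Fall back to the Docker host-gateway address, which is what the
    # containers use when Ollama runs directly on the host machine.
    return os.environ.get("OLLAMA_URL", "http://host.docker.internal:11434")
```

This is why `host.docker.internal` appears in the `.env` example: from inside a container, `localhost` would point at the container itself, not at the Ollama server running on your host.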
If you don't have a dataset, you can generate a synthetic one with university-related questions:
```
# Generate 2000 samples in data/raw/training_dataset.csv
python scripts/generate_dataset.py
```

Before starting the services, you must train the relevance classifier.
Option A: Using Docker (easiest). Start the services first, then run the training script inside the container:

```
docker-compose up -d
docker-compose exec ai-service python scripts/train_pipeline.py --input-file "data/raw/training_dataset.csv" --text-column "question" --label-column "label" --model logistic_regression
```

Option B: Local Python

```
pip install -r src/inference/requirements.txt
python scripts/train_pipeline.py \
  --input-file "data/raw/training_dataset.csv" \
  --text-column "question" \
  --label-column "label" \
  --model logistic_regression
```

Start the entire stack:

```
docker-compose up --build
```

- Web Interface: http://localhost:3000
- Public URL: check the `ngrok` service logs or visit http://localhost:4040.
- Hot Reload: the `webapp` service is configured with `nodemon`. Changes to `src/application` (TS, EJS, CSS) will trigger an automatic rebuild/restart.
- Logs: view logs for specific services:

```
docker-compose logs -f ai-service
```
```
├── data/                 # Datasets (raw, processed)
├── models/               # Trained .pkl models and Ollama blobs
├── scripts/              # Training and utility scripts
├── src/
│   ├── application/      # Node.js Web App (Frontend/BFF)
│   ├── crawler/          # Python Crawler Service
│   └── inference/        # Python AI/NLP Service
├── utils/                # Shared Python utilities (Preprocessing, Logger)
└── docker-compose.yml    # Orchestration
```
- Classifier: Logistic Regression (Scikit-learn)
- Orchestration: LangChain (Python)
- Summarization: `qwen3:0.6b` (via Ollama)
- QA/Chat: `qwen3:1.7b` (via Ollama)
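Calls to these models go through Ollama's HTTP API. A minimal sketch of building such a request (the payload fields follow Ollama's `/api/generate` endpoint; the function name is hypothetical and this does not send the request):

```python
import json
import urllib.request

def build_generate_request(model: str, prompt: str,
                           base_url: str = "http://host.docker.internal:11434"):
    """Build an HTTP request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example: a QA request against the chat model listed above.
req = build_generate_request("qwen3:1.7b", "What is UnivPM?")
```

In the project itself this plumbing is handled by LangChain's Ollama integration rather than raw HTTP calls.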
