A complete, end-to-end, modular recommendation system for news articles. It simulates or ingests data, preprocesses and engineers features, trains both classical and deep-learning recommenders, evaluates them offline, simulates A/B testing, and serves real-time recommendations via FastAPI.
- Data simulation: news metadata and user interactions (clicks, time, likes)
- Preprocessing & EDA: cleaning, normalization, notebook walkthrough
- Feature engineering: TF-IDF (content), user profiles from history
- Classical recommenders: item-based CF (cosine), truncated-SVD MF
- Deep recommender: simple two-tower (user/item id embeddings) in PyTorch
- Offline evaluation: Precision@K, Recall@K, NDCG, MAP
- A/B testing simulator: online-style evaluation harness
- Deployment: FastAPI REST service + minimal web UI
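The offline metrics listed above fit in a few lines of plain Python. The function names below (precision_at_k, ndcg_at_k) are illustrative, not the actual signatures in src/eval/metrics.py:

```python
import math

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: discounted gain relative to the ideal ordering."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(recommended[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

recs = ["a", "b", "c", "d"]
clicked = {"a", "c"}
print(precision_at_k(recs, clicked, 4))  # 0.5
```

A relevant item in position 1 is worth more than the same item in position 3, which is why NDCG is reported alongside the position-agnostic Precision@K and Recall@K.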
.
├─ requirements.txt
├─ README.md
├─ notebooks/
│ └─ 01_eda.ipynb
├─ src/
│ ├─ __init__.py
│ ├─ data/
│ │ └─ simulate.py
│ ├─ preprocess.py
│ ├─ features.py
│ ├─ models/
│ │ ├─ classical.py
│ │ └─ deep.py
│ ├─ eval/
│ │ └─ metrics.py
│ ├─ abtest/
│ │ └─ simulator.py
│ └─ pipeline.py
├─ api/
│ └─ main.py
└─ frontend/
└─ index.html
- Create a virtual environment and install dependencies:
python -m venv .venv
.venv\Scripts\activate (Windows) or source .venv/bin/activate (macOS/Linux)
pip install -r requirements.txt
- Run the training pipeline to generate artifacts:
python -m src.pipeline
This simulates data, preprocesses it, builds features, trains the recommenders, evaluates them, and saves artifacts to artifacts/.
- Start the API server:
uvicorn api.main:app --reload
- Open the minimal UI:
- Navigate to http://127.0.0.1:8000 to see recommendations for a random user.
Ingest/Simulate data → Clean & EDA → Feature engineering → Train (Classical + Deep) → Offline evaluation → A/B Simulation → Package artifacts → Serve via API → Optional frontend
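For the classical branch of that flow, item-based CF reduces to cosine similarity between item columns of the user-item matrix. A minimal NumPy sketch (the real implementation lives in src/models/classical.py and may differ in detail):

```python
import numpy as np

def item_item_scores(interactions: np.ndarray) -> np.ndarray:
    """Cosine similarity between the item columns of a user-item matrix."""
    norms = np.linalg.norm(interactions, axis=0, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for unseen items
    normalized = interactions / norms
    return normalized.T @ normalized

def recommend(interactions, user_idx, k=2):
    sim = item_item_scores(interactions)
    scores = sim @ interactions[user_idx]          # aggregate similarity to history
    scores[interactions[user_idx] > 0] = -np.inf   # mask already-seen items
    return np.argsort(-scores)[:k]

# 3 users x 4 items; 1.0 = clicked
X = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [1, 0, 0, 1]], dtype=float)
print(recommend(X, user_idx=0, k=2))
```

The truncated-SVD variant replaces the raw interaction matrix with its low-rank reconstruction before scoring, which trades exactness for generalization to unseen user-item pairs.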
notebooks/01_eda.ipynb explains the dataset, user behavior patterns, and feature distributions.
- Swap in real ingestion instead of simulation as needed.
- Baselines and deep model are simple and readable; extend as desired.
- Consider advanced embeddings (e.g., transformer text encoders) for richer content.
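On the last point: swapping TF-IDF for dense text embeddings only changes how item vectors are produced; scoring stays cosine similarity. The encode() below is a deliberately simple bag-of-words stand-in for a real transformer encoder (e.g., a sentence-transformers model), used here only so the sketch is self-contained:

```python
import numpy as np

def encode(texts):
    """Placeholder encoder: unit-norm bag-of-words vectors.
    A real system would call a transformer text encoder here and
    return dense embeddings instead."""
    vocab = {}
    bags = []
    for text in texts:
        counts = {}
        for word in text.lower().split():
            idx = vocab.setdefault(word, len(vocab))
            counts[idx] = counts.get(idx, 0) + 1.0
        bags.append(counts)
    out = np.zeros((len(texts), len(vocab)))
    for i, counts in enumerate(bags):
        for idx, c in counts.items():
            out[i, idx] = c
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return out / norms

titles = ["stock markets rally on rate cut",
          "markets rally as rates fall",
          "local team wins championship final"]
vecs = encode(titles)
sims = vecs @ vecs.T  # cosine similarity, since rows are unit-norm
# The two finance headlines score higher with each other than either does
# with the sports headline, because they share vocabulary.
```

A transformer encoder would additionally capture paraphrases with no shared vocabulary ("rates fall" vs. "borrowing gets cheaper"), which is exactly where TF-IDF-style features break down.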