Ecommerce Purchase Intent Prediction & Product Recommendations

A complete, from‑scratch demo system that simulates e‑commerce activity, predicts purchase intent at the session level, and generates product recommendations. It includes an end‑to‑end Python pipeline and a Streamlit app to explore live results.

What this project does

- Simulates realistic browsing and transactions as session‑level events over a product catalog
- Cleans the data, runs a concise EDA, and engineers user, product, and time features
- Trains a purchase‑intent model (XGBoost or LightGBM) with cross‑validation
- Builds recommenders: collaborative filtering (Surprise/SVD) and content‑based (SentenceTransformers)
- Orchestrates everything with a simple pipeline and serves a Streamlit demo
- Reports metrics (OOF AUC for classification); recommendation metrics can be added as an extension

Project structure

.
├─ app/
│  └─ streamlit_app.py           # Frontend demo
├─ artifacts/
│  ├─ figures/                   # Plots (created at run)
│  ├─ metrics/                   # Metrics JSON/CSVs
│  └─ models/                    # Trained model(s)
├─ data/
│  ├─ raw/                       # Simulated catalog/sessions
│  └─ processed/                 # Post‑clean/feature data
├─ scripts/
│  └─ run_pipeline.py            # Run end‑to‑end pipeline
├─ src/
│  ├─ data/
│  │  ├─ processing.py           # cleaning, EDA, features
│  │  └─ __init__.py
│  ├─ models/
│  │  ├─ intent.py               # intent training + CV
│  │  └─ __init__.py
│  ├─ pipeline/
│  │  ├─ orchestrator.py         # simulate→process→train→reco
│  │  └─ __init__.py
│  ├─ reco/
│  │  ├─ collab.py               # Surprise SVD CF
│  │  ├─ content.py              # SentenceTransformers content‑based
│  │  └─ __init__.py
│  ├─ simulation/
│  │  ├─ data_generator.py       # catalog + session simulator
│  │  └─ __init__.py
│  └─ utils/
│     ├─ config.py               # Paths manager
│     ├─ logging_utils.py        # Logger helper
│     └─ __init__.py
└─ requirements.txt

How to run

End‑to‑end pipeline (simulate, process, train, recommend):

python scripts/run_pipeline.py

Streamlit app (first run will auto‑train if needed):

streamlit run app/streamlit_app.py

Workflow (high level)

Data Simulation
- Build a catalog with brand/category/price/rating and simple text.
- Generate sessions with events, dwell times, segment/geo attributes, and purchase outcomes at a controllable conversion rate.
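
To make the idea of a controllable conversion rate concrete, here is a minimal sketch of a session generator. It is not the code in data_generator.py: the function name `simulate_sessions`, its parameters, and the segment/geo labels are all hypothetical, chosen only for illustration.

```python
import random

def simulate_sessions(n_sessions, conversion_rate, n_users=50, seed=0):
    """Generate toy sessions; each one converts with probability `conversion_rate`."""
    rng = random.Random(seed)                  # seeded so runs are reproducible
    segments = ["new", "returning", "loyal"]   # hypothetical segment labels
    geos = ["NA", "EU", "APAC"]                # hypothetical geo labels
    sessions = []
    for session_id in range(n_sessions):
        n_events = rng.randint(1, 20)
        sessions.append({
            "session_id": session_id,
            "user_id": rng.randrange(n_users),
            "segment": rng.choice(segments),
            "geo": rng.choice(geos),
            "n_events": n_events,
            # Total dwell time in seconds: one exponential draw (mean 30 s) per event.
            "dwell_seconds": sum(rng.expovariate(1 / 30) for _ in range(n_events)),
            # Bernoulli draw: True with probability `conversion_rate`.
            "purchased": rng.random() < conversion_rate,
        })
    return sessions

sessions = simulate_sessions(n_sessions=10_000, conversion_rate=0.08)
observed = sum(s["purchased"] for s in sessions) / len(sessions)
print(f"observed conversion rate: {observed:.3f}")
```

With enough sessions the observed rate should land close to the configured one, which is a quick sanity check after changing simulator settings.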
Cleaning & EDA
- Basic sanitization, timestamp parsing, and summary stats by segment/geo.
Feature Engineering
- Numerical features (events, dwell, price stats, tenure), time features (hour, day‑of‑week), and one‑hot encodings of categorical fields.
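
The time and one‑hot features can be sketched in a few lines of plain Python. The helper names (`time_features`, `one_hot`), the `SEGMENTS` vocabulary, and the session fields below are assumptions made for this example; the project's processing.py may organize this differently.

```python
from datetime import datetime

def time_features(timestamp):
    """Derive hour and day-of-week (Monday=0) from an ISO 8601 timestamp string."""
    dt = datetime.fromisoformat(timestamp)
    return {"hour": dt.hour, "day_of_week": dt.weekday()}

def one_hot(value, categories, prefix):
    """Encode `value` as 0/1 indicator columns, one per known category."""
    return {f"{prefix}_{c}": int(value == c) for c in categories}

SEGMENTS = ["new", "returning"]  # hypothetical category vocabulary

# Hypothetical session record; field names are illustrative only.
session = {"start_time": "2024-03-15T21:30:00", "segment": "returning", "n_events": 7}

features = {
    "n_events": session["n_events"],
    **time_features(session["start_time"]),
    **one_hot(session["segment"], SEGMENTS, prefix="segment"),
}
print(features)
```

Fixing the category vocabulary up front keeps the feature columns identical between training and scoring, even when a category is missing from a given batch.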
Model Training
- XGBoost or LightGBM trained with Stratified K‑Fold cross‑validation; reports the Out‑of‑Fold (OOF) AUC and saves the final model.
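
For readers new to out‑of‑fold evaluation, the sketch below shows the mechanics using only the standard library. A tiny class‑centroid scorer stands in for XGBoost/LightGBM, and every function name here is hypothetical; the point is only that each row is scored by a model that never saw it, and the AUC is computed once over those pooled scores.

```python
import random

def stratified_folds(labels, k, seed=0):
    """Assign each index a fold id so every fold keeps roughly the same class ratio."""
    rng = random.Random(seed)
    fold_of = [0] * len(labels)
    for cls in set(labels):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % k  # deal each class round-robin across folds
    return fold_of

def auc(labels, scores):
    """ROC AUC: probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_centroid_model(X, y):
    """Stand-in model (NOT XGBoost/LightGBM): positive mean minus negative mean."""
    def mean(cls):
        rows = [x for x, label in zip(X, y) if label == cls]
        return [sum(col) / len(rows) for col in zip(*rows)]
    return [p - n for p, n in zip(mean(1), mean(0))]

def predict_score(weights, x):
    return sum(w * v for w, v in zip(weights, x))

def oof_auc(X, y, k=5):
    fold_of = stratified_folds(y, k)
    oof = [0.0] * len(y)
    for fold in range(k):
        train = [i for i in range(len(y)) if fold_of[i] != fold]
        weights = fit_centroid_model([X[i] for i in train], [y[i] for i in train])
        for i in range(len(y)):
            if fold_of[i] == fold:
                oof[i] = predict_score(weights, X[i])  # model never saw row i
    return auc(y, oof)

# Toy data: roughly 20% positives; the first feature carries the signal.
rng = random.Random(42)
y = [int(rng.random() < 0.2) for _ in range(500)]
X = [[rng.gauss(1.0 if label else 0.0, 1.0), rng.gauss(0.0, 1.0)] for label in y]
score = oof_auc(X, y)
print(f"OOF AUC: {score:.3f}")
```

Stratifying the folds matters here because purchases are the minority class; a plain random split could leave a fold with very few positives and make the per‑fold models unstable.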
Recommendations
- Collaborative filtering (Surprise SVD) on implicit feedback derived from sessions.
- Content‑based: SentenceTransformers embeddings of product text drive similar‑item suggestions.
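
The similar‑item lookup behind a content‑based recommender reduces to a nearest‑neighbour search over embedding vectors. In the sketch below, small hand‑written vectors stand in for the SentenceTransformers embeddings so the example runs without downloading a model; the product ids, the vectors, and the `similar_items` helper are all made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similar_items(item_id, embeddings, k=2):
    """Return the k items most similar to `item_id`, excluding the item itself."""
    query = embeddings[item_id]
    scored = [
        (other, cosine(query, vec))
        for other, vec in embeddings.items()
        if other != item_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical 3-dimensional "embeddings"; real text embeddings have hundreds of dimensions.
embeddings = {
    "running_shoes": [0.9, 0.1, 0.0],
    "trail_shoes": [0.8, 0.2, 0.1],
    "yoga_mat": [0.1, 0.9, 0.2],
    "espresso_maker": [0.0, 0.1, 0.9],
}

neighbours = similar_items("running_shoes", embeddings)
print([name for name, _ in neighbours])
```

For a catalog of any real size, the pairwise loop would normally be replaced by a single matrix product over normalized embeddings.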
Serving (Demo)
- Streamlit app shows recent sessions for a user, intent scores, and a simple popularity‑weighted top‑K list.
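
One possible reading of a popularity‑weighted top‑K list is to score each product by its interaction counts, with stronger signals weighted more heavily. The sketch below follows that reading; the event types, the `EVENT_WEIGHTS` values, and the `top_k_popular` helper are assumptions and may not match what the Streamlit app actually does.

```python
from collections import Counter

# Hypothetical weights: a purchase counts for more than a view.
EVENT_WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}

def top_k_popular(events, k=3, exclude=()):
    """Rank products by weighted interaction count, skipping ids in `exclude`."""
    scores = Counter()
    for product_id, event_type in events:
        scores[product_id] += EVENT_WEIGHTS.get(event_type, 0.0)
    # Sort by score descending; break ties by product id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pid for pid, _ in ranked if pid not in exclude][:k]

events = [
    ("p1", "view"),
    ("p2", "purchase"),
    ("p1", "view"),
    ("p3", "add_to_cart"),
    ("p1", "add_to_cart"),
    ("p4", "view"),
]

print(top_k_popular(events, k=3))
print(top_k_popular(events, k=3, exclude={"p1"}))
```

The `exclude` argument is a convenient place to drop items the user has already bought or is currently viewing.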

Metrics & Visualizations

- Classification: the OOF AUC is reported in the console and logs; ROC curves, calibration plots, and feature importances are natural extensions.
- Recommendation: Precision@K/Recall@K scaffolding can be added by holding out some interactions and scoring the recommenders against them.
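
As a starting point for that scaffolding, Precision@K and Recall@K for a single user can be computed as follows. The function name and the sample ids are illustrative; `recommended` is the ranked output of any recommender and `relevant` is the set of held‑out items the user actually interacted with.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@K and Recall@K for one user's ranked recommendations."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0                     # share of the top K that is relevant
    recall = hits / len(relevant) if relevant else 0.0     # share of relevant items retrieved
    return precision, recall

# Hypothetical ranked recommendations and held-out interactions for one user.
recommended = ["p7", "p2", "p9", "p4", "p1"]
held_out = {"p2", "p4", "p8"}

for k in (3, 5):
    p, r = precision_recall_at_k(recommended, held_out, k)
    print(f"K={k}: precision={p:.2f} recall={r:.2f}")
```

Averaging these per‑user values over all users with at least one held‑out interaction gives the usual aggregate metric.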

Notes & Extensibility

- The simulator is parameterized: adjust catalog size, number of sessions, and conversion rate in the ECommerceSimulator config.
- Swap in a different intent model or add hyperparameter search (Optuna or scikit‑learn's GridSearchCV).
- Add image/text enrichment to the catalog, or integrate LightFM for hybrid recommenders.
- For real‑time serving, wrap the trained model in a small API (Flask/FastAPI) and cache recent embeddings for speed.

Disclaimer

This is a compact educational demo. The data is simulated and not representative of a real store’s nuances.

oliveira-mtcode/e-commerce-purchase-intent
