pyemma/Argo
Argo


Argo, the ship that carried Jason and the Argonauts on their quest for the Golden Fleece

This is a playground for re-implementing model architectures from industry and academic papers in PyTorch. The primary goal is educational, and the target audience is people who would like to start their journey in machine learning and ML infrastructure. The implementation is optimized for readability and extensibility rather than peak performance.

Repo structure

  • data: functions for dataset management, such as downloading public datasets and cache management
  • embedding: scripts for generating embeddings
  • feature: functions for feature engineering; currently these mostly read data from benchmarks and use Pandas for feature engineering
  • get-started: useful notebooks to help you get familiar with common techniques and concepts in machine learning and recommendation systems
  • model: model implementations
  • trainer: a simple wrapper around the train/val/eval loop
  • server: a simple inference stack for recommendation systems, including the retrieval engine, feature server, model manager, and inference engine
  • scripts: scripts used to set up the system, such as DB ingestion
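To give a feel for what the trainer wrapper does, here is a minimal numpy sketch of a train/val loop fitting a linear model by gradient descent. The `Trainer` class and `fit` method here are illustrative names, not the repo's actual API:

```python
import numpy as np

class Trainer:
    """Minimal train/val loop sketch: illustrative only, not the repo's actual API."""
    def __init__(self, lr=0.1, epochs=50):
        self.lr, self.epochs = lr, epochs
        self.w = None

    def fit(self, X_train, y_train, X_val, y_val):
        n, d = X_train.shape
        self.w = np.zeros(d)
        for epoch in range(self.epochs):
            pred = X_train @ self.w                        # forward pass
            grad = 2 * X_train.T @ (pred - y_train) / n    # MSE gradient
            self.w -= self.lr * grad                       # gradient step
            val_loss = np.mean((X_val @ self.w - y_val) ** 2)
        return val_loss

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                                             # noiseless labels
trainer = Trainer()
val_loss = trainer.fit(X[:80], y[:80], X[80:], y[80:])
print(round(val_loss, 4))                                  # near zero on this noiseless data
```

The real trainer wraps a PyTorch model, optimizer, and data loaders, but the epoch/step/validation shape of the loop is the same.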

Preparation Steps

Embedding Based Retrieval Setup

  1. Run python movie_len_embedding.py to generate the embeddings (only collaborative embeddings are supported for now)
  2. Run python movie_len_index.py to generate the FAISS index
  3. Run python scripts/vector_db.py to ingest the embeddings into DuckDB
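For intuition, the FAISS index built in step 2 serves top-k inner-product search over the item embeddings. A brute-force numpy equivalent of exact search (what `faiss.IndexFlatIP` computes, minus the optimizations) looks like this; the sizes and the `search` helper are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
item_emb = rng.normal(size=(1000, 64)).astype(np.float32)   # pretend movie embeddings
# normalize rows so inner product equals cosine similarity
item_emb /= np.linalg.norm(item_emb, axis=1, keepdims=True)

def search(query, k=5):
    """Exact inner-product top-k over all items (what faiss.IndexFlatIP does)."""
    scores = item_emb @ query            # similarity of query to every item
    top = np.argsort(-scores)[:k]        # indices of the k highest scores
    return top, scores[top]

query = item_emb[7]                      # query with a known item's own embedding
ids, scores = search(query)
print(ids[0])                            # the item itself ranks first
```

FAISS keeps the same interface (add vectors, query top-k) but adds SIMD batching and optional approximate index structures on top.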

How to run locally

Using uv (Recommended)

  1. Install uv if you haven't already: curl -LsSf https://astral.sh/uv/install.sh | sh
  2. Install dependencies and the package: uv sync (this creates a virtual environment and installs everything)
  3. Activate the virtual environment: source .venv/bin/activate (or prefix commands with uv run instead)
  4. Run python main.py to train the model with the current env config.
  5. Run python server/ebr_server.py to start the gRPC server for embedding-based retrieval; it listens on port 50051 by default. If you use DuckDB, this step can be skipped.
  6. Run python server/inference_engine.py to start the inference server; it listens on port 8000.
  7. Run bash scripts/server_request.sh to send a dummy request (there is one for DIN and one for TransAct as of now; the request will be parameterized in the future).

Using pip (Legacy)

  1. Install the dependencies: pip install -r requirements.txt, then pip install -e .
  2. Run python main.py to train the model with the current env config.
  3. Run python server/ebr_server.py to start the gRPC server for embedding-based retrieval; it listens on port 50051 by default. If you use DuckDB, this step can be skipped.
  4. Run python server/inference_engine.py to start the inference server; it listens on port 8000.
  5. Run bash scripts/server_request.sh to send a dummy request (there is one for DIN and one for TransAct as of now; the request will be parameterized in the future).

Papers

Road Map

Modeling

  • ✅ Deep Interest Network E2E training & inference example, MovieLens Small
  • ✅ TransAct training & inference example, MovieLens Large
  • ✅ MovieLens item embedding generation: collaborative filtering, two-tower, LLM (Qwen3-Embedding is out)
  • 🚧 HSTU training & inference example, MovieLens Small
  • ✅ RQ-VAE
  • Generative retrieval via various strategies: NTP, MTP with semantic IDs, token representation with ANN
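The core idea in Deep Interest Network, for example, is to pool a user's behavior sequence with attention weights conditioned on the candidate item. A stripped-down numpy sketch, using plain dot-product scores in place of the paper's learned activation-unit MLP:

```python
import numpy as np

def din_attention_pool(behaviors, candidate):
    """Weight each behavior embedding by its relevance to the candidate, then sum.
    A simplified stand-in for DIN's learned activation unit (which uses an MLP)."""
    scores = behaviors @ candidate             # (seq_len,) relevance per behavior
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over the sequence
    return weights @ behaviors                 # (emb_dim,) candidate-aware interest vector

rng = np.random.default_rng(0)
behaviors = rng.normal(size=(8, 16))           # 8 past interactions, 16-dim embeddings
candidate = rng.normal(size=16)                # item being scored
interest = din_attention_pool(behaviors, candidate)
print(interest.shape)                          # (16,)
```

The pooled vector is then concatenated with the candidate and other features and fed to the ranking MLP; see the model directory for the full implementation.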

Data & Feature Engineering

  • ✅ Kuaishou Dataset: https://kuairand.com/
  • Ray integration (DPP reader + trainer arch)
  • Daft and Polars exploration

Infra

  • ✅ Embedding Based Retrieval (EBR): DuckDB, FAISS
  • Nearline item embedding update
  • Feature store integration: FEAST
  • Feature logging & training data generation pipeline
  • PyTorch Lightning integration
  • Reinforcement learning training infrastructure for recommendation tasks

GPU

  • GPU training & inference enablement
  • Integrate profiling, benchmarking, tuning, and monitoring for accelerator optimization
  • Optimize representative models with auto-tuning, kernel fusion, quantization, dynamic batching, etc
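As a taste of the quantization item above, symmetric per-tensor int8 weight quantization can be sketched in a few lines of numpy. This is a generic illustration, not the repo's tooling:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map [-max|w|, max|w|] to [-127, 127]."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)   # pretend weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.abs(w - w_hat).max())           # bounded by ~scale / 2
print(q.dtype, max_err < scale)
```

Real deployments add per-channel scales, activation quantization, and calibration, but the round-and-rescale core is the same.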

Reference
