serving-ml

Here are 5 public repositories matching this topic...

jjiantong / Awesome-KV-Cache-Optimization

[ACL 2026] Towards Efficient Large Language Model Serving: A Survey on System-Aware KV Cache Optimization

machine-learning ai system computer-architecture neural-language-processing mlsys kv-cache serving-ml llm llm-serving llm-inference

Updated Apr 21, 2026
Python

clearml / clearml-serving

ClearML - Model-Serving Orchestration and Repository Solution

kubernetes devops machine-learning ai deep-learning triton tensorflow-serving model-serving serving mlops serving-pytorch-models triton-inference-server clearml serving-ml

Updated Mar 12, 2026
Python

stsxxx / MoDM

MoDM is a cache-aware, hybrid serving system that accelerates image generation by dynamically combining small and large diffusion models for efficient, high-quality output.

diffusion-models serving-ml ml-efficiency

Updated Aug 8, 2025
Python

omrylcn / serving-blueprint

An async ML service built with FastAPI, Celery, RabbitMQ, and Redis for efficient, scalable ML model serving

rabbitmq ml celery fastapi serving-ml

Updated Jul 26, 2025
Python

COSS-India / dhruva-dpg

Dhruva is a full-fledged DPG platform for serving AI models at scale.

open-source sustainability model dpg serving-ml

Updated Dec 19, 2025
Python
