A high-throughput and memory-efficient inference and serving engine for LLMs
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
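What "OpenAI format" means here is the chat-completions request schema that gateways like LiteLLM standardize across providers. A minimal sketch of that payload (the model name and prompts are illustrative, not from the source):

```python
# Sketch of the OpenAI-style chat payload that gateways such as LiteLLM
# accept for every backend (Bedrock, Azure, Anthropic, ...).
# The model string and prompt text below are illustrative assumptions.
import json


def build_chat_request(model: str, user_prompt: str, system_prompt: str = "") -> dict:
    """Assemble an OpenAI chat-completions request body."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return {"model": model, "messages": messages}


payload = build_chat_request("gpt-4o-mini", "Hello!", system_prompt="Be brief.")
print(json.dumps(payload))
```

Because every provider is addressed through this one schema, swapping backends reduces to changing the model string.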
The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.
Build multimodal AI applications with cloud-native stack
SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
Python SDK for agent AI observability, monitoring, and evaluation. Includes agent, LLM, and tool tracing, multi-agent system debugging, a self-hosted dashboard, and advanced analytics with timeline and execution-graph views.
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
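An "OpenAI-compatible API endpoint" is simply an HTTP server that answers POST `/v1/chat/completions` with the OpenAI schema, so any OpenAI client can talk to it. A stdlib-only sketch that builds (but does not send) such a request; the host, port, and model name are illustrative assumptions:

```python
# Sketch: an OpenAI-compatible server answers POST /v1/chat/completions.
# The base URL and model name below are illustrative, not from the source.
import json
import urllib.request


def chat_completion_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request for any
    OpenAI-compatible server, e.g. one exposed by a hosted LLM runtime."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = chat_completion_request("http://localhost:3000", "deepseek-r1", "Hi")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` would return the familiar OpenAI JSON response from whichever model is serving behind the URL.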
Build, Manage and Deploy AI/ML Systems
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
Open-source observability for your LLM application, based on OpenTelemetry
Superduper: End-to-end framework for building custom AI applications and agents.
ZenML: MLOps for Reliable AI, from Classical AI to Agents. https://zenml.io.
Open-Source Evaluation & Testing library for LLM Agents
cube studio: an open-source, cloud-native, one-stop machine learning / deep learning / large-model AI platform covering the full MLOps pipeline. Features include a multi-tenant algorithm platform, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, VGPU virtualization for inference serving, edge computing, automated data labeling, SFT fine-tuning / reward-model / reinforcement-learning training for large models such as DeepSeek, multi-node large-model inference with vllm/ollama/mindie, private knowledge bases, and an AI model marketplace. Supports domestic Chinese CPU/GPU/NPU hardware (Ascend ecosystem), RDMA, and distributed frameworks including pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/ray/volcano.
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
Learn for free how to build an end-to-end production-ready LLM & RAG system using LLMOps best practices: source code + 12 hands-on lessons
The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs