Intelligent load balancer for distributed vLLM server clusters
agentsculptor is an experimental AI-powered development agent designed to analyze, refactor, and extend Python projects automatically. It uses an OpenAI-like planner–executor loop on top of a vLLM backend, combining project context analysis, structured tool calls, and iterative refinement. It has only been tested with gpt-oss-120b via vLLM.
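As a rough illustration of the planner–executor pattern described above, here is a minimal sketch against a vLLM server exposing the OpenAI-compatible API on localhost:8000. The endpoint, model id, and the `plan`/`execute` helpers are assumptions for illustration, not agentsculptor's actual interfaces.

```python
# Minimal planner–executor sketch against a vLLM OpenAI-compatible server.
# The base_url, model id, and helper names are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "openai/gpt-oss-120b"  # assumed model id registered with vLLM

def plan(task: str) -> str:
    """Ask the model for a numbered plan for the given task."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": "You are a planning agent. Reply with a numbered plan."},
            {"role": "user", "content": task},
        ],
    )
    return resp.choices[0].message.content

def execute(step: str) -> str:
    """Ask the model to carry out one plan step and report the result."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": f"Carry out this step and report the result:\n{step}"}],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    # Iterate: plan once, then execute each step in turn.
    for step in plan("Add type hints to utils.py").splitlines():
        if step.strip():
            print(execute(step))
```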
Deploy the Magistral-Small-2506 model using vLLM and Modal
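A minimal sketch of what such a Modal deployment can look like, assuming the Hugging Face model id `mistralai/Magistral-Small-2506` and an A100 GPU; this is not necessarily the repo's actual configuration.

```python
# Hedged sketch: serve Magistral-Small-2506 on Modal via vLLM's
# OpenAI-compatible server. GPU type and timeouts are assumptions.
import subprocess
import modal

app = modal.App("magistral-vllm")
image = modal.Image.debian_slim(python_version="3.11").pip_install("vllm")

@app.function(image=image, gpu="A100", timeout=600)
@modal.web_server(8000)
def serve():
    # Launch vLLM's OpenAI-compatible server inside the Modal container;
    # Modal proxies port 8000 to a public URL.
    subprocess.Popen(
        ["vllm", "serve", "mistralai/Magistral-Small-2506",
         "--host", "0.0.0.0", "--port", "8000"]
    )
```

Deployed with `modal deploy`, the function's URL then accepts standard OpenAI-style chat completion requests.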
[KAIST CS632] Road damage detection using YOLOv8 on a Xilinx FPGA, repair estimation with Phi-3.5 served via vLLM plus FAISS RAG, and data management via GS1 EPCISv2 with a React dashboard
Performant LLM inferencing on Kubernetes via vLLM
Terraform configuration for the vLLM production-stack on cloud-managed Kubernetes
Load testing openai/gpt-oss-20b with vLLM and Docker
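For flavor, a minimal async load-test sketch against a vLLM container serving openai/gpt-oss-20b on localhost:8000; the concurrency level, prompt, and latency summary are assumptions, not the repo's actual harness.

```python
# Minimal async load test against a vLLM server running in Docker.
# Endpoint, model id, and concurrency are illustrative assumptions.
import asyncio
import time
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

async def one_request(i: int) -> float:
    """Fire one chat completion and return its wall-clock latency."""
    start = time.perf_counter()
    await client.chat.completions.create(
        model="openai/gpt-oss-20b",
        messages=[{"role": "user", "content": f"Summarize request {i} in one line."}],
        max_tokens=64,
    )
    return time.perf_counter() - start

async def main(concurrency: int = 32) -> None:
    # Issue all requests concurrently and report simple latency stats.
    latencies = sorted(await asyncio.gather(*(one_request(i) for i in range(concurrency))))
    print(f"p50={latencies[len(latencies) // 2]:.2f}s  max={latencies[-1]:.2f}s")

asyncio.run(main())
```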
Open WebUI w/ vLLM engine 🔥 🆒
Production-grade vLLM serving with an OpenAI-compatible API, per-request LoRA routing, KEDA autoscaling on Prometheus metrics, Grafana/OTel observability, and a benchmark comparing AWQ vs GPTQ vs GGUF.
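Per-request LoRA routing in vLLM's OpenAI-compatible server works by registering adapters at startup and selecting one via the request's `model` field. A small sketch, assuming an adapter registered under the name `sql-adapter` (the adapter name and path are illustrative):

```python
# Per-request LoRA routing through vLLM's OpenAI-compatible API.
# Assumes the server was launched with adapters registered, e.g.:
#   vllm serve meta-llama/Llama-3.1-8B-Instruct \
#     --enable-lora --lora-modules sql-adapter=/adapters/sql
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Setting `model` to a registered adapter name routes this request
# through that LoRA; using the base model name skips the adapter.
resp = client.chat.completions.create(
    model="sql-adapter",
    messages=[{"role": "user", "content": "Write a query counting orders per day."}],
)
print(resp.choices[0].message.content)
```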
A simple app to generate captions for your Instagram posts using the `JoyCaption` model hosted on RunPod.io
Fine-tuned LLM for a domain use case, with inference via vLLM and serving on Ollama
[2024 Elice AI Hellothon Excellence Award (2nd Place)] A lesson-guide creator for caregiver-led cognitive activities and an interactive AI drawing-diary service for the elderly: Saem, Sam
An OCR model fine-tuned from Vintern1B (InternVL 1B) with one billion parameters. It recognizes text in a range of contexts, including handwriting, printed text, and text on real-world objects.
A UI for users to interact with an LLM served via vLLM; deployment is in progress.