weight: 6
description: >
  This section contains examples of using LWS with or without a specific inference runtime.
---

This section provides practical examples of using LeaderWorkerSet (LWS) in various scenarios:

## Infrastructure Examples

- **[Horizontal Pod Autoscaler (HPA)](hpa/)** - Configure automatic scaling based on resource utilization

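As a minimal sketch of the idea behind the HPA example (all names and thresholds here are illustrative, and this assumes the LeaderWorkerSet exposes the standard Kubernetes scale subresource), an autoscaler targeting an LWS might look like:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lws-hpa                # illustrative name
spec:
  scaleTargetRef:
    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    name: my-lws               # illustrative LWS name
  minReplicas: 1
  maxReplicas: 5               # scales the number of leader+worker groups
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale out when average CPU exceeds 80%
```

See the linked example for a tested configuration and the exact metrics it recommends.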
## Inference Runtime Examples

- **[vLLM](vllm/)** - Deploy distributed inference with vLLM on GPUs/TPUs
- **[TensorRT-LLM](tensorrt-llm/)** - High-performance inference with TensorRT-LLM
- **[SGLang](sglang/)** - Structured generation language inference
- **[LlamaCPP](llamacpp/)** - CPU-based inference with LlamaCPP

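All of the runtimes above deploy on the same LeaderWorkerSet primitive: one leader pod plus a group of worker pods, replicated as a unit. A skeletal manifest (names, sizes, and images are placeholders, not taken from any specific example) has the following shape:

```yaml
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: inference-example        # placeholder name
spec:
  replicas: 2                    # number of leader+worker groups
  leaderWorkerTemplate:
    size: 4                      # pods per group (1 leader + 3 workers)
    leaderTemplate:
      spec:
        containers:
          - name: leader
            image: example.com/inference-server:latest   # placeholder image
    workerTemplate:
      spec:
        containers:
          - name: worker
            image: example.com/inference-server:latest   # placeholder image
```

Each runtime example fills in the container images, commands, and resource requests appropriate to that engine.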
Each example includes detailed configuration files, deployment instructions, and best practices for production use.