
Commit c1e9ac6

add hpa docs
1 parent 6dbd79b commit c1e9ac6

2 files changed: +399 -0 lines changed

site/content/en/docs/examples/_index.md

Lines changed: 15 additions & 0 deletions
@@ -5,3 +5,18 @@ weight: 6
 description: >
   This section contains examples of using LWS with or without specific inference runtime.
 ---
+
+This section provides practical examples of using LeaderWorkerSet (LWS) in various scenarios:
+
+## Infrastructure Examples
+
+- **[Horizontal Pod Autoscaler (HPA)](hpa/)** - Configure automatic scaling based on resource utilization
+
+## Inference Runtime Examples
+
+- **[vLLM](vllm/)** - Deploy distributed inference with vLLM on GPUs/TPUs
+- **[TensorRT-LLM](tensorrt-llm/)** - High-performance inference with TensorRT-LLM
+- **[SGLang](sglang/)** - Structured generation language inference
+- **[LlamaCPP](llamacpp/)** - CPU-based inference with LlamaCPP
+
+Each example includes detailed configuration files, deployment instructions, and best practices for production use.
