
Commit b42a6d4

Update documentation from main repository
1 parent 4999a15 commit b42a6d4

File tree

2 files changed: +20 -9 lines changed


docs/stable/features/live_migration.md

Lines changed: 1 addition & 1 deletion

@@ -19,7 +19,7 @@ To run this example, we will use Docker Compose to set up a ServerlessLLM cluste
 - **At least 20 GB of host memory** (this can be adjusted by using smaller models).
 - **ServerlessLLM version 0.6**: Ensure you have `sllm==0.6` and `sllm-store==0.6` installed.
 
-##
+## Usage
 
 
 Start a local Docker-based ray cluster using Docker Compose.
docs/stable/features/peft_lora_serving.md

Lines changed: 19 additions & 8 deletions

@@ -3,23 +3,36 @@ sidebar_position: 2
 ---
 # PEFT LoRA Serving
 
+This example illustrates the process of deploying and serving a base large language model enhanced with LoRA (Low-Rank Adaptation) adapters in a ServerlessLLM cluster. It demonstrates how to start the cluster, deploy a base model with multiple LoRA adapters, perform inference using different adapters, and update or remove the adapters dynamically.
+
 ## Pre-requisites
 
 To run this example, we will use Docker Compose to set up a ServerlessLLM cluster. Before proceeding, please ensure you have read the [Quickstart Guide](../getting_started.md).
 
-We will use the following example base model & LoRA adapter
+We will use the following example base model & LoRA adapters
 - Base model: `facebook/opt-125m`
-- LoRA adapter: `peft-internal-testing/opt-125m-dummy-lora`
+- LoRA adapters:
+  - `peft-internal-testing/opt-125m-dummy-lora`
+  - `monsterapi/opt125M_alpaca`
+  - `edbeeching/opt-125m-lora`
+  - `Hagatiana/opt-125m-lora`
 
 ## Usage
 
 Start a local Docker-based ray cluster using Docker Compose.
 
-### Step 1: Clone the ServerlessLLM Repository
+### Step 1: Download the Docker Compose File
 
-If you haven't already, clone the ServerlessLLM repository:
+Download the `docker-compose.yml` file from the ServerlessLLM repository:
 ```bash
-git clone https://github.com/ServerlessLLM/ServerlessLLM.git
+# Create a directory for the ServerlessLLM Docker setup
+mkdir serverless-llm-docker && cd serverless-llm-docker
+
+# Download the docker-compose.yml file
+curl -O https://raw.githubusercontent.com/ServerlessLLM/ServerlessLLM/main/examples/docker/docker-compose.yml
+
+# Alternatively, you can use wget:
+# wget https://raw.githubusercontent.com/ServerlessLLM/ServerlessLLM/main/examples/docker/docker-compose.yml
 ```
 
 ### Step 2: Configuration

@@ -87,10 +100,8 @@ curl $LLM_SERVER_URL/v1/chat/completions \
 ```
 ### Step 5: Update LoRA Adapters
 If you wish to switch to a different set of LoRA adapters, you can still use `sllm-cli deploy` command with updated adapter configurations. ServerlessLLM will automatically reload the new adapters without restarting the backend.
-
-For example, to update the adapter (located at `ft_facebook/opt-125m_adapter1`) used by facebook/opt-125m:
 ```bash
-sllm-cli deploy --model facebook/opt-125m --backend transformers --enable-lora --lora-adapters demo_lora=ft_facebook/opt-125m_adapter1
+sllm-cli deploy --model facebook/opt-125m --backend transformers --enable-lora --lora-adapters demo-lora1=edbeeching/opt-125m-lora demo-lora2=Hagatiana/opt-125m-lora
 ```
 
 ### Step 6: Clean Up
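The `--lora-adapters` flag in the updated deploy command takes space-separated `name=path` pairs. The following self-contained sketch (a hypothetical parser written for illustration, not ServerlessLLM's own CLI code) shows how such mappings resolve to a name-to-adapter dictionary:

```python
def parse_lora_adapters(pairs):
    """Parse `name=path` adapter mappings of the form passed to
    `sllm-cli deploy --lora-adapters` (e.g. demo-lora1=edbeeching/opt-125m-lora)."""
    adapters = {}
    for pair in pairs:
        name, sep, path = pair.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected name=path, got {pair!r}")
        adapters[name] = path
    return adapters

# The adapter pairs from the updated deploy command:
mapping = parse_lora_adapters([
    "demo-lora1=edbeeching/opt-125m-lora",
    "demo-lora2=Hagatiana/opt-125m-lora",
])
```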