Merged
2 changes: 1 addition & 1 deletion tools/dynamic-lora-sidecar/Dockerfile
@@ -1,4 +1,4 @@
-FROM python:3.9-slim-buster AS test
+FROM python:3.10-slim-buster AS test

WORKDIR /dynamic-lora-reconciler-test
COPY requirements.txt .
33 changes: 33 additions & 0 deletions tools/dynamic-lora-sidecar/Makefile
@@ -0,0 +1,33 @@
# Makefile for dynamic-lora-sidecar

PYTHON_VERSION := 3.10
VENV_DIR := venv
PYTHON := $(VENV_DIR)/bin/python
PIP := $(VENV_DIR)/bin/pip

.PHONY: help venv install test clean

help: ## Show available targets
	@echo "Available targets:"
	@echo " venv - Create virtual environment"
	@echo " install - Install dependencies"
	@echo " test - Run unit tests"
	@echo " clean - Clean up virtual environment"

venv: $(VENV_DIR)/bin/activate ## Create virtual environment

$(VENV_DIR)/bin/activate:
	python$(PYTHON_VERSION) -m venv $(VENV_DIR)

install: venv ## Install dependencies
	$(PIP) install --upgrade pip
	$(PIP) install -r requirements.txt

test: install ## Run unit tests
	$(PYTHON) -m unittest discover -v -s sidecar

clean: ## Clean up virtual environment
	rm -rf $(VENV_DIR)
	rm -rf .pytest_cache
	find . -name "*.pyc" -delete
	find . -name "__pycache__" -type d -exec rm -rf {} +
15 changes: 13 additions & 2 deletions tools/dynamic-lora-sidecar/README.md
@@ -1,6 +1,6 @@
# Dynamic LORA Adapter Sidecar for vLLM

-This is a sidecar-based tool to help rolling out new LoRA adapters to a set of running vLLM model servers. The user deploys the sidecar with a vLLM server, and using a ConfigMap, the user can express their intent as to which LoRA adapters they want to have the running vLLM servers to be configure with. The sidecar watches the ConfigMap and sends load/unload requests to the vLLM container to actuate on the user intent.
+This is a sidecar-based tool to help rolling out new LoRA adapters to a set of running vLLM model servers. The user deploys the sidecar with a vLLM server, and using a ConfigMap, the user can express their intent as to which LoRA adapters they want to have the running vLLM servers to be configure with. The sidecar watches the ConfigMap and sends load/unload requests to the vLLM container to actuate on the user intent.
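
The load/unload behavior described in this paragraph can be pictured as a set difference between the adapters the ConfigMap asks for and the adapters the server currently has. The following is a minimal sketch of that idea only, not the sidecar's actual code: the function name `plan_adapter_changes` and the adapter names are hypothetical, and the HTTP requests to vLLM are deliberately left out.

```python
def plan_adapter_changes(desired, loaded):
    """Return (to_load, to_unload) so the loaded set converges on the desired set.

    Hypothetical helper for illustration; the real sidecar may structure this differently.
    """
    desired_set, loaded_set = set(desired), set(loaded)
    to_load = sorted(desired_set - loaded_set)      # wanted but not yet on the server
    to_unload = sorted(loaded_set - desired_set)    # on the server but no longer wanted
    return to_load, to_unload


# Hypothetical adapter names, used only to show the shape of the result.
to_load, to_unload = plan_adapter_changes(
    desired=["sql-lora", "tweet-summary"],
    loaded=["tweet-summary", "old-adapter"],
)
print(to_load)    # ['sql-lora']
print(to_unload)  # ['old-adapter']
```

Computing the plan separately from sending requests keeps this step easy to unit test without a running model server.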

## Overview

@@ -48,6 +48,17 @@ The sidecar uses the vLLM server's API to load or unload adapters based on the c
```
Do not use subPath, since configmap updates are not reflected in the file

## Development

For local development and testing, use the provided Makefile:

```bash
make venv # Create Python 3.10 virtual environment
make install # Install dependencies
make test # Run unit tests
make clean # Clean up
```
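
`make test` runs `python -m unittest discover -v -s sidecar`, and by default unittest discovery collects files whose names match `test*.py`. As a rough illustration of the kind of file that would be picked up, here is a sketch of a test module; the file name `sidecar/test_example.py`, the helper `adapter_ids`, and the `id` field are all hypothetical and do not come from this change.

```python
import unittest


def adapter_ids(adapters):
    """Hypothetical helper: collect the `id` field of each adapter entry, in order."""
    return [adapter["id"] for adapter in adapters]


class AdapterIdsTest(unittest.TestCase):
    """Discovered because the class subclasses TestCase and methods start with `test`."""

    def test_preserves_order(self):
        self.assertEqual(adapter_ids([{"id": "a"}, {"id": "b"}]), ["a", "b"])

    def test_empty_list(self):
        self.assertEqual(adapter_ids([]), [])
```

No `unittest.main()` call is needed here, since the discovery command loads and runs the test cases itself.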

## Command Line Arguments

The sidecar supports the following command-line arguments:
@@ -59,7 +70,7 @@ The sidecar supports the following command-line arguments:
- `--config-validation`: Enable config validation (default: True)

## Configuration Fields
-- `vLLMLoRAConfig`[**required**] base key
+- `vLLMLoRAConfig`[**required**] base key
- `host` [*optional*] Model server's host. defaults to localhost
- `port` [*optional*] Model server's port. defaults to 8000
- `name` [*optional*] Name of this config
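
For the fields visible in this hunk, the documented defaults (`host` falls back to localhost, `port` to 8000) could be applied roughly as follows. This is a sketch under assumptions: it takes a config that has already been parsed into a Python dict, the function name `resolve_server_config` is hypothetical, and fields outside this hunk are not handled.

```python
# Defaults as stated in the field list: host -> localhost, port -> 8000.
DEFAULTS = {"host": "localhost", "port": 8000}


def resolve_server_config(config):
    """Fill in documented defaults for optional fields under `vLLMLoRAConfig`.

    Hypothetical helper for illustration; `config` is assumed to be an already-parsed dict.
    """
    if "vLLMLoRAConfig" not in config:
        raise KeyError("missing required base key: vLLMLoRAConfig")
    section = config["vLLMLoRAConfig"] or {}
    # Values given by the user override the defaults.
    return {**DEFAULTS, **section}


resolved = resolve_server_config({"vLLMLoRAConfig": {"name": "example", "port": 8001}})
print(resolved)  # {'host': 'localhost', 'port': 8001, 'name': 'example'}
```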