138 changes: 138 additions & 0 deletions Lisat-deploy/DEPLOYMENT.md
@@ -0,0 +1,138 @@
# LISAT Deployment Guide

This directory contains deployment options for LISAT inference:
- **Modal** - Serverless GPU deployment
- **Docker** - Self-hosted container deployment

---

## Option 1: Modal Deployment

### Prerequisites
- Install the Modal CLI: `pip install modal`
- Authenticate with your Modal account:

```bash
modal token set --token-id <YOUR_TOKEN_ID> --token-secret <YOUR_TOKEN_SECRET>
```

### Deploy the service
```bash
# Optional: preload weights to reduce first-request latency
modal run modal_app.py::warmup

# Deploy the HTTPS endpoint (prints the public URL)
modal deploy modal_app.py
```

The deployment bundles:
- `lisat_runtime.py` (model + segmentation helpers)
- Modal image with `LISAt_code`, `jquenum/LISAt-7b`, patched `transformers/utils/versions.py`
- HTTP endpoint `POST /infer` expecting JSON input

### Request format
Body fields:
- `prompt` (string, required)
- `image_url` (string, optional)
- `image_base64` (string, optional)
- `max_new_tokens` (int, optional, default `512`, capped at `1024`)

At least one of `image_url` or `image_base64` must be present. If both are sent, the base64 payload is preferred.
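The precedence rule above can be mirrored in a small helper (illustrative sketch only; the function name is ours, not part of the API):

```python
def resolve_image_source(image_url=None, image_base64=None):
    """Pick the image source for /infer: the base64 payload takes
    precedence when both fields are supplied."""
    if image_base64 is not None:
        return ("base64", image_base64)
    if image_url is not None:
        return ("url", image_url)
    # Mirrors the server's validation: one of the two fields is required.
    raise ValueError("at least one of image_url or image_base64 is required")
```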

### Response format
```json
{
"text": "Generated description",
"has_seg": true,
"mask_base64": "<PNG bytes>",
"mask_shape": [1, 1024, 1024],
"input_prompt": "Can you segment the car?",
"has_image": true,
"context_len": 4096
}
```

`mask_base64` will be `null` when no object is segmented.
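On the client side, the mask can be recovered by base64-decoding the field and handing the resulting PNG bytes to any image library. A minimal stdlib-only sketch (the helper name is ours):

```python
import base64

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # standard 8-byte PNG signature

def decode_mask(mask_base64):
    """Return the PNG bytes of the segmentation mask, or None when the
    response carried no mask (illustrative helper for the schema above)."""
    if mask_base64 is None:
        return None  # no object was segmented
    png = base64.b64decode(mask_base64)
    if png[:8] != PNG_MAGIC:
        raise ValueError("mask_base64 did not decode to a PNG")
    return png
```

From there, any PNG reader (e.g. `PIL.Image.open` on a `BytesIO`) yields an image whose dimensions should match `mask_shape`.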

### Sample requests
**URL-based image**
```bash
curl -X POST "$MODAL_DEPLOY_URL/infer" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Segment the stadium in this aerial photo",
"image_url": "https://example.com/aerial.png"
}'
```

**Base64 image**
```bash
IMG_B64=$(base64 -w0 image.png)  # GNU base64; on macOS use: base64 -i image.png
curl -X POST "$MODAL_DEPLOY_URL/infer" \
-H "Content-Type: application/json" \
-d "{
\"prompt\": \"Find the red car\",
\"image_base64\": \"${IMG_B64}\",
\"max_new_tokens\": 256
}"
```

### Notes & troubleshooting
- The Modal image installs `transformers==4.31.0` and applies runtime patches.
- `git-lfs` clones both `LISAt_code` and the HF weights inside the container.
- Use `modal app logs <app-name>` to inspect runtime errors.

---

## Option 2: Docker Deployment

### Build the image
```bash
cd Lisat-deploy
docker build -t lisat:latest .
```

### Run API Server
```bash
# Start the FastAPI server on port 8000
docker run --gpus all -p 8000:8000 lisat:latest
```

Test with:
```bash
curl -X POST "http://localhost:8000/segment" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Segment the building",
"image_url": "https://example.com/image.png"
}'
```

### Run Jupyter Lab
```bash
# Start Jupyter Lab on port 8888
docker run --gpus all -p 8888:8888 lisat:latest jupyter
```

Then open http://localhost:8888 in your browser.

### Run Custom Command
```bash
# Run a custom Python script
docker run --gpus all lisat:latest python /app/LISAt_code/demo.py
```

---

## Directory Structure

```
Lisat-deploy/
├── Dockerfile # Docker build file
├── entrypoint.sh # Container entrypoint
├── inference_server.py # FastAPI server for Docker
├── modal_app.py # Modal serverless deployment
├── lisat-new.ipynb # Reference Jupyter notebook
└── DEPLOYMENT.md # This file
```

91 changes: 91 additions & 0 deletions Lisat-deploy/Dockerfile
@@ -0,0 +1,91 @@
# LISAT Docker Image
# Supports both Jupyter notebook and direct Python inference
#
# Build: docker build -t lisat:latest .
# Run API: docker run --gpus all -p 8000:8000 lisat:latest
# Run Jupyter: docker run --gpus all -p 8888:8888 lisat:latest jupyter

FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04

# Prevent interactive prompts
ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1

# Install system dependencies (Ubuntu 22.04 ships Python 3.10; python3.12 is not in its default apt repos)
RUN apt-get update && apt-get install -y --no-install-recommends \
git \
git-lfs \
ffmpeg \
libgl1 \
curl \
python3.10 \
python3.10-venv \
python3-pip \
&& rm -rf /var/lib/apt/lists/* \
&& git lfs install --system

# Make the apt-installed Python 3.10 the default `python`/`python3`
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1 \
&& update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1

WORKDIR /app

# Install Python dependencies
RUN pip install --upgrade pip && pip install uv

# Core dependencies
RUN uv pip install --system \
einops \
numpy \
Pillow \
requests \
torch==2.3.0 \
torchvision==0.18.0 \
accelerate \
safetensors \
regex \
sentencepiece \
fastapi \
uvicorn \
packaging \
pyyaml \
filelock \
tqdm \
protobuf \
scipy \
matplotlib \
jupyterlab

# Install specific versions of HF libraries (order matters - no-deps to avoid conflicts)
RUN uv pip install --system --no-cache-dir --no-deps \
huggingface-hub==0.26.1 \
transformers==4.31.0 \
tokenizers==0.14.0 \
peft==0.4.0

# Patch transformers to disable strict version checks
# Patch transformers to disable strict version checks (locate the file via the
# installed package rather than hard-coding a Python-version-specific path)
RUN sed -i 's/require_version_core(deps\[pkg\])/# require_version_core(deps[pkg])/g' \
"$(python3 -c 'import transformers, os; print(os.path.join(os.path.dirname(transformers.__file__), "dependency_versions_check.py"))')"

# Clone LISAT code and model weights
RUN git clone --depth 1 https://github.com/wildcraft958/LISAt_code.git /app/LISAt_code
RUN git clone https://huggingface.co/jquenum/LISAt-7b /app/LISAt-7b

# Set environment
ENV PYTHONPATH=/app/LISAt_code
ENV HF_HOME=/app/.cache/huggingface
ENV MODEL_PATH=/app/LISAt-7b

# Copy inference server
COPY inference_server.py /app/inference_server.py

# Expose ports
EXPOSE 8000 8888

# Default: run API server
# Override with "jupyter" to run Jupyter Lab
COPY entrypoint.sh /app/entrypoint.sh
RUN chmod +x /app/entrypoint.sh

ENTRYPOINT ["/app/entrypoint.sh"]
CMD ["api"]
14 changes: 14 additions & 0 deletions Lisat-deploy/entrypoint.sh
@@ -0,0 +1,14 @@
#!/bin/bash
set -e

if [ "$1" = "jupyter" ]; then
echo "Starting Jupyter Lab..."
cd /app/LISAt_code
# Auth disabled for local convenience; do not expose this port publicly
exec jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root --ServerApp.token='' --ServerApp.password=''
elif [ "$1" = "api" ]; then
echo "Starting LISAT API server..."
exec python /app/inference_server.py
else
# Run custom command
exec "$@"
fi