152 changes: 151 additions & 1 deletion README.md
@@ -1 +1,151 @@
# Compressa LLM
# Deploy Platform

Pull the following images:

```shell
compressa/compressa-pod:0.3.10
compressa/compressa-entrypoint:0.3.10
compressa/compressa-autotest:0.3.10
compressa/compressa-layout-gpu:0.3.10
compressa/compressa-ui:0.3.10
compressa/compressa-auth:0.3.10
nginx:latest
opensearchproject/opensearch:2.13.0
opensearchproject/opensearch-dashboards:2.13.0
opensearchproject/data-prepper:2.6.0
```
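
If the images are not already present locally, a simple loop over the list above can pull them (same tags as listed):

```bash
# Pull every image used by the platform
for image in \
  compressa/compressa-pod:0.3.10 \
  compressa/compressa-entrypoint:0.3.10 \
  compressa/compressa-autotest:0.3.10 \
  compressa/compressa-layout-gpu:0.3.10 \
  compressa/compressa-ui:0.3.10 \
  compressa/compressa-auth:0.3.10 \
  nginx:latest \
  opensearchproject/opensearch:2.13.0 \
  opensearchproject/opensearch-dashboards:2.13.0 \
  opensearchproject/data-prepper:2.6.0
do
  docker pull "$image"
done
```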


### Resources

Folders with data and models are currently mounted into the Docker containers from the host machine.

Grant the necessary folder permissions so Docker can access them.

To change their location, edit the `RESOURCES_PATH` and `HF_HOME` variables in the `.env` files:

```shell
# file: deploy/pod/.env
RESOURCES_PATH=/data/shared/CompressaAI/<DEPLOY>
DATASET_PATH=<FOLDER FOR LOADED DATASETS (USED FOR FINE TUNING ONLY)>
HF_HOME=/data/shared/CompressaAI/<FOLDER>
...
```

`DEPLOY` folder structure:

```
DEPLOY
└── models
    └── models
```
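
A minimal sketch for creating this layout and opening it up for Docker, assuming the placeholder path from the `.env` example above:

```bash
# Placeholder deployment path; adjust to your RESOURCES_PATH
DEPLOY_DIR=/data/shared/CompressaAI/my-deploy
mkdir -p "$DEPLOY_DIR/models"
# Containers need write access to the mounted folders
chmod -R 777 "$DEPLOY_DIR"
```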

1. Change access: `chmod -R 777 build`, `chmod -R 777 test_results`, `chmod -R 777 data-prepper`
2. Create a common Docker network: `docker network create test_network`
3. Edit the config: `build/config.yaml`
4. Go to the root directory: `cd deploy/platform`
5. Load the environment variables:
   ```bash
   set -a
   source .env
   set +a
   ```
6. Run the dispatcher: `docker compose up`
7. The dispatcher generates the `deploy/platform/build/auto.yaml` file according to the requested number of pods.
8. Run the generated docker compose file: `docker compose -f ./build/auto.yaml up` (a condensed command sequence for steps 4-8 is shown after this list).
9. The pods and the dispatcher will be available at `http://localhost:8118/`.
10. Nginx will be available at `http://localhost:9999/` (authorization required, refer to the [Auth service manual](https://github.com/compressa-ai/app/blob/main/services/auth/README.md)).
11. The chat UI (`http://localhost:9999/chat/`) and the layout UI (`http://localhost:9999/ui-layout/`) will be available in the browser.
12. The OpenSearch Dashboard will be available at `http://localhost:5602/` (not connected to Nginx).
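
For reference, a condensed version of steps 4-8 as a single shell session might look like this (it assumes the Docker network from step 2 already exists and that `build/config.yaml` has been edited):

```bash
cd deploy/platform

# Load environment variables from .env
set -a
source .env
set +a

# Run the dispatcher; it generates build/auto.yaml for the requested number of pods
docker compose up

# Once auto.yaml has been generated (e.g. in another terminal), start the generated stack
docker compose -f ./build/auto.yaml up
```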

It is possible to edit the Platform's `docker-compose.yaml` file to run the Layout Model on a GPU:

```shell
# .env
...
LAYOUT_RESOURCES_PATH=/data/shared/CompressaAI/<DEPLOY>
NETWORK=test_network
PROJECT=dev
PORT=8100
LAYOUT_GPU_IDS=2
...
```
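
To pick the right value for `LAYOUT_GPU_IDS`, the GPUs visible on the host can be listed with `nvidia-smi`, for example:

```bash
# List GPU indices, names and total memory on the host
nvidia-smi --query-gpu=index,name,memory.total --format=csv
```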

```yaml
...
  unstructured-api:
    environment:
      - LAYOUT_RESOURCES_PATH=${LAYOUT_RESOURCES_PATH:-./resources} # somewhere in RESOURCES_PATH
      - PROJECT=${PROJECT:-compressa}
      - PIPELINE_PACKAGE=${PIPELINE_PACKAGE:-general}
    container_name: ${PROJECT:-compressa}-unstructured-api
    volumes:
      - ${LAYOUT_RESOURCES_PATH:-./resources}:/home/notebook-user/.cache
    ports:
      - ${PORT:-8100}:8000
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu
              device_ids:
                - ${LAYOUT_GPU_IDS:-0}
              driver: nvidia
    image: compressa/compressa-layout-gpu:0.3.9
    restart: always
    shm_size: 32g
    networks:
      - ${NETWORK:-common_network}
...
```
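
After editing, the rendered configuration (with the `.env` values substituted) can be checked before starting the service, for example:

```bash
# Print the resolved compose configuration for the service to verify GPU, ports and volumes
docker compose config unstructured-api
```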

The Layout Model will then be available through Nginx, but it is not connected to the Dispatcher.
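
A quick way to confirm the container is up is to query it on the mapped port; the `/healthcheck` route below is an assumption based on the standard Unstructured API and may differ in this image:

```bash
# Assumed healthcheck route of the Unstructured-style API on the host port from .env
curl http://localhost:8100/healthcheck
```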

---

# Deploy Pod

### Deploy only one compressa-pod instance
1. Edit the `deploy-config.json` file.
2. Run Compressa (only one instance):
   ```bash
   cd deploy/pod
   set -a
   source .env
   set +a
   docker compose up compressa-pod -d
   ```

The model will be available at `http://localhost:5000` and the service endpoints at `http://localhost:5100`.
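
Assuming the pod exposes an OpenAI-compatible API (typical for the vLLM backend), a quick sanity check of the served model could look like this; the route is an assumption, and an `Authorization` header with `COMPRESSA_API_KEY` may be required:

```bash
# Assumes an OpenAI-compatible /v1/models route on the pod
curl http://localhost:5000/v1/models
```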

### Configs

The model engine and its parameters can be specified in the config files, e.g.:

```json
{
"model_id": "mixedbread-ai/mxbai-embed-large-v1",
"served_model_name": "Compressa-Embedding",
"dtype": "float16",
"backend": "vllm",
"task": "embeddings"
}

```
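
As a purely hypothetical example of an LLM pod config, the same fields could be written to `deploy-config.json` from the shell; the field names mirror the embedding example above, while the model name and the `task` value are assumptions rather than a documented schema:

```bash
# Hypothetical LLM config; model_id, served_model_name and task are placeholders to adjust
cat > deploy-config.json <<'EOF'
{
  "model_id": "Qwen/Qwen2.5-14B-Instruct",
  "served_model_name": "Compressa-LLM",
  "dtype": "float16",
  "backend": "vllm",
  "task": "generation"
}
EOF
```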

### Chat (For LLM Pod only)

Run:
```bash
cd deploy/pod
set -a
source .env
set +a
docker compose up compressa-pod compressa-client-chat -d
```

The Chat UI will be available in the browser at `http://localhost:8501/chat`.
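
The served model can also be queried directly over HTTP. The request below is a sketch that assumes an OpenAI-compatible chat completions endpoint on the pod (typical for the vLLM backend); `<served_model_name>` is the name from your config, and an `Authorization` header with `COMPRESSA_API_KEY` may be required:

```bash
# Assumes an OpenAI-compatible /v1/chat/completions route on the pod
curl http://localhost:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "<served_model_name>",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```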
8 changes: 0 additions & 8 deletions deploy-qwen14.json

This file was deleted.

96 changes: 0 additions & 96 deletions docker-compose.yaml

This file was deleted.

111 changes: 0 additions & 111 deletions nginx.conf

This file was deleted.

14 changes: 14 additions & 0 deletions platform/.env
@@ -0,0 +1,14 @@
AUTODEPLOY=True # set False for usage with independent pods
RUN_TESTS=True # set False for usage with independent pods
UI_LOGIN=True
RESOURCES_PATH=./YOUR_RESOURCES_PATH
HF_HOME=./YOUR_HF_CACHE
COMPRESSA_API_KEY=YOUR_KEY
POD_NAME=test_pod
DISPATCHER_HOST=test-dispatcher
DISPATCHER_PORT=8118
NGINX_LISTEN_PORT=99
NGINX_TOKEN=TOKEN_1
NETWORK=test_network
PROJECT=dev
LOG_LEVEL=INFO