
Improve examples in fal-serverless #881

Merged · 1 commit · Jul 3, 2023
2 changes: 1 addition & 1 deletion docsite/docs/fal-serverless/authentication/_category_.yml
@@ -1,5 +1,5 @@
label: "Authentication"
-position: 2
+position: 3
collapsible: false
collapsed: false
link:
2 changes: 1 addition & 1 deletion docsite/docs/fal-serverless/examples/_category_.yml
@@ -1,5 +1,5 @@
label: "Examples"
-position: 8
+position: 2
collapsible: false
collapsed: false
link:
168 changes: 0 additions & 168 deletions docsite/docs/fal-serverless/examples/chat.md

This file was deleted.

docsite/docs/fal-serverless/examples/controlnet.md
@@ -1,7 +1,11 @@
-# Deploying a Custom ControlNet Model Using fal-serverless
+---
+sidebar_position: 2
+---

-fal-serverless is a serverless platform that enables you to run Python functions on cloud infrastructure. In this example, we will demonstrate how to use fal-serverless for deploying a custom ControlNet model.
+# Restyle Room Photos with ControlNet
+In this example, we will demonstrate how to use fal-serverless for deploying a ControlNet model.

+## 1. Create a new file called controlnet.py
```python
from __future__ import annotations
from fal_serverless import isolated, cached
@@ -10,7 +14,6 @@ from pathlib import Path
import base64
import io


requirements = [
"controlnet-aux",
"diffusers",
Expand All @@ -21,12 +24,6 @@ requirements = [
"xformers"
]

-def image_to_base64(image_path):
-    with open(image_path, "rb") as image_file:
-        encoded_string = base64.b64encode(image_file.read()).decode("utf-8")
-    return encoded_string

def read_image_bytes(file_path):
    with open(file_path, "rb") as file:
        image_bytes = file.read()
@@ -69,10 +66,6 @@ def resize_image(input_image, resolution):
    )
    return img

-def save_image_from_bytes(image_bytes, output_path):
-    with open(output_path, "wb") as file:
-        file.write(image_bytes)

@isolated(
    requirements=requirements,
    machine_type="GPU",
@@ -91,7 +84,6 @@ def generate(
    pipe = load_model()
    image = Image.open(io.BytesIO(image_bytes))


    canny = CannyDetector()
    init_image = image.convert("RGB")

@@ -123,3 +115,23 @@ def generate(
    list_of_bytes = [read_image_bytes(out_dir / f) for f in file_names]
    return list_of_bytes
```
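
Before deploying, you can smoke-test the function from a plain Python script. A minimal sketch — the keyword arguments mirror the JSON body used in step 3 below and are assumptions, since part of the `generate` signature is collapsed in the diff above:

```python
# test_controlnet.py — local smoke test; the keyword names are assumptions
# (part of the generate() signature is collapsed in the diff above)
from pathlib import Path

from controlnet import generate

# Calling an @isolated function runs it remotely on the configured GPU machine
results = generate(
    image_url="https://restore.tchabitat.org/hubfs/blog/2019%20Blog%20Images/July/Old%20Kitchen%20Cabinets%20-%20Featured%20Image.jpg",
    prompt="scandinavian kitchen",
    num_samples=1,
    num_steps=30,
)

# generate() returns a list of image bytes
Path("restyled-0.png").write_bytes(results[0])
```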

## 2. Deploy the model as an endpoint
To use this fal-serverless function as an API, you can serve it with the `fal-serverless` CLI command:

```bash
fal-serverless fn serve controlnet.py generate --alias controlnet --auth public
```

This will return a URL like:
```
Registered a new revision for function 'controlnet' (revision='c75db134-23f0-4863-94cd-3358d6c8d94c').
URL: https://user_id-controlnet.gateway.alpha.fal.ai
```

## 3. Test it out
```bash
curl https://user_id-controlnet.gateway.alpha.fal.ai/ -H 'content-type: application/json' -H 'accept: application/json, */*;q=0.5' -d '{"image_url":"https://restore.tchabitat.org/hubfs/blog/2019%20Blog%20Images/July/Old%20Kitchen%20Cabinets%20-%20Featured%20Image.jpg","prompt":"scandinavian kitchen","num_samples":1,"num_steps":30}'
```

This should return a JSON with the image encoded in base64.
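
To save the result locally, decode the base64 payload. A small sketch, assuming the response body is a JSON list of base64-encoded images (matching the list of image bytes that `generate` returns):

```python
# decode_response.py — pipe the curl output into this script, e.g.
#   curl ... | python decode_response.py
# The JSON shape (a flat list of base64 strings) is an assumption.
import base64
import json
import sys

payload = json.load(sys.stdin)
for i, encoded in enumerate(payload):
    with open(f"output-{i}.png", "wb") as f:
        f.write(base64.b64decode(encoded))
```
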
2 changes: 1 addition & 1 deletion docsite/docs/fal-serverless/examples/image-restoration.md
@@ -2,7 +2,7 @@
sidebar_position: 4
---

-# Image restoration with Transformers
+# Restore Old Images with Transformers

In this example, we will demonstrate how to use the [SwinIR](https://github.com/JingyunLiang/SwinIR) library and fal-serverless to restore images. SwinIR is an image restoration library built on the [Swin Transformer](https://arxiv.org/abs/2103.14030), a neural network architecture designed for processing images. It is similar to the popular Vision Transformer (ViT), but its hierarchical structure lets it process images more efficiently. SwinIR restores images with a pre-trained Swin Transformer.

107 changes: 107 additions & 0 deletions docsite/docs/fal-serverless/examples/llama.md
@@ -0,0 +1,107 @@
---
sidebar_position: 3
---

# Run LLMs with llama.cpp (OpenAI API Compatible Server)

In this example, we will demonstrate how to use fal-serverless to deploy any Llama-based language model and serve it through an OpenAI-API-compatible server with SSE.

## 1. Use the already deployed example

If you want to use an already deployed API, here is a public endpoint running on a T4:

https://110602490-llama-server.gateway.alpha.fal.ai/docs

To see this API in action:

```bash
curl -X POST -H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-H "Authorization: Access-Control-Allow-Origin: *" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "can you write a happy story"
      }
    ],
    "stream": true,
    "model": "gpt-3.5-turbo",
    "max_tokens": 2000
  }' \
  https://110602490-llama-server.gateway.alpha.fal.ai/v1/chat/completions
```

This should return a streaming response.
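
The same stream can be consumed from Python. A minimal sketch using `requests`, assuming the server follows the standard OpenAI SSE framing (`data: ...` lines ending with `data: [DONE]`):

```python
# stream_chat.py — prints the assistant's reply token by token
import json

import requests

url = "https://110602490-llama-server.gateway.alpha.fal.ai/v1/chat/completions"
body = {
    "messages": [{"role": "user", "content": "can you write a happy story"}],
    "stream": True,
    "model": "gpt-3.5-turbo",
    "max_tokens": 2000,
}

with requests.post(url, json=body, stream=True) as resp:
    for line in resp.iter_lines():
        # SSE frames look like: data: {...chunk...}
        if not line or not line.startswith(b"data: "):
            continue
        chunk = line[len(b"data: "):]
        if chunk == b"[DONE]":
            break
        delta = json.loads(chunk)["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)
print()
```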

## 2. Deploy your own version

In this example, we will use the conda backend so that we can install CUDA dependencies. First, create the files below:

**llama_cpp_env.yml**

```yaml
name: myenv
channels:
  - conda-forge
  - nvidia/label/cuda-12.0.1
dependencies:
  - cuda-toolkit
  - pip
  - pip:
      - llama-cpp-python[server]
      - cmake
      - setuptools
```

**llama_cpp.py**

```python
from fal_serverless import isolated, cached

MODEL_URL = "https://huggingface.co/TheBloke/Vicuna-7B-CoT-GGML/resolve/main/vicuna-7B-cot.ggmlv3.q4_0.bin"
MODEL_PATH = "/data/models/vicuna-7B-cot.ggmlv3.q4_0.bin"

@isolated(
    kind="conda",
    env_yml="llama_cpp_env.yml",
    machine_type="M",
)
def download_model():
    print("---> This is download_model()")
    import os

    if not os.path.exists("/data/models"):
        os.system("mkdir /data/models")
    if not os.path.exists(MODEL_PATH):
        print("Downloading the model.")
        os.system(f"cd /data/models && wget {MODEL_URL}")

@isolated(
    kind="conda",
    env_yml="llama_cpp_env.yml",
    machine_type="GPU-T4",
    exposed_port=8080,
    keep_alive=30
)
def llama_server():
    import uvicorn
    from llama_cpp.server import app

    settings = app.Settings(model=MODEL_PATH, n_gpu_layers=96)

    server = app.create_app(settings=settings)
    uvicorn.run(server, host="0.0.0.0", port=8080)
```

This script has two main functions: one to download the model, and one to start the server.

We first need to download the model. You do this by calling `download_model()` from a Python context, as shown below.
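
A quick sketch, assuming the script above is saved as `llama_cpp.py` in your working directory:

```python
# Runs remotely on an M machine because download_model() is @isolated;
# the weights are cached under /data/models for later runs.
from llama_cpp import download_model

download_model()
```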

We then deploy this as a public endpoint:

```bash
fal-serverless function serve llama_cpp.py llama_server --alias llama-server --auth public
```

This should return a URL, which you can use just like the one above. The first deploy might take a little while.
2 changes: 1 addition & 1 deletion docsite/docs/fal-serverless/examples/sentiment-analysis.md
@@ -1,5 +1,5 @@
---
-sidebar_position: 1
+sidebar_position: 6
---

# Sentiment Analysis with dbt