Skip to content

Latest commit

 

History

History
104 lines (81 loc) · 3.06 KB

openai_api.md

File metadata and controls

104 lines (81 loc) · 3.06 KB

OpenAI-Compatible RESTful APIs & SDK

FastChat provides OpenAI-Compatible RESTful APIs for the its supported models (e.g. Vicuna). The following OpenAI APIs are supported:

RESTful API Server

First, launch the controller

python3 -m fastchat.serve.controller

Then, launch the model worker(s)

python3 -m fastchat.serve.model_worker --model-name 'vicuna-7b-v1.1' --model-path /path/to/vicuna/weights

Finally, launch the RESTful API server

python3 -m fastchat.serve.api_server --host localhost --port 8000

Test the API server

Chat Completions

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vicuna-7b-v1.1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Text Completions

curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vicuna-7b-v1.1",
    "prompt": "Once upon a time",
    "max_tokens": 41,
    "temperature": 0.5
  }'

Embeddings

curl http://localhost:8000/v1/create_embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vicuna-7b-v1.1",
    "input": "Hello, can you tell me a joke"
  }'

Client SDK

Assuming environment variable FASTCHAT_BASEURL is set to the API server URL (e.g., http://localhost:8000), you can use the following code to send a request to the API server:

import os
from fastchat import client

client.set_baseurl(os.getenv("FASTCHAT_BASEURL"))

completion = client.ChatCompletion.create(
  model="vicuna-7b-v1.1",
  messages=[
    {"role": "user", "content": "Hello!"}
  ]
)

print(completion.choices[0].message)

Machine Learning with Embeddings

You can use create_embedding to

To these tests, you need to download the data here. You also need an OpenAI API key for comparison.

Run with:

cd playground/test_embedding
python3 test_classification.py

The script will train classifiers based on vicuna-7b, text-similarity-ada-001 and text-embedding-ada-002 and report the accuracy of each classifier.

Todos

Some features to be implemented:

  • Streaming
  • Support of some parameters like top_p, presence_penalty
  • Proper error handling (e.g. model not found)
  • The return value in the client SDK could be used like a dict