openchat/openchat-3.5-1210 · Hugging Face #418
Labels
base-model
llm base models not finetuned for chat
chat-templates
llm prompt templates for chat models
llm
Large Language Models
llm-inference-engines
Software to run inference on large language models
ml-inference
Running and serving ML models.
Models
LLM and ML model repos and links
openai
OpenAI APIs, LLMs, Recipes and Evals
technical-writing
Links to deep technical writing and books
Using the OpenChat Model
We highly recommend installing the OpenChat package and using the OpenChat OpenAI-compatible API server for an optimal experience. The server is optimized for high-throughput deployment using vLLM and can run on a consumer GPU with 24GB RAM.
Installation Guide: Follow the installation guide in our repository.
Serving: Start the OpenChat OpenAI-compatible API server by running the serving command below. To enable tensor parallelism, append
--tensor-parallel-size N
to the command.
python -m ochat.serving.openai_api_server --model openchat/openchat-3.5-1210 --engine-use-ray --worker-use-ray
API Usage: Once started, the server listens at
localhost:18888
for requests and is compatible with the OpenAI ChatCompletion API specification.
Web UI: Use the OpenChat Web UI for a user-friendly experience.
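As a sketch of such a request using only the Python standard library (the host and port come from the serving section above; the "openchat_3.5" model name is an assumption, so check the server's /v1/models endpoint for the exact name):

```python
import json
import urllib.request

# ChatCompletion-style payload; "openchat_3.5" is an assumed model name.
payload = {
    "model": "openchat_3.5",
    "messages": [{"role": "user", "content": "Write a haiku about GPUs."}],
}
print(json.dumps(payload, indent=2))

# With the server running, uncomment to send the request:
# req = urllib.request.Request(
#     "http://localhost:18888/v1/chat/completions",
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```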
Online Deployment
If you want to deploy the server as an online service, use the following options:
--api-keys sk-KEY1 sk-KEY2 ...
to specify allowed API keys
--disable-log-requests --disable-log-stats --log-file openchat.log
to log only to a file.
For security purposes, we recommend using an HTTPS gateway in front of the server.
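Putting the options above together, a full online-serving invocation might look like this (a sketch only; sk-KEY1 and sk-KEY2 are placeholder keys, and the base command is the one from the Serving section):

```shell
# Sketch: serve online with restricted API keys and file-only logging.
python -m ochat.serving.openai_api_server \
    --model openchat/openchat-3.5-1210 \
    --engine-use-ray --worker-use-ray \
    --api-keys sk-KEY1 sk-KEY2 \
    --disable-log-requests --disable-log-stats --log-file openchat.log
```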
Mathematical Reasoning Mode
The OpenChat model also supports a mathematical reasoning mode. To use this mode, include
condition: "Math Correct"
in your request.
Conversation Templates
We provide several pre-built conversation templates to help you get started.
Default Mode (GPT4 Correct):
Mathematical Reasoning Mode:
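The two modes above can be sketched as a small prompt builder. This is an assumption based on the turn format implied by the end-of-turn note that follows; build_prompt is a hypothetical helper, not part of the OpenChat package:

```python
def build_prompt(messages, mode="GPT4 Correct"):
    """Assemble an OpenChat-style prompt in the given mode.

    mode is "GPT4 Correct" (default) or "Math Correct" (mathematical
    reasoning); each turn is closed with the <|end_of_turn|> token.
    """
    parts = []
    for m in messages:
        role = "User" if m["role"] == "user" else "Assistant"
        parts.append(f"{mode} {role}: {m['content']}<|end_of_turn|>")
    # A trailing assistant tag prompts the model to generate its reply.
    parts.append(f"{mode} Assistant:")
    return "".join(parts)

print(build_prompt([{"role": "user", "content": "Hello"}]))
# → GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:
print(build_prompt([{"role": "user", "content": "10.3 - 7.2 = ?"}],
                   mode="Math Correct"))
# → Math Correct User: 10.3 - 7.2 = ?<|end_of_turn|>Math Correct Assistant:
```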
NOTE: Remember to set
<|end_of_turn|>
as the end-of-generation token.
Integrated Tokenizer: The default (GPT4 Correct) template is also available as the integrated tokenizer.chat_template, which can be used instead of manually specifying the template.
Suggested labels
{ "label": "chat-templates", "description": "Pre-defined conversation structures for specific modes of interaction." }