
Support ChatCompletion Endpoint in OpenAI demo server #311

Closed
@zhuohan123

Description


@infwinston Feel free to use FastChat's conversation template to implement a chat completion endpoint in our demo server. You can use the existing completion API as a reference:

@app.post("/v1/completions")
async def create_completion(raw_request: Request):
    """Completion API similar to OpenAI's API.

    See https://platform.openai.com/docs/api-reference/completions/create
    for the API specification. This API mimics the OpenAI Completion API.

    NOTE: Currently we do not support the following features:
        - echo (since the vLLM engine does not currently support
          getting the logprobs of prompt tokens)
        - suffix (the language models we currently support do not support
          suffix)
        - logit_bias (to be supported by vLLM engine)
    """
