
[Bug]: The OpenAI-compatible endpoint does not work with sglang #2228

Closed
@jvstme

Description


Steps to reproduce

Apply this configuration:

type: service
name: deepseek-r1

image: lmsysorg/sglang:latest
env:
  - MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
commands:
  - python3 -m sglang.launch_server
    --model-path $MODEL_ID
    --port 8000
    --trust-remote-code

port: 8000
# Register the model

model:
  name: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
  type: chat
  format: openai


# Uncomment to cache downloaded models
#volumes:
#  - /root/.cache/huggingface/hub:/root/.cache/huggingface/hub

# Disable authorization
auth: false

resources:
  gpu: 24GB

Try requesting the OpenAI-compatible endpoint.

Actual behaviour

curl http://localhost:3000/proxy/models/bihan/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <Token>" \
  -d '{
  "model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
  "messages": [
    {
      "role": "user",
      "content": "Hello world"
    }
  ],
  "stream": true,
  "max_tokens": 512
}'

{"detail":"Invalid chunk in model stream: 1 validation error for ChatCompletionsChunkResponse\nchoices -> 0 -> finish_reason\n  unexpected value; permitted: 'stop', 'length', 'tool_calls', 'eos_token' (type=value_error.const; given=; permitted=('stop', 'length', 'tool_calls', 'eos_token'))"}

Expected behaviour

The request streams chat completion chunks.
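For illustration, a client-side workaround (an assumption, not dstack's code) that would make such chunks parse: coerce an empty `finish_reason` to null before validation, which is what OpenAI-compatible clients expect on intermediate chunks.

```python
import json


def normalize_chunk(raw: str) -> dict:
    """Parse an SSE data payload and map finish_reason "" -> None.

    Hypothetical helper: sglang (per this report) sends "" where the
    OpenAI format uses null on intermediate stream chunks.
    """
    chunk = json.loads(raw)
    for choice in chunk.get("choices", []):
        if choice.get("finish_reason") == "":
            choice["finish_reason"] = None
    return chunk


raw = '{"choices": [{"delta": {"content": "Hello"}, "finish_reason": ""}]}'
print(normalize_chunk(raw)["choices"][0]["finish_reason"])  # prints: None
```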

dstack version

0.18.37

Server logs

Additional information

No response

Labels

bug: Something isn't working
