Closed
Description
We need to change the conversation template when we use our fine-tuned MPT 30b model.
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/api_server.py#L66
I think this feature is important especially when we use vLLM in production (with fine-tuned models).