[Usage]: run gguf model need template，how to write？ 

### Your current environment

BadRequestError: Error code: 400 - {'object': 'error', 'message': 'As of transformers v4.44, default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one.', 'type': 'BadRequestError', 'param': None, 'code': 400}


### How would you like to use vllm


 CUDA_VISIBLE_DEVICES=1 vllm serve  /ai/qwen1.5-1.8b.gguf --host 0.0.0.0  --port 10868 --max-model-len 4096   --trust-remote-code --tensor-parallel-size 1  --dtype=half --quantization gguf --load-format gguf

### Before submitting a new issue...

- [X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[Usage]: run gguf model need template，how to write？ #7978

Your current environment

How would you like to use vllm

Before submitting a new issue...

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[Usage]: run gguf model need template，how to write？ #7978

Description

Your current environment

How would you like to use vllm

Before submitting a new issue...

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions