Start command:

```
python -m vllm.entrypoints.openai.api_server \
    --model baichuan-inc/Baichuan-13B-Chat \
    --host 0.0.0.0 \
    --port 8777 \
    --trust-remote-code \
    --dtype half
```

After about 12 hours of operation, the inference service stopped responding.

Environment:
- GPU: V100
- CUDA: 11.4

Screenshot of the problem:
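To help diagnose a hang like this, a lightweight liveness probe can distinguish "process alive but stuck" from "process dead". The sketch below polls the server's OpenAI-compatible `/v1/models` endpoint with a timeout; the base URL and timeout values are assumptions, adjust them to your deployment.

```python
import urllib.error
import urllib.request


def is_server_alive(base_url: str, timeout: float = 5.0) -> bool:
    """Probe the OpenAI-compatible server's /v1/models endpoint.

    Returns False on connection errors or timeouts, which is what a
    hung inference service typically looks like from the outside.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError, OSError):
        return False


if __name__ == "__main__":
    # Assumed to match the --host/--port flags from the start command above.
    print(is_server_alive("http://127.0.0.1:8777", timeout=3.0))
```

Running this from cron (or a systemd watchdog) every few minutes and logging the result would pinpoint when the service stopped answering, which can then be correlated with GPU memory or driver logs from around that time.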