Your current environment
I deploy models with the vllm/vllm-openai:v0.9.0.1 Docker image.
The command-line arguments are --tensor-parallel-size 2 and --enforce-eager.
🐛 Describe the bug
The CPU utilization is high even when no tasks are running.

This does not depend on the model: I encountered the issue with both Qwen3 and Devstral models.
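To put a number on the idle load, something like the following could be run inside the container against the server's process IDs. It is only a rough sketch, assuming a Linux /proc filesystem; the helper names (parse_cpu_ticks, cpu_percent, sample_pid) are made up for illustration and are not part of vLLM.

```python
import os
import sys
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may contain spaces,
    # so split on the last ')' before splitting the remaining fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the state (field 3), so utime (field 14) and
    # stime (field 15) are at indices 11 and 12.
    return int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before: int, ticks_after: int,
                interval_s: float, ticks_per_s: int) -> float:
    """CPU usage over the interval, where 100.0 means one fully busy core."""
    return 100.0 * (ticks_after - ticks_before) / ticks_per_s / interval_s


def sample_pid(pid: int, interval_s: float = 1.0) -> float:
    """Sample one process twice and return its CPU usage in percent."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(interval_s)
    with open(f"/proc/{pid}/stat") as f:
        after = parse_cpu_ticks(f.read())
    return cpu_percent(before, after, interval_s, ticks_per_s)


if __name__ == "__main__":
    # Usage (hypothetical file name): python idle_cpu.py <pid> [<pid> ...]
    for pid in map(int, sys.argv[1:]):
        print(f"pid {pid}: {sample_pid(pid):.1f}% CPU")
```

Running it while no requests are in flight, once per worker process, would show which processes account for the idle CPU time.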