Closed
Description
Your current environment
vLLM 0.8.3, serving the deepseek-32B model on four RTX 4090 GPUs.
🐛 Describe the bug
With vLLM 0.8.3, I launched the deepseek-32B model on four 4090 GPUs. Even when no inference requests are running, CPU usage stays at 100%. Is this normal?
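To narrow this down, it may help to measure CPU usage per process rather than reading a single overall figure, since tools like `top` report 100% for one fully busy core. Below is a minimal sketch, assuming a Linux host, that samples `/proc/<pid>/stat` twice and converts the tick delta into a percentage of one core. The helper names (`parse_cpu_ticks`, `cpu_percent`, `sample_pid`) are made up for illustration and are not part of vLLM.

```python
import os
import sys
import time

# Kernel clock ticks per second, used to convert tick counts to seconds.
CLK_TCK = os.sysconf("SC_CLK_TCK")


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # Field 2 (the command name) is parenthesized and may contain spaces,
    # so split after the last ')' before tokenizing the rest.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, elapsed_s, clk_tck=CLK_TCK):
    """CPU usage over the interval, as a percentage of ONE core."""
    return 100.0 * (ticks_after - ticks_before) / clk_tck / elapsed_s


def sample_pid(pid, interval_s=1.0):
    """Sample one process twice and return its CPU usage in percent."""
    def read():
        with open(f"/proc/{pid}/stat") as f:
            return parse_cpu_ticks(f.read())

    before = read()
    time.sleep(interval_s)
    # Uses the nominal interval; good enough for a rough reading.
    return cpu_percent(before, read(), interval_s)


if __name__ == "__main__":
    # Usage: python cpu_sample.py <pid> [<pid> ...]
    for pid in map(int, sys.argv[1:]):
        print(pid, f"{sample_pid(pid):.1f}%")
```

Running it against the PIDs of the server and its worker processes (for example, those listed by `nvidia-smi`) would show whether each process holds roughly one core while idle or whether the whole machine is saturated, which is useful detail to include in the report.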