Description
Proposal to improve performance
No response
Report of performance regression
I built vLLM from the latest source and started Qwen2-VL-7B-Instruct.
Preprocessing takes so long that the heartbeat times out:
ERROR 10-10 01:14:54 client.py:250] RuntimeError('Engine loop has died')
ERROR 10-10 01:14:54 client.py:250] Traceback (most recent call last):
ERROR 10-10 01:14:54 client.py:250] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/client.py", line 150, in run_heartbeat_loop
ERROR 10-10 01:14:54 client.py:250] await self._check_success(
ERROR 10-10 01:14:54 client.py:250] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/client.py", line 314, in _check_success
ERROR 10-10 01:14:54 client.py:250] raise response
ERROR 10-10 01:14:54 client.py:250] RuntimeError: Engine loop has died
ERROR 10-10 01:25:08 client.py:250] TimeoutError('No heartbeat received from MQLLMEngine')
ERROR 10-10 01:25:08 client.py:250] NoneType: None
DEBUG 10-10 01:25:08 client.py:144] Shutting down MQLLMEngineClient check health loop due to timeout
DEBUG 10-10 01:25:14 client.py:170] Waiting for output from MQLLMEngine.
CRITICAL 10-10 01:25:14 launcher.py:99] MQLLMEngine is already dead, terminating server process
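To make the timeline in the log easier to read, here is a small sketch that parses the timestamps and prints the gap between consecutive events. It uses only the Python standard library. The `LOG` excerpt is copied from the lines above; the regular expression, the helper names (`parse`, `gaps_seconds`) and the placeholder year are my own assumptions, since the log lines carry no year.

```python
import re
from datetime import datetime

# Three lines copied from the log above.
LOG = """\
ERROR 10-10 01:14:54 client.py:250] RuntimeError('Engine loop has died')
ERROR 10-10 01:25:08 client.py:250] TimeoutError('No heartbeat received from MQLLMEngine')
CRITICAL 10-10 01:25:14 launcher.py:99] MQLLMEngine is already dead, terminating server process
"""

# Assumed layout: "LEVEL MM-DD HH:MM:SS file:line] message".
LINE_RE = re.compile(r"^(\w+) (\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+\] (.*)$")

# The log has no year, so a placeholder is prepended (assumption);
# it only matters for computing differences within the same day.
PLACEHOLDER_YEAR = "2000"


def parse(log):
    """Return a list of (timestamp, level, message) tuples."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            level, stamp, message = match.groups()
            when = datetime.strptime(
                f"{PLACEHOLDER_YEAR}-{stamp}", "%Y-%m-%d %H:%M:%S"
            )
            events.append((when, level, message))
    return events


def gaps_seconds(events):
    """Seconds elapsed between each pair of consecutive events."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]


if __name__ == "__main__":
    events = parse(LOG)
    for (when, level, message), gap in zip(events[1:], gaps_seconds(events)):
        print(f"+{gap:.0f}s {level}: {message}")
```

On this excerpt, the heartbeat `TimeoutError` is logged 614 seconds (about 10 minutes) after the `Engine loop has died` error, and the server terminates 6 seconds after that.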
Any suggestions for improving preprocessing performance?
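To illustrate the failure mode I suspect, here is a minimal, generic `asyncio` sketch. It is not vLLM's actual code. `fake_preprocess` and `heartbeat_lateness` are hypothetical names, and `time.sleep` stands in for the slow preprocessing step. It measures how late a periodic tick runs when slow synchronous work executes on the event loop, compared with when the same work is moved to a thread with `asyncio.to_thread` (Python 3.9+).

```python
import asyncio
import time


def fake_preprocess(seconds):
    """Hypothetical stand-in for slow, synchronous image preprocessing."""
    time.sleep(seconds)


async def heartbeat_lateness(offload, work_s=0.3, due_s=0.01):
    """Return how many seconds late a tick due after `due_s` actually ran."""

    async def worker():
        if offload:
            # Run the blocking call in a worker thread; the loop stays free.
            await asyncio.to_thread(fake_preprocess, work_s)
        else:
            # Run the blocking call directly on the event loop.
            fake_preprocess(work_s)

    start = time.perf_counter()
    task = asyncio.create_task(worker())
    await asyncio.sleep(due_s)  # the "heartbeat" is due when this returns
    lateness = time.perf_counter() - start - due_s
    await task
    return lateness


if __name__ == "__main__":
    blocked = asyncio.run(heartbeat_lateness(offload=False))
    offloaded = asyncio.run(heartbeat_lateness(offload=True))
    print(f"blocking the loop:   tick late by {blocked:.3f}s")
    print(f"offloaded to thread: tick late by {offloaded:.3f}s")
```

In the blocking case the tick is delayed by roughly the full preprocessing time; in the offloaded case it runs close to on time. One caveat: `time.sleep` releases the GIL, whereas real CPU-bound preprocessing may not, so a thread alone might not be enough and a separate process could be needed. Whether any of this applies to the engine loop here is an assumption on my part.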
Misc discussion on performance
No response
Your current environment (if you think it is necessary)
The output of `python collect_env.py`
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.