Description
Your current environment
The output of `python collect_env.py`
I want to run benchmarks locally to test the performance of multimodal models. I noticed that the datasets were consolidated in #14036, but I only found multimodal dataset support in benchmark_serving. Is there any way to get throughput and latency results offline with multimodal input?
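In the meantime, a rough offline measurement can be done by timing requests around vLLM's offline `LLM.generate` API yourself. The sketch below is a minimal, hypothetical harness: `measure_offline` and `fake_generate` are illustrative names, and the commented-out vLLM call (passing `multi_modal_data` alongside the prompt) is an assumption about how you would wire in a real multimodal model, not a tested benchmark setup.

```python
import time
from statistics import mean


def measure_offline(generate_fn, requests):
    """Time each request through generate_fn; report latency and throughput."""
    latencies = []
    start = time.perf_counter()
    for req in requests:
        t0 = time.perf_counter()
        generate_fn(req)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "requests_per_s": len(requests) / elapsed,
        "mean_latency_s": mean(latencies),
    }


# Stand-in for the real model call. With vLLM you would instead do
# something like (assumption, adapt to your model):
#   llm = LLM(model="...")
#   llm.generate({"prompt": prompt, "multi_modal_data": {"image": image}})
def fake_generate(req):
    time.sleep(0.001)  # simulate inference work


requests = [{"prompt": f"describe image {i}"} for i in range(10)]
stats = measure_offline(fake_generate, requests)
print(stats)
```

Batching all requests into a single `generate` call would give a better throughput number, since vLLM schedules batched requests internally; per-request timing as above is closer to a latency measurement.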
How would you like to use vllm
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.