[Bug]: multi-modal model inference fails #282

### Your current environment

```text
The output of `python collect_env.py`
```


### 🐛 Describe the bug

Test script:

``` python
import asyncio, os, sys
import openai
from pathlib import Path

# Path to root of the vLLM repository.
VLLM_PATH = os.path.join(Path(__file__).parent.parent, "vllm")
sys.path.append(VLLM_PATH)

# Imported after the sys.path change so that vLLM's `tests` package resolves.
from tests.utils import RemoteOpenAIServer

async def test_chat(client: openai.AsyncOpenAI, model_name, input):
    # non-streaming
    chat_completion = await client.chat.completions.create(
        model=model_name,
        messages=input,
        max_tokens=1000)
    message = chat_completion.choices[0].message
    print(f"Message: {message}")

def test_serving():
    
    model_name = 'llava-hf/llava-1.5-7b-hf'
    args = [
        "--max-num-seqs", "128",
        "--gpu-memory-utilization", "0.9",
        "--disable-log-requests",
        "--trust-remote-code",
        "--chat-template", f"{VLLM_PATH}/examples/template_llava.jinja",
    ]
    with RemoteOpenAIServer(model_name, args) as remote_server:
        
        client_inst = remote_server.get_async_client()
    
        image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
        
        input = [{
            "role": "user", 
            "content": [
                {"type": "text", "text": "What’s in this image?"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]
        }]
        
        asyncio.run(test_chat(client_inst, model_name, input))
        
if __name__ == "__main__":
    print("VLLM_PATH is ", VLLM_PATH)
    test_serving()
```

Running the script fails with the error below:
![image](https://github.com/user-attachments/assets/04edae26-3014-417a-b892-e0c4a2d9099f)
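
For anyone trying to narrow this down without the test harness, the request payload from the script above can be built and inspected on its own. This is only a minimal sketch: `build_image_message` is a hypothetical helper name (not part of vLLM or the OpenAI client), and the URL in the usage example is a placeholder. It does not reproduce the failure itself; it only confirms the message shape being sent.

```python
import json


def build_image_message(text: str, image_url: str) -> list[dict]:
    """Build a single user message holding one text part and one image part.

    Mirrors the `messages` structure used in the test script above.
    """
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }]


if __name__ == "__main__":
    # Placeholder URL, used only to show the payload structure.
    messages = build_image_message(
        "What's in this image?",
        "https://example.com/image.jpg",
    )
    print(json.dumps(messages, indent=2))
```

The returned list can be passed as `messages=` to `client.chat.completions.create(...)` in place of the inline literal, which makes it easier to compare a text-only request against a text-plus-image request.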

