[Docs] Add Docs on Limitations of VLM Support (vllm-project#5383)
ywang96 authored and jimpang committed Jun 27, 2024
1 parent 1691e08 commit e6aa5ff
Showing 2 changed files with 9 additions and 1 deletion.
1 change: 1 addition & 0 deletions docs/source/conf.py
@@ -92,6 +92,7 @@ def setup(app):
"vllm._C",
"PIL",
"numpy",
"triton",
"tqdm",
"tensorizer",
]
9 changes: 8 additions & 1 deletion docs/source/models/vlm.rst
@@ -16,6 +16,13 @@ The following :ref:`engine arguments <engine_args>` are specific to VLMs:
:prog: -m vllm.entrypoints.openai.api_server
:nodefaultconst:

.. important::
    Currently, support for vision language models in vLLM has the following limitations:

    * Only a single image input is supported per text prompt.
    * Dynamic ``image_input_shape`` is not supported: the input image will be resized to the static ``image_input_shape``. This means model output might not exactly match the Hugging Face implementation.

    We are continuously improving the user and developer experience for VLMs. Please raise an issue on GitHub if you have any feedback or feature requests.

Offline Batched Inference
-------------------------

@@ -31,7 +38,7 @@ To initialize a VLM, the aforementioned arguments must be passed to the ``LLM``
image_feature_size=576,
)
- For now, we only support a single image per text prompt. To pass an image to the model, note the following in :class:`vllm.inputs.PromptStrictInputs`:
+ To pass an image to the model, note the following in :class:`vllm.inputs.PromptStrictInputs`:

* ``prompt``: The prompt should have a number of ``<image>`` tokens equal to ``image_feature_size``.
* ``multi_modal_data``: This should be an instance of :class:`~vllm.multimodal.image.ImagePixelData` or :class:`~vllm.multimodal.image.ImageFeatureData`.
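
The two requirements above can be sketched as follows. This is a minimal illustration, not the project's own example. The helper ``build_vlm_prompt`` is hypothetical, and the ``USER: ... ASSISTANT:`` chat layout is an assumption based on LLaVA-1.5-style models. The commented-out ``llm.generate`` call shows how the prompt and image would plausibly be passed together in this vLLM version.

```python
IMAGE_TOKEN = "<image>"  # placeholder token named in the list above


def build_vlm_prompt(question: str, image_feature_size: int) -> str:
    """Prefix the text with one ``<image>`` token per image feature.

    Hypothetical helper for illustration; it is not part of vLLM.
    """
    if image_feature_size <= 0:
        raise ValueError("image_feature_size must be positive")
    # Assumed chat layout (LLaVA-1.5 style); adjust for other models.
    return IMAGE_TOKEN * image_feature_size + f"\nUSER: {question}\nASSISTANT:"


# 576 matches the image_feature_size passed to LLM(...) in the hunk above.
prompt = build_vlm_prompt("What is the content of this image?", 576)

# Assumed usage with this vLLM version (not executed here):
#   from PIL import Image
#   from vllm.multimodal.image import ImagePixelData
#   image = Image.open("example.jpg")  # hypothetical path
#   outputs = llm.generate({
#       "prompt": prompt,
#       "multi_modal_data": ImagePixelData(image),
#   })
```

Because only one image per prompt is supported, the prompt carries exactly one run of ``image_feature_size`` placeholder tokens.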