Add missing models to docs

vllm-project · DarkLight1337 · Oct 6, 2024 · Oct 5, 2024
commit 519f69500f77b2366a2ba83ca6b60708d04fdd40
diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
@@ -7,10 +7,12 @@ vLLM supports a variety of generative Transformer models in `HuggingFace Transfo
 The following is the list of model architectures that are currently supported by vLLM.
 Alongside each architecture, we include some popular models that use it.
 
-----
+Language-only Models
+^^^^^^^^^^^^^^^^^^^^
+
+Text Generation
+---------------
 
-Decoder-only Language Models
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 .. list-table::
   :widths: 25 25 50 5 5
   :header-rows: 1
@@ -40,6 +42,11 @@ Decoder-only Language Models
     - :code:`bigscience/bloom`, :code:`bigscience/bloomz`, etc.
     -
     - ✅︎
+  * - :code:`BartForConditionalGeneration`
+    - BART
+    - :code:`facebook/bart-base`, :code:`facebook/bart-large-cnn`, etc.
+    - 
+    - 
   * - :code:`ChatGLMModel`
     - ChatGLM
     - :code:`THUDM/chatglm2-6b`, :code:`THUDM/chatglm3-6b`, etc.
@@ -259,11 +266,58 @@ Decoder-only Language Models
 .. note::
     Currently, the ROCm version of vLLM supports Mistral and Mixtral only for context lengths up to 4096.
 
-.. _supported_vlms:
+Text Embedding
+--------------
+
+.. list-table::
+  :widths: 25 25 50 5 5
+  :header-rows: 1
+
+  * - Architecture
+    - Models
+    - Example HuggingFace Models
+    - :ref:`LoRA <lora>`
+    - :ref:`PP <distributed_serving>`
+  * - :code:`Gemma2Model`
+    - Gemma2-based
+    - :code:`BAAI/bge-multilingual-gemma2`, etc.
+    - 
+    - ✅︎
+  * - :code:`MistralModel`
+    - Mistral-based
+    - :code:`intfloat/e5-mistral-7b-instruct`, etc.
+    - 
+    - ✅︎
+
+Reward Modelling
+----------------
+
+.. list-table::
+  :widths: 25 25 50 5 5
+  :header-rows: 1
+
+  * - Architecture
+    - Models
+    - Example HuggingFace Models
+    - :ref:`LoRA <lora>`
+    - :ref:`PP <distributed_serving>`
+  * - :code:`Qwen2ForRewardModel`
+    - Qwen2-based
+    - :code:`Qwen/Qwen2.5-Math-RM-72B`, etc.
+    - 
+    - ✅︎
+
+.. note::
+    As an interim measure, these models are supported via the Embeddings API. See `this RFC <https://github.com/vllm-project/vllm/issues/8967>`_ for upcoming changes.
 
 Multimodal Language Models
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
+.. _supported_vlms:
+
+Vision-to-Text Generation
+-------------------------
+
 .. list-table::
   :widths: 25 25 25 25 5 5
   :header-rows: 1
@@ -364,6 +418,27 @@ Multimodal Language Models
     - :code:`Qwen/Qwen2-VL-2B-Instruct`, :code:`Qwen/Qwen2-VL-7B-Instruct`, :code:`Qwen/Qwen2-VL-72B-Instruct`, etc.
     -
     - ✅︎
+
+| :sup:`E` Pre-computed embeddings can be inputted for this modality.
+| :sup:`+` Multiple items can be inputted per text prompt for this modality.
+
+.. note::
+  For :code:`openbmb/MiniCPM-V-2`, the official repo doesn't work yet, so we need to use a fork (:code:`HwwwH/MiniCPM-V-2`) for now.
+  For more details, please see: https://github.com/vllm-project/vllm/pull/4087#issuecomment-2250397630
+
+Audio-to-Text Generation
+------------------------
+
+.. list-table::
+  :widths: 25 25 25 25 5 5
+  :header-rows: 1
+
+  * - Architecture
+    - Models
+    - Modalities
+    - Example HuggingFace Models
+    - :ref:`LoRA <lora>`
+    - :ref:`PP <distributed_serving>`
   * - :code:`UltravoxModel`
     - Ultravox
     - Audio\ :sup:`E+`
@@ -374,11 +449,6 @@ Multimodal Language Models
 | :sup:`E` Pre-computed embeddings can be inputted for this modality.
 | :sup:`+` Multiple items can be inputted per text prompt for this modality.
 
-.. note::
-  For :code:`openbmb/MiniCPM-V-2`, the official repo doesn't work yet, so we need to use a fork (:code:`HwwwH/MiniCPM-V-2`) for now.
-  For more details, please see: https://github.com/vllm-project/vllm/pull/4087#issuecomment-2250397630
-
-
 If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
 Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` and :ref:`Enabling Multimodal Inputs <enabling_multimodal_inputs>` 
 for instructions on how to implement support for your model.

diff --git a/docs/source/models/vlm.rst b/docs/source/models/vlm.rst
@@ -7,7 +7,7 @@ vLLM provides experimental support for Vision Language Models (VLMs). See the :r
 This document shows you how to run and serve these models using vLLM.
 
 .. important::
-    We are actively iterating on VLM support. Expect breaking changes to VLM usage and development in upcoming releases without prior deprecation.
+    We are actively iterating on VLM support. See `this RFC <https://github.com/vllm-project/vllm/issues/4194>`_ for upcoming changes.
 
     We are continuously improving user & developer experience for VLMs. Please `open an issue on GitHub <https://github.com/vllm-project/vllm/issues/new/choose>`_ if you have any feedback or feature requests.