The following is the list of model architectures that are currently supported by vLLM.
Alongside each architecture, we include some popular models that use it.

.. list-table::
   :widths: 25 25 50 5
   :header-rows: 1

   * - Architecture
     - Models
     - Example HuggingFace Models
     - :ref:`LoRA <lora>`
   * - :code:`AquilaForCausalLM`
     - Aquila
     - :code:`BAAI/Aquila-7B`, :code:`BAAI/AquilaChat-7B`, etc.
     - ✅︎
   * - :code:`BaiChuanForCausalLM`
     - Baichuan
     - :code:`baichuan-inc/Baichuan2-13B-Chat`, :code:`baichuan-inc/Baichuan-7B`, etc.
     -
   * - :code:`ChatGLMModel`
     - ChatGLM
     - :code:`THUDM/chatglm2-6b`, :code:`THUDM/chatglm3-6b`, etc.
     -
   * - :code:`DeciLMForCausalLM`
     - DeciLM
     - :code:`Deci/DeciLM-7B`, :code:`Deci/DeciLM-7B-instruct`, etc.
     -
   * - :code:`BloomForCausalLM`
     - BLOOM, BLOOMZ, BLOOMChat
     - :code:`bigscience/bloom`, :code:`bigscience/bloomz`, etc.
     -
   * - :code:`FalconForCausalLM`
     - Falcon
     - :code:`tiiuae/falcon-7b`, :code:`tiiuae/falcon-40b`, :code:`tiiuae/falcon-rw-7b`, etc.
     -
   * - :code:`GemmaForCausalLM`
     - Gemma
     - :code:`google/gemma-2b`, :code:`google/gemma-7b`, etc.
     - ✅︎
   * - :code:`GPT2LMHeadModel`
     - GPT-2
     - :code:`gpt2`, :code:`gpt2-xl`, etc.
     -
   * - :code:`GPTBigCodeForCausalLM`
     - StarCoder, SantaCoder, WizardCoder
     - :code:`bigcode/starcoder`, :code:`bigcode/gpt_bigcode-santacoder`, :code:`WizardLM/WizardCoder-15B-V1.0`, etc.
     -
   * - :code:`GPTJForCausalLM`
     - GPT-J
     - :code:`EleutherAI/gpt-j-6b`, :code:`nomic-ai/gpt4all-j`, etc.
     -
   * - :code:`GPTNeoXForCausalLM`
     - GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
     - :code:`EleutherAI/gpt-neox-20b`, :code:`EleutherAI/pythia-12b`, :code:`OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, :code:`databricks/dolly-v2-12b`, :code:`stabilityai/stablelm-tuned-alpha-7b`, etc.
     -
   * - :code:`InternLMForCausalLM`
     - InternLM
     - :code:`internlm/internlm-7b`, :code:`internlm/internlm-chat-7b`, etc.
     - ✅︎
   * - :code:`InternLM2ForCausalLM`
     - InternLM2
     - :code:`internlm/internlm2-7b`, :code:`internlm/internlm2-chat-7b`, etc.
     -
   * - :code:`LlamaForCausalLM`
     - LLaMA, LLaMA-2, Vicuna, Alpaca, Yi
     - :code:`meta-llama/Llama-2-13b-hf`, :code:`meta-llama/Llama-2-70b-hf`, :code:`openlm-research/open_llama_13b`, :code:`lmsys/vicuna-13b-v1.3`, :code:`01-ai/Yi-6B`, :code:`01-ai/Yi-34B`, etc.
     - ✅︎
   * - :code:`MistralForCausalLM`
     - Mistral, Mistral-Instruct
     - :code:`mistralai/Mistral-7B-v0.1`, :code:`mistralai/Mistral-7B-Instruct-v0.1`, etc.
     - ✅︎
   * - :code:`MixtralForCausalLM`
     - Mixtral-8x7B, Mixtral-8x7B-Instruct
     - :code:`mistralai/Mixtral-8x7B-v0.1`, :code:`mistralai/Mixtral-8x7B-Instruct-v0.1`, etc.
     - ✅︎
   * - :code:`MPTForCausalLM`
     - MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter
     - :code:`mosaicml/mpt-7b`, :code:`mosaicml/mpt-7b-storywriter`, :code:`mosaicml/mpt-30b`, etc.
     -
   * - :code:`OLMoForCausalLM`
     - OLMo
     - :code:`allenai/OLMo-1B`, :code:`allenai/OLMo-7B`, etc.
     -
   * - :code:`OPTForCausalLM`
     - OPT, OPT-IML
     - :code:`facebook/opt-66b`, :code:`facebook/opt-iml-max-30b`, etc.
     -
   * - :code:`OrionForCausalLM`
     - Orion
     - :code:`OrionStarAI/Orion-14B-Base`, :code:`OrionStarAI/Orion-14B-Chat`, etc.
     -
   * - :code:`PhiForCausalLM`
     - Phi
     - :code:`microsoft/phi-1_5`, :code:`microsoft/phi-2`, etc.
     -
   * - :code:`QWenLMHeadModel`
     - Qwen
     - :code:`Qwen/Qwen-7B`, :code:`Qwen/Qwen-7B-Chat`, etc.
     -
   * - :code:`Qwen2ForCausalLM`
     - Qwen2
     - :code:`Qwen/Qwen2-beta-7B`, :code:`Qwen/Qwen2-beta-7B-Chat`, etc.
     - ✅︎
   * - :code:`StableLmForCausalLM`
     - StableLM
     - :code:`stabilityai/stablelm-3b-4e1t`, :code:`stabilityai/stablelm-base-alpha-7b-v2`, etc.
     -

If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for instructions on how to implement support for your model.
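A model's architecture is declared in the ``architectures`` field of its HuggingFace ``config.json``, so the table above can be checked programmatically. The sketch below is illustrative only: the ``SUPPORTED_ARCHITECTURES`` mapping and ``check_support`` helper are hypothetical names mirroring this table, not part of vLLM's public API.

.. code-block:: python

    # Illustrative mapping mirroring the table above (not vLLM's API).
    # The value records whether the architecture supports LoRA adapters.
    SUPPORTED_ARCHITECTURES = {
        "AquilaForCausalLM": True,
        "BaiChuanForCausalLM": False,
        "ChatGLMModel": False,
        "DeciLMForCausalLM": False,
        "BloomForCausalLM": False,
        "FalconForCausalLM": False,
        "GemmaForCausalLM": True,
        "GPT2LMHeadModel": False,
        "GPTBigCodeForCausalLM": False,
        "GPTJForCausalLM": False,
        "GPTNeoXForCausalLM": False,
        "InternLMForCausalLM": True,
        "InternLM2ForCausalLM": False,
        "LlamaForCausalLM": True,
        "MistralForCausalLM": True,
        "MixtralForCausalLM": True,
        "MPTForCausalLM": False,
        "OLMoForCausalLM": False,
        "OPTForCausalLM": False,
        "OrionForCausalLM": False,
        "PhiForCausalLM": False,
        "QWenLMHeadModel": False,
        "Qwen2ForCausalLM": True,
        "StableLmForCausalLM": False,
    }

    def check_support(architectures: list[str]) -> tuple[bool, bool]:
        """Return (runs_on_vllm, supports_lora) for a config's architectures list."""
        for arch in architectures:
            if arch in SUPPORTED_ARCHITECTURES:
                return True, SUPPORTED_ARCHITECTURES[arch]
        return False, False

    # A config.json carries e.g. {"architectures": ["LlamaForCausalLM"], ...}
    print(check_support(["LlamaForCausalLM"]))  # (True, True)
    print(check_support(["BloomForCausalLM"]))  # (True, False)

In practice you would simply pass the model name to vLLM, e.g. ``LLM(model="meta-llama/Llama-2-13b-hf")``, and vLLM resolves the architecture from the config itself.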