
[Feature]: Pipeline Parallelism support for the Vision Language Models #7684

Closed
@Manikandan-Thangaraj-ZS0321

Description

🚀 The feature, motivation and pitch

If I am not mistaken, vLLM currently supports pipeline parallelism only for language models, not for vision-language models.

NotImplementedError: Pipeline parallelism is only supported for the following architectures: ['AquilaModel', 'AquilaForCausalLM', 'DeepseekV2ForCausalLM', 'InternLMForCausalLM', 'JAISLMHeadModel', 'LlamaForCausalLM', 'LLaMAForCausalLM', 'MistralForCausalLM', 'Phi3ForCausalLM', 'GPT2LMHeadModel', 'MixtralForCausalLM', 'NemotronForCausalLM', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'QWenLMHeadModel'].
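For context on what adding an architecture to this list entails: pipeline parallelism splits a model's transformer layers into contiguous stages, one per pipeline rank. The following is a minimal illustrative sketch of an even layer partitioning (the function name and shape are hypothetical, not vLLM's internal API):

```python
def partition_layers(num_layers: int, pp_size: int) -> list[range]:
    """Split num_layers decoder layers into pp_size contiguous stages.

    Earlier ranks absorb the remainder when the layer count does not
    divide evenly. Illustrative only; not vLLM's actual partitioner.
    """
    base, rem = divmod(num_layers, pp_size)
    stages, start = [], 0
    for rank in range(pp_size):
        size = base + (1 if rank < rem else 0)
        stages.append(range(start, start + size))
        start += size
    return stages

# e.g. 32 decoder layers over 4 pipeline stages
print(partition_layers(32, 4))
# → [range(0, 8), range(8, 16), range(16, 24), range(24, 32)]
```

For a vision-language model, the open design question is where the vision encoder and projector live (typically on the first stage, alongside the earliest decoder layers), which is presumably why these architectures need per-model support rather than falling out of the generic path.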

This feature would greatly benefit teams and projects working with vision-language models, allowing them to scale out their workloads efficiently and maintain performance as model sizes continue to grow.

It would also be very helpful if someone could point me to other options for achieving pipeline parallelism. Thanks in advance.
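One possible interim workaround, assuming the model in question is in vLLM's supported multimodal list: tensor parallelism does not depend on the per-architecture pipeline support above, so a vision-language model can still be sharded across GPUs within a node. A sketch of the invocation (the model name here is just an example):

```shell
# Shard the model across 2 GPUs with tensor parallelism instead of
# pipeline parallelism; --tensor-parallel-size is a standard vLLM flag.
python -m vllm.entrypoints.openai.api_server \
    --model llava-hf/llava-1.5-7b-hf \
    --tensor-parallel-size 2
```

Tensor parallelism requires fast intra-node interconnect and does not help scale across nodes the way pipeline parallelism would, so it is a partial substitute at best.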

Alternatives

No response

Additional context

No response
