Add Qwen2VL's ViT

Hi everyone,

I want to integrate the ViT (vision encoder) from the Qwen2-VL model into the LAVIS model module. The [documentation](https://opensource.salesforce.com/LAVIS/latest/tutorial.models.html) states:
"The LAVIS library includes a standard model module that builds the foundation for many major language-vision models such as [ALBEF](https://arxiv.org/pdf/2107.07651.pdf), [BLIP](https://arxiv.org/pdf/2201.12086.pdf), [ALPRO](https://arxiv.org/pdf/2112.09583.pdf), and [CLIP](https://arxiv.org/pdf/2103.00020.pdf)."

Could anyone guide me on how to add this custom model (the ViT of Qwen2-VL) to the module?
Do I need to implement the entire architecture and its other components myself?
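
To make the question concrete, here is a rough sketch of the kind of integration in mind. It is not a working LAVIS integration. The registry below is a hypothetical stand-in written for illustration (it is not `lavis.common.registry`), and the names `register_visual_encoder`, `Qwen2VLVisualEncoder`, and `forward_features` are assumptions. The loader assumes a Hugging Face `transformers` release that ships `Qwen2VLForConditionalGeneration` and exposes the vision tower as `.visual`, and the default checkpoint name is also an assumption.

```python
# Sketch only: a stand-in registry, NOT the real LAVIS registry.
VISUAL_ENCODERS = {}


def register_visual_encoder(name):
    """Hypothetical decorator that records an encoder class under a name."""
    def decorator(cls):
        VISUAL_ENCODERS[name] = cls
        return cls
    return decorator


def load_qwen2vl_vision_tower(model_name="Qwen/Qwen2-VL-2B-Instruct"):
    """Load the full checkpoint and keep only its vision tower.

    Assumption: the installed transformers version provides
    Qwen2VLForConditionalGeneration and exposes the ViT as `.visual`.
    """
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import Qwen2VLForConditionalGeneration

    full_model = Qwen2VLForConditionalGeneration.from_pretrained(model_name)
    return full_model.visual


@register_visual_encoder("qwen2vl_vit")
class Qwen2VLVisualEncoder:
    """Thin wrapper that reuses an existing vision tower as-is."""

    def __init__(self, vision_tower):
        self.vision_tower = vision_tower

    def forward_features(self, pixel_values, grid_thw):
        # The Qwen2-VL tower takes flattened patches plus a (t, h, w) grid
        # per image, unlike a fixed-resolution ViT that takes image batches.
        return self.vision_tower(pixel_values, grid_thw=grid_thw)
```

If something along these lines is viable, only the wrapper and its registration would need to be written, while the ViT weights and architecture would be reused from the existing checkpoint. Whether the downstream LAVIS components can consume the variable-length patch features this tower returns is exactly the part I am unsure about.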

Add Qwen2VL's ViT #787