Add Qwen2VL's ViT #787

@parth1313

Hi everyone,

I want to integrate the ViT (vision encoder) from the Qwen2-VL model into the LAVIS model module. The documentation states:
"The LAVIS library includes a standard model module that builds the foundation for many major language-vision models such as ALBEF, BLIP, ALPRO, and CLIP."

Could anyone guide me on how to add this custom model (ViT of Qwen2-VL) to the module?
Do I need to implement the entire architecture and other components?
