Hi everyone,
I want to integrate the ViT (the vision encoder) from the Qwen2-VL model into the LAVIS model module. The documentation states:
"The LAVIS library includes a standard model module that builds the foundation for many major language-vision models such as ALBEF, BLIP, ALPRO, and CLIP."
Could anyone guide me on how to add this custom model (the ViT of Qwen2-VL) to the model module?
Do I need to implement the entire architecture and other components?
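To make the question concrete, here is roughly the shape I have in mind. It is only a minimal sketch, assuming models are added through a decorator-based registry (LAVIS exposes `registry.register_model` in `lavis.common.registry`). Everything below is a hypothetical stand-in so the snippet runs on its own: the `ModelRegistry` class, the name `qwen2_vl_vit`, the `Qwen2VLVisionWrapper` class, and the stub encoder, which is not the real Qwen2-VL ViT.

```python
from typing import Callable, Dict, List, Type


class ModelRegistry:
    """Hypothetical stand-in for a decorator-based model registry."""

    def __init__(self) -> None:
        self._models: Dict[str, Type] = {}

    def register_model(self, name: str) -> Callable[[Type], Type]:
        """Return a class decorator that stores the class under `name`."""
        def wrap(cls: Type) -> Type:
            if name in self._models:
                raise KeyError(f"model '{name}' is already registered")
            self._models[name] = cls
            return cls
        return wrap

    def get_model_class(self, name: str) -> Type:
        """Look up a previously registered class by name."""
        return self._models[name]


registry = ModelRegistry()


@registry.register_model("qwen2_vl_vit")  # hypothetical registry name
class Qwen2VLVisionWrapper:
    """Wraps only the vision encoder; the encoder is passed in, not reimplemented."""

    def __init__(self, encoder: Callable[[List[float]], List[float]]) -> None:
        self.encoder = encoder

    def extract_features(self, pixels: List[float]) -> List[float]:
        # Delegate to whatever encoder was injected.
        return self.encoder(pixels)


def stub_encoder(pixels: List[float]) -> List[float]:
    # Placeholder for the pretrained Qwen2-VL vision tower (assumption:
    # the real encoder would be loaded elsewhere and passed in the same way).
    return [2.0 * p for p in pixels]


model_cls = registry.get_model_class("qwen2_vl_vit")
model = model_cls(stub_encoder)
features = model.extract_features([0.5, 1.0])
```

If this pattern is right, I would hope to wrap just the vision encoder and reuse the rest of the module, rather than reimplementing the full architecture.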