[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Updated Nov 13, 2024 - Python
Official repository for VisionZip (CVPR 2025)
[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"
An on-premises, OCR-free unstructured data extraction tool powered by vision language models.
[NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
[NeurIPS'24] Official PyTorch Implementation of Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding
[ICASSP 2024] The official repo for Harnessing the Power of Large Vision Language Models for Synthetic Image Detection
Code for VLM4Bio, a benchmark dataset of scientific question-answer pairs used to evaluate pretrained VLMs for trait discovery from biological images.
Official code for CVPR2025 "Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection"
BobVLM – A 1.5B multimodal model built from scratch and pre-trained on a single P100 GPU, capable of image description and moderate question answering. 🤗🎉
Official implementation of "Words2Contact: Identifying Support Contacts from Verbal Instructions Using Foundation Models" (IEEE-RAS Humanoids 2024).
VLDBench: A large-scale benchmark for evaluating Vision-Language Models (VLMs) and Large Language Models (LLMs) on multimodal disinformation detection.
Auto labelling tool for Text, Image, Video
Align llava-v1.6-mistral-7b on RLAIF-V dataset using ORPO