[ICCVW 25] LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning (Python; updated Aug 8, 2025)
The official repository of "Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination"
FaceXBench: Evaluating Multimodal LLMs on Face Understanding
The official code repository for EgoOrientBench [CVPR 25]
Vision-Zephyr: a multimodal LLM for Visual Commonsense Reasoning that pairs a CLIP-ViT visual encoder with Zephyr-7B and uses visual prompting; includes code, training scripts, and VCR evaluation (see the sketch below for the general encoder-to-LLM pattern).
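Several of the repositories above (LLaVA-MORE, Vision-Zephyr) follow the LLaVA-style recipe of projecting vision-encoder features into the LLM's token embedding space. The sketch below is a minimal, generic version of that connector, not code from any of these repos; the `VisionLanguageConnector` name is hypothetical, and the dimensions (1024 for CLIP-ViT-L/14 patch features, 4096 for a 7B LLM, 576 patch tokens at 336px resolution) are typical values assumed for illustration.

```python
import torch
import torch.nn as nn

class VisionLanguageConnector(nn.Module):
    """Two-layer MLP that maps vision-encoder patch features into the
    LLM embedding space, in the style of LLaVA visual instruction tuning."""

    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim), e.g. from CLIP-ViT
        return self.proj(patch_features)  # (batch, num_patches, llm_dim)

# Example: project 576 CLIP-ViT-L/14 patch tokens (24x24 grid at 336px)
# into a 7B LLM's embedding space, then prepend them to the text embeddings.
connector = VisionLanguageConnector()
image_tokens = connector(torch.randn(1, 576, 1024))  # (1, 576, 4096)
text_embeds = torch.randn(1, 32, 4096)               # embedded prompt tokens
inputs_embeds = torch.cat([image_tokens, text_embeds], dim=1)
```

The concatenated `inputs_embeds` sequence is what the LLM consumes in place of plain token embeddings; during instruction tuning, typically only the connector (and optionally the LLM) is trained while the vision encoder stays frozen.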