[CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering".
Updated Jun 16, 2025 - Python
NoteMR enhances multimodal large language models (MLLMs) for visual question answering by guiding their reasoning with knowledge notes and visual notes. This implementation aims to reduce reasoning errors and improve visual feature perception. 🐙📚