hanghuacs/MMComposition

✨ MMComposition: Revisiting the Compositionality of Pre-trained Vision-Language Models

🌐 Homepage | 🔬 Paper | 👩‍💻 Code | 📊 Dataset | 📈 Evaluation | 🏆 Leaderboard

What is MMComposition?

MMComposition provides a comprehensive assessment of compositionality in Vision-Language Models (VLMs), that is, their ability to understand and produce novel combinations of known visual and textual components. The benchmark is designed to help researchers and practitioners better understand the capabilities and limitations of VLMs and identify the areas most in need of improvement. MMComposition comprises 13 complex vision-language composition tasks:

  • Attribute Perception
  • Object Perception
  • Counting Perception
  • Relation Perception
  • Difference Spotting
  • Text Rendering
  • Visual Similarity
  • Attribute Reasoning
  • Object Reasoning
  • Counting Reasoning
  • Relation Reasoning
  • Object Interaction
  • Compositional Probing

Getting Started

🏆 Leaderboard

Link

📉 Statistics

Link

✏️ Citation

@article{hua2024mmcomposition,
  title={MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models},
  author={Hua, Hang and Tang, Yunlong and Zeng, Ziyun and Cao, Liangliang and Yang, Zhengyuan and He, Hangfeng and Xu, Chenliang and Luo, Jiebo},
  journal={arXiv preprint arXiv:2410.09733},
  year={2024}
}

Under construction...
