Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,38 @@ | ||
|
||
# ✨ MMComposition: Revisiting the Compositionality of Pre-trained Vision-Language Models | ||
[**🌐 Homepage**](https://hanghuacs.github.io/MMComposition/) | [**🔬 Paper**](https://arxiv.org/abs/2410.09733) | [**👩‍💻 Code**](https://github.com/hanghuacs/MMComposition_/blob/main/evaluation.py) | [**📊 Dataset**](https://github.com/hanghuacs/MMComposition_) | [**📈 Evaluation**](https://github.com/hanghuacs/MMComposition_) | [**🏆 Leaderboard**](https://hanghuacs.github.io/MMComposition/#leaderboard) | ||
|
||
## What is MMComposition? | ||
> MMComposition provides a comprehensive assessment of compositionality for Vision-Language Models (VLMs): the ability to understand and produce novel combinations of known visual and textual components. It is designed to help researchers and practitioners better understand the capabilities and limitations of VLMs and identify critical areas for model improvement. MMComposition comprises 13 complex vision-language composition tasks: | ||
- `Attribute Perception` | ||
- `Object Perception` | ||
- `Counting Perception` | ||
- `Relation Perception` | ||
- `Difference Spotting` | ||
- `Text Rendering` | ||
- `Visual Similarity` | ||
- `Attribute Reasoning` | ||
- `Object Reasoning` | ||
- `Counting Reasoning` | ||
- `Relation Reasoning` | ||
- `Object Interaction` | ||
- `Compositional Probing` | ||
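
Because the benchmark is organized into the 13 tasks listed above, results are naturally reported per task. The following is a minimal sketch of how per-task accuracy could be tallied. The record keys (`task`, `prediction`, `answer`) and the function name `per_task_accuracy` are assumptions made for illustration. They are not taken from the repository's `evaluation.py`, whose actual data format may differ.

```python
from collections import defaultdict

# The 13 task names as listed in this README.
TASKS = [
    "Attribute Perception", "Object Perception", "Counting Perception",
    "Relation Perception", "Difference Spotting", "Text Rendering",
    "Visual Similarity", "Attribute Reasoning", "Object Reasoning",
    "Counting Reasoning", "Relation Reasoning", "Object Interaction",
    "Compositional Probing",
]


def per_task_accuracy(records):
    """Return {task: accuracy} for every task that has at least one record.

    `records` is an iterable of dicts with the hypothetical keys
    'task', 'prediction', and 'answer' (an assumed schema, not the
    benchmark's official one).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        task = record["task"]
        if task not in TASKS:
            raise ValueError(f"unknown task: {task!r}")
        total[task] += 1
        # Exact-match scoring; a real evaluator may normalize answers first.
        correct[task] += int(record["prediction"] == record["answer"])
    # Preserve the README's task order and skip tasks with no records.
    return {t: correct[t] / total[t] for t in TASKS if total[t]}
```

Calling `per_task_accuracy` on a list of such records yields a dictionary keyed by task name, which can then be averaged or tabulated as needed.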
|
||
## Getting Started | ||
|
||
## 🏆 Leaderboard | ||
|
||
## 📉 Statistics | ||
|
||
|
||
## ✏️ Citation | ||
```bibtex | ||
@article{hua2024mmcomposition, | ||
title={MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models}, | ||
author={Hua, Hang and Tang, Yunlong and Zeng, Ziyun and Cao, Liangliang and Yang, Zhengyuan and He, Hangfeng and Xu, Chenliang and Luo, Jiebo}, | ||
journal={arXiv preprint arXiv:2410.09733}, | ||
year={2024} | ||
} | ||
``` | ||
|
||
# Under construction... |