AgroMind

AgroMind is a comprehensive agricultural remote sensing benchmark covering four task dimensions: Spatial Perception, Object Understanding, Scene Understanding, and Scene Reasoning. It comprises 13 task types, ranging from crop identification and health monitoring to environmental analysis.

🔗 Links

GitHub Pages: https://rssysu.github.io/AgroMind/
Paper (arXiv): https://arxiv.org/abs/2505.12207
Dataset: https://huggingface.co/datasets/AgroMind/AgroMind
Code: https://github.com/rssysu/AgroMind

📂 Structure

AgroMind/
├── AgroMind/
│   ├── models/                  # Large multimodal models (LMMs)
│   ├── utils/
│   └── eval.py
├── QA/                          # Tasks (question-answer pairs, 13 types)
├── static/
│   ├── css/
│   ├── images/                  # Data examples
│   └── js/
├── .nojekyll
├── conceptual.pdf               # Project poster
├── README.md                    # Introduction
├── index.html                   # GitHub Pages site
└── test.txt                     # Just for testing

📌 Key Features

Multidimensional Evaluation
- 🌍 Spatial Perception
- 🔍 Object Understanding
- 🏞️ Scene Understanding
- 🤖 Scene Reasoning

Technical Specifications
- 13 specialized agricultural tasks
- Multimodal data support

Dataset/Benchmarks

Each JSON file contains questions of the same level-3 type, with items structured as follows:

{
    "image_path": "path/to/image",    // Image file path
    "type_id": question_format_type,  // Question response format
    "item_id": "id",                  // Question ID within this file (starting from 1)
    "level1_id": "main_category",     // Top-level task dimension
    "level2_id": "sub_category",      // Task subtype
    "level3_id": "specific_task",     // Detailed task type
    "question": "query_text",         // Natural language question
    "options": ["A", ...],            // Answer choices (when applicable)
    "answer": "correct_response"      // Ground truth answer
}
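
As a quick sanity check on the item structure above, the following minimal Python sketch reports which expected fields are missing from an item. The helper name `missing_keys`, the `REQUIRED_KEYS` set, and all values in `example_item` are hypothetical and not part of the repository; `options` is treated as optional because it only appears when applicable.

```python
# Keys every item is assumed to carry, taken from the structure above.
# "options" is left out because it is only present when applicable.
REQUIRED_KEYS = {
    "image_path", "type_id", "item_id",
    "level1_id", "level2_id", "level3_id",
    "question", "answer",
}


def missing_keys(item: dict) -> set:
    """Return the required keys that are absent from one QA item."""
    return REQUIRED_KEYS - set(item)


# Hypothetical item: every value is a placeholder, not real benchmark data.
example_item = {
    "image_path": "path/to/image",
    "type_id": 1,  # placeholder; the real value type is not specified here
    "item_id": "1",
    "level1_id": "main_category",
    "level2_id": "sub_category",
    "level3_id": "specific_task",
    "question": "query_text",
    "options": ["A", "B"],
    "answer": "A",
}

print(missing_keys(example_item))
```

An empty set means the item carries every field listed in the structure above.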

To use the benchmark, download the Hugging Face dataset into the ./images directory of this GitHub project and unzip it. You can then iterate over the items to obtain image paths and the corresponding questions for model evaluation.
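
Reading items for evaluation might look like the minimal Python sketch below. It rests on assumptions not confirmed here: each QA file is taken to hold a JSON array of items, `image_path` is taken to be relative to the project root, and the function names `load_items` and `build_prompt` as well as the file name in the usage comment are hypothetical. It does not reproduce the logic of the repository's eval.py.

```python
import json
from pathlib import Path


def load_items(qa_file, project_root="."):
    """Yield (image_path, item) pairs from one QA JSON file.

    Assumes the file holds a JSON array of items and that each
    "image_path" is relative to project_root.
    """
    root = Path(project_root)
    with open(qa_file, encoding="utf-8") as f:
        items = json.load(f)
    for item in items:
        yield root / item["image_path"], item


def build_prompt(item):
    """Join the question and any answer options into one text prompt."""
    lines = [item["question"]]
    # "options" may be absent for question types without answer choices.
    lines.extend(str(option) for option in item.get("options") or [])
    return "\n".join(lines)


# Example usage (hypothetical file name):
# for image_path, item in load_items("QA/example_task.json"):
#     print(image_path, build_prompt(item), item["answer"])
```

Each yielded pair gives the resolved image location together with the full item, so the ground truth in `answer` stays available for scoring a model's response.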

📜 Cite

@misc{li2025largemultimodalmodelsunderstand,
      title={Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind}, 
      author={Qingmei Li and Yang Zhang and Zurong Mai and Yuhang Chen and Shuohong Lou and Henglian Huang and Jiarui Zhang and Zhiwei Zhang and Yibin Wen and Weijia Li and Haohuan Fu and Jianxi Huang and Juepeng Zheng},
      year={2025},
      eprint={2505.12207},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.12207}, 
}
