AgroMind

AgroMind is a comprehensive agricultural remote sensing benchmark. It covers four task dimensions (Spatial Perception, Object Understanding, Scene Understanding, and Scene Reasoning) with 13 task types in total, ranging from crop identification and health monitoring to environmental analysis.

License

🔗 Link

📂 Structure

AgroMind/
├── AgroMind/
│   ├── models/                 # LMMs    
│   ├── utils/      
│   └── eval.py     
├── QA/                          # Tasks (question-answer pairs, 13 types)
├── static/          
│   ├── css/         
│   ├── images/                  # data examples
│   └── js/          
├── .nojekyll    
├── conceptual.pdf               # Project poster
├── README.md                    # introduction
├── index.html                   # GitHub Pages site
└── test.txt                     # just for testing

📌 Key Features

  • Multidimensional Evaluation

    • 🌍 Spatial Perception
    • 🔍 Object Understanding
    • 🏞️ Scene Understanding
    • 🤖 Scene Reasoning
  • Technical Specifications

    • 13 specialized agricultural tasks
    • Multimodal data support

Dataset/Benchmarks

Each JSON file contains questions of the same level-3 type, with items structured as follows:

{
    "image_path": "path/to/image",    // Image file path
    "type_id": question_format_type,  // Question response format
    "item_id": "id",                  // Question ID within this file (numbering starts at 1)
    "level1_id": "main_category",     // Top-level task dimension
    "level2_id": "sub_category",      // Task subtype
    "level3_id": "specific_task",     // Detailed task type
    "question": "query_text",         // Natural language question
    "options": ["A", ...],            // Answer choices (when applicable)
    "answer": "correct_response"      // Ground truth answer
}
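
The item layout above can be mirrored as a small typed record, which makes missing fields fail early. The following is a minimal sketch, not code from this repository: the names `QAItem`, `REQUIRED_FIELDS`, and `parse_item` are hypothetical, and the concrete types of `type_id` and `answer` are assumptions because the schema only shows placeholders.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Fields the schema lists for every item; "options" is treated as optional
# because the schema marks it "when applicable".
REQUIRED_FIELDS = (
    "image_path", "type_id", "item_id",
    "level1_id", "level2_id", "level3_id",
    "question", "answer",
)


@dataclass
class QAItem:
    """One benchmark item (hypothetical helper type, not part of the repo)."""
    image_path: str
    type_id: Any   # response-format code; concrete type is an assumption
    item_id: str
    level1_id: str
    level2_id: str
    level3_id: str
    question: str
    answer: Any    # ground truth; concrete type is an assumption
    options: List[str] = field(default_factory=list)


def parse_item(raw: Dict[str, Any]) -> QAItem:
    """Validate a raw dict against the documented fields and build a QAItem."""
    missing = [key for key in REQUIRED_FIELDS if key not in raw]
    if missing:
        raise KeyError(f"item is missing fields: {missing}")
    return QAItem(
        image_path=raw["image_path"],
        type_id=raw["type_id"],
        item_id=raw["item_id"],
        level1_id=raw["level1_id"],
        level2_id=raw["level2_id"],
        level3_id=raw["level3_id"],
        question=raw["question"],
        answer=raw["answer"],
        options=list(raw.get("options") or []),
    )
```

Keeping `options` optional means the same record type can represent both multiple-choice and free-form questions.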

To use the benchmark, download the Hugging Face dataset into the ./images directory of this GitHub project and unzip it there. You can then read the items in each JSON file to obtain the image paths and corresponding questions for model evaluation.
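
As a rough illustration of reading items for evaluation, the sketch below loads one QA file and yields the image path, a text prompt, and the ground-truth answer for each item. It assumes that each QA file is a JSON array of items and that `image_path` is relative to a root directory you pass in; neither is confirmed here. The function names (`load_items`, `build_prompt`, `iter_samples`) are hypothetical and are not taken from `eval.py`.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, List, Tuple


def load_items(qa_file: Path) -> List[Dict[str, Any]]:
    """Read one QA file; assumes the file holds a JSON array of items."""
    with open(qa_file, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"{qa_file} does not contain a JSON array")
    return data


def build_prompt(item: Dict[str, Any]) -> str:
    """Join the question with its answer choices, one per line, if any exist."""
    lines = [item["question"]]
    lines.extend(item.get("options") or [])
    return "\n".join(lines)


def iter_samples(qa_file: Path, image_root: Path) -> Iterator[Tuple[Path, str, Any]]:
    """Yield (image path, prompt, ground-truth answer) for each item in a file."""
    for item in load_items(qa_file):
        # Assumption: image_path is relative to image_root.
        yield image_root / item["image_path"], build_prompt(item), item["answer"]
```

Each yielded tuple can be passed to whichever model wrapper you use, and the returned answer compared against the third element.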

📜 Cite

@misc{li2025largemultimodalmodelsunderstand,
      title={Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind}, 
      author={Qingmei Li and Yang Zhang and Zurong Mai and Yuhang Chen and Shuohong Lou and Henglian Huang and Jiarui Zhang and Zhiwei Zhang and Yibin Wen and Weijia Li and Haohuan Fu and Jianxi Huang and Juepeng Zheng},
      year={2025},
      eprint={2505.12207},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.12207}, 
}
