
Geo-R1

Setup

  • We use VLM-R1 as our main codebase (training and evaluation).

    # requires torch==2.6.0 (CUDA 12.4 build)
    conda create -n vlm-r1 python=3.10
    conda activate vlm-r1
    git clone https://github.com/om-ai-lab/VLM-R1.git
    cd VLM-R1
    bash setup.sh
  • We use EasyR1 to train the 7B and 32B models.

    git clone https://github.com/hiyouga/EasyR1.git
    cd EasyR1
    pip install -e .

Model

  • Load or download models from the Geo-R1 Hugging Face page; see the download sketch after this list.

  • Naming convention, e.g. "Geo-R1-3B-GRPO-GRES-10shot":

    • Geo-R1: Our model, trained with an RL-based post-training paradigm
    • 3B: Model size
    • GRPO: RL algorithm
    • GRES: General Referring Expression Task
    • 10shot: Number of few-shot samples
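
For example, a checkpoint can be fetched from the Hub with huggingface_hub. A minimal sketch, assuming the checkpoints are hosted as standard model repos (the repo id below is hypothetical; take the real one from the Geo-R1 Hugging Face page):

from huggingface_hub import snapshot_download

# repo id is hypothetical -- substitute the actual id from the Geo-R1 page
local_path = snapshot_download(
    repo_id="Geo-R1/Geo-R1-3B-GRPO-GRES-10shot",
    local_dir="/training/dapo-ckpt/Geo-R1-3B-GRPO-GRES-10shot",
)
print("checkpoint downloaded to", local_path)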

Dataset

Geo-R1 uses several public remote sensing datasets. We provide preprocessing scripts that convert them into a unified format for our training and evaluation pipelines; alternatively, you can download our processed datasets from this repo and use them directly.

You need to download the images from the original datasets; due to copyright restrictions, we only provide the annotations.

| Dataset Name | Task Type | Evaluation Script |
| --- | --- | --- |
| EarthReason | FS-GRES | test_gres_eval.py |
| RRSIS-D | FS-GRES (Cross-Dataset Eval) | test_gres_eval.py |
| NWPU-VHR-10 | FS-OVD | test_ovd_nwpu_eval.py |
| VRSBench | FS-REC | test_rec_r1_eval.py |
| DIOR-RSVG | FS-REC (Cross-Dataset Eval) | test_rec_r1_eval.py |
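
At least the FS-GRES annotations are JSON Lines files (see the --data_path argument in the FS-GRES command below). A minimal sketch for inspecting one, making no assumptions about the per-record schema:

import json

# file name taken from the FS-GRES example below
with open("test_earthreason_final.jsonl") as f:
    records = [json.loads(line) for line in f]

print(len(records), "records; keys of the first record:", sorted(records[0].keys()))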

Inference

Note 1: If you trained the model in your own environment, you can use the following scripts directly. However, if you use a checkpoint from Hugging Face, you must load the processor from the original Qwen model.

from transformers import AutoProcessor

MODEL_PATH = "/training/dapo-ckpt/Geo-R1-3B-GRPO-REC-10shot"
ORI_MODEL_PATH = "/training/model/Qwen2.5-VL-3B-Instruct"

# the processor must be loaded from the original model
processor = AutoProcessor.from_pretrained(ORI_MODEL_PATH)
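
Putting it together, a minimal inference sketch following the standard Qwen2.5-VL flow (image path and prompt are placeholders; qwen_vl_utils is the helper package from the Qwen2.5-VL examples):

from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

MODEL_PATH = "/training/dapo-ckpt/Geo-R1-3B-GRPO-REC-10shot"
ORI_MODEL_PATH = "/training/model/Qwen2.5-VL-3B-Instruct"

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_PATH, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(ORI_MODEL_PATH)  # from the original model

# placeholder image and referring expression
messages = [{"role": "user", "content": [
    {"type": "image", "image": "file:///path/to/image.png"},
    {"type": "text", "text": "Locate the small airplane near the runway."},
]}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
images, videos = process_vision_info(messages)
inputs = processor(text=[text], images=images, videos=videos,
                   padding=True, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(out[:, inputs.input_ids.shape[1]:],
                             skip_special_tokens=True)[0])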

Note 2: If you encounter a "to_dict()" error, check the version of the transformers package. Replacing config.json with the one from the original Qwen2.5-VL model may also resolve it.

FS-REC Task

This task is a simplified version of GREC in which the output is a single bounding box. Evaluation is based on IoU@0.5 and IoU@0.7. It is primarily tested on the VRSBench and DIOR-RSVG datasets. The evaluation results are generated in a rich JSON format for detailed analysis.
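
For reference, a sketch of the IoU-threshold accuracy computation, with boxes as [x1, y1, x2, y2] (helper names are ours, not from the evaluation scripts):

def box_iou(a, b):
    # intersection-over-union of two [x1, y1, x2, y2] boxes
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def acc_at(preds, gts, thr):
    # fraction of predicted boxes whose IoU with the ground truth meets thr
    return sum(box_iou(p, g) >= thr for p, g in zip(preds, gts)) / len(preds)

# reported numbers correspond to acc_at(preds, gts, 0.5) and acc_at(preds, gts, 0.7)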

Example Evaluation Command (on VRSBench dataset):

# change the model path and test files inside the script before running
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 test_rec_r1_eval.py
# analyze the results
python recal_rec_acc_unique.py

FS-OVD Task

This task evaluates the model's ability to detect all objects of a given class in an image. We use standard COCO mAP metrics for evaluation. This is primarily tested on the NWPU-VHR-10 dataset.
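
The COCO mAP numbers can be reproduced with the standard pycocotools flow. A minimal sketch, assuming predictions have been exported in COCO detection-result format (the predictions file name is hypothetical):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations_test_set_new.json")     # ground-truth file from the command below
coco_dt = coco_gt.loadRes("nwpu_predictions.json")  # [{image_id, category_id, bbox, score}, ...]

ev = COCOeval(coco_gt, coco_dt, iouType="bbox")
ev.evaluate()
ev.accumulate()
ev.summarize()  # prints AP, AP50, AP75, etc.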

Example Evaluation Command (on NWPU dataset):

# Set CUDA devices, then run the evaluation script
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 test_ovd_nwpu_eval.py \
    --model_path /path/to/your/Geo-R1-3B-GRPO-OVD-10shot/checkpoint \
    --annotation_path /path/to/your/nwpu/annotations_test_set_new.json \
    --image_root /path/to/your/nwpu/positive_image_set \
    --exist_cat_path /path/to/your/nwpu/nwpu_exist_cat.json \
    --output_dir ./eval_results/nwpu_test_result

FS-GRES Task

This task evaluates the model's ability to produce a segmentation mask for a given textual description, using SAM as a tool. We use gIoU (mean IoU) as the primary metric.
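
gIoU here is the mean of per-sample mask IoUs. A sketch over boolean masks (the function name is ours; treating empty-vs-empty as a perfect match is a common convention, not necessarily the script's):

import numpy as np

def g_iou(pred_masks, gt_masks):
    # mean of per-sample IoU over boolean mask pairs
    ious = []
    for p, g in zip(pred_masks, gt_masks):
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        ious.append(1.0 if union == 0 else inter / union)
    return float(np.mean(ious))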

Example Evaluation Command (on EarthReason or RRSIS-D dataset):

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun --nproc_per_node=8 test_gres_eval.py \
    --model_path /path/to/your/Geo-R1-3B-GRPO-GRES-10shot/checkpoint \
    --data_path /path/to/your/test_earthreason_final.jsonl \
    --output_dir ./eval_results/earthreason_test_result
