
LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery

This repository provides the official PyTorch source code for our paper:
LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery
🔗 Project page: https://lisat-bair.github.io/LISAt/

Authors:
Jerome Quenum*, Wen-Han Hsieh*, Tsung-Han Wu, Ritwik Gupta, Trevor Darrell, David M. Chan
(* equal contribution, UC Berkeley)


Introduction

Reading satellite images isn't just about identifying objects—it's about understanding their context, relationships, and sometimes even the absurdity of what humans ask AI to locate.

Enter LISAT, your AI-powered geospatial detective, trained to not only recognize but also reason about objects in satellite imagery. Whether it’s detecting urban expansion or identifying a suspiciously duck-shaped lake, LISAT delivers intelligent, nuanced segmentation and captioning from satellite views.

Trained on Two New Datasets:

  • GRES (Geospatial Reasoning Segmentation):
    27,615 segmentation annotations over 9,205 images.

  • PreGRES:
    A large-scale multimodal pretraining dataset with over 1 million QA pairs grounded in satellite imagery.

LISAT outperforms prior models like RS-GPT4V with:

  • +10.04% improvement in BLEU-4 (image captioning)
  • +143.36% improvement in gIoU (segmentation)

[LISAT teaser image]


Status Update

  • 2025-03-22: Released training, evaluation, demo scripts, pretrained checkpoints, and full datasets.

Installation Guide

System Requirements

  • OS: Linux
  • GPU: NVIDIA A100 recommended (for FlashAttention)
  • Python: 3.9

🔧 Environment Setup

# Step 1: Create Python environment
conda create -n lisat python=3.9
conda activate lisat

# Step 2: Install dependencies
pip install pybind11==2.11.1
# install torch and torchvision builds that match your CUDA setup
pip install -r requirements.txt
pip install flash-attn --no-build-isolation  # Required for FlashAttention
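# Optionally verify that flash-attn built correctly (a quick sanity check, not a required step)
python -c "import flash_attn; print(flash_attn.__version__)"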

# Step 3: Install evaluation metrics for image captioning and VQA
pip install pycocoevalcap  # https://pypi.org/project/pycocoevalcap/

Model & Dataset Release

LISAT Models on Hugging Face

LISAT-7B is trained specifically for geospatial reasoning-segmentation tasks. Below are the gIoU and cIoU scores of LISAT-7B.

| Model Name | LMM | HF checkpoint | gIoU | cIoU |
| --- | --- | --- | --- | --- |
| LISAt-7B | LISAT-PRE | jquenum/LISAt-7b | 27.5 | 24.5 |
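For reference, gIoU averages the per-example IoU, while cIoU pools intersections and unions across the whole evaluation set. The NumPy sketch below illustrates these standard definitions; it is our own illustration, and eval_lisat.sh may handle edge cases (e.g., empty masks) differently.

# Standard gIoU / cIoU definitions over binary masks (illustrative only).
import numpy as np

def giou_ciou(pred_masks, gt_masks):
    """pred_masks, gt_masks: lists of boolean numpy arrays, paired by index."""
    ious, inter_sum, union_sum = [], 0, 0
    for pred, gt in zip(pred_masks, gt_masks):
        inter = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        ious.append(inter / union if union > 0 else 1.0)  # empty-vs-empty treated as a match
        inter_sum += inter
        union_sum += union
    giou = float(np.mean(ious))                                    # mean of per-image IoUs
    ciou = float(inter_sum / union_sum) if union_sum > 0 else 1.0  # dataset-pooled IoU
    return giou, ciou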

LISAT_PRE-7B is trained specifically for geospatial image captioning and visual question answering. Below are the BLEU-4 scores of LISAT_PRE-7B.

| Model Name | HF checkpoint | UCM-Captions | NWPU-Captions | Sydney-Captions | Sydney-Captions |
| --- | --- | --- | --- | --- | --- |
| LISAT_PRE-7B | jquenum/LISAt_PRE-7B | 72.3 | 65.8 | 54.2 | 36.1 |

RemoteCLIP is required for both LISAT-7B and LISAT_PRE-7B: wen-han/remote_clip_vit_l_14
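To fetch the checkpoints locally, a standard huggingface_hub snapshot download works. The repo IDs below come from the tables above; the loop itself is a generic sketch, not a script provided by this repo.

# Sketch: download the LISAT and RemoteCLIP checkpoints from Hugging Face.
from huggingface_hub import snapshot_download

for repo_id in ("jquenum/LISAt-7b", "jquenum/LISAt_PRE-7B", "wen-han/remote_clip_vit_l_14"):
    path = snapshot_download(repo_id=repo_id)  # cached locally; returns the local path
    print(repo_id, "->", path)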

Datasets

Visit our Dataset page for more details.


Training

bash train_lisat.sh [ReferSeg or ReasonSeg] [Deepspeed GPU Settings] [MASTERPORT]

# Example: train on ReasonSeg using GPUs 0 and 1 (DeepSpeed include syntax) with master port 15990
bash train_lisat.sh ReasonSeg localhost:0,1 15990

Merge LoRA Weights

bash merge_lora_weight.sh
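merge_lora_weight.sh presumably wraps the standard PEFT merge pattern. The sketch below shows that pattern with placeholder paths; LISAT's actual model class may be custom, so treat it as an illustration of the technique rather than the repo's script.

# Generic PEFT LoRA-merge sketch (placeholder paths, not the repo's script).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("path/to/base-model", torch_dtype=torch.float16)
lora = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = lora.merge_and_unload()                 # fold LoRA deltas into the base weights
merged.save_pretrained("path/to/merged-model")   # save a standalone checkpoint
AutoTokenizer.from_pretrained("path/to/base-model").save_pretrained("path/to/merged-model")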

Evaluation

Geospatial Segmentation Evaluation (gIoU, cIoU):

bash eval_lisat.sh

LISAT Inference for Image Captioning:

python pred_lisat_vqa.py

Image Captioning Evaluation (BLEU, ROUGE-L, etc.):

cd eval_lisat_pre
bash eval_captioning.sh
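Under the hood, pycocoevalcap (installed in Step 3) scores captions from two dicts mapping image IDs to lists of strings. A toy example with invented captions:

# Toy example of BLEU/ROUGE-L scoring with pycocoevalcap; captions are made up.
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.rouge.rouge import Rouge

gts = {"img1": ["a large airport with several runways"]}  # reference captions
res = {"img1": ["an airport with multiple runways"]}      # model predictions

bleu, _ = Bleu(4).compute_score(gts, res)     # returns BLEU-1..BLEU-4
rouge, _ = Rouge().compute_score(gts, res)
print("BLEU-4:", bleu[3], "ROUGE-L:", rouge)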

Acknowledgements

LISAT builds upon foundational work from prior open-source projects; we thank the open-source community for its contributions.

Citation

If you use LISAT, its datasets, or any part of this repository in your work, please consider citing our paper:

@article{quenum2025lisat,
  title={LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery},
  author={Quenum, Jerome and Hsieh, Wen-Han and Wu, Tsung-Han and Gupta, Ritwik and Darrell, Trevor and Chan, David M},
  journal={arXiv preprint arXiv:2505.02829},
  year={2025}
}
