
RoofSAM: Adapting the Segment Anything Model to Rooftop Classification in Aerial Images

Paper presented at the 2024 58th Asilomar Conference on Signals, Systems, and Computers

RoofSAM adapts the powerful Segment Anything Model (SAM) for the specialized task of classifying rooftop shapes in aerial imagery. By leveraging predefined building footprint polygons, RoofSAM uses point sampling to guide an adapted mask decoder, which then produces a point-wise roof shape class distribution. Final roof classifications are determined by majority voting over the sampled points.
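The majority-voting step can be sketched as follows (a minimal illustration under assumed class names, not the repository's actual implementation):

```python
from collections import Counter

def classify_roof(pointwise_predictions):
    """Assign a roof its final class by majority vote over the per-point
    class predictions produced by the adapted mask decoder."""
    votes = Counter(pointwise_predictions)
    label, _ = votes.most_common(1)[0]
    return label

# Example: 4 sampled points on one roof instance, 3 voting for "gable"
print(classify_roof(["gable", "gable", "flat", "gable"]))  # gable
```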

(Figure: RoofSAM architecture)

The accompanying paper presents experiments on point sampling strategies and investigates the effect of varying the number of sampling points per roof instance.




Installation

RoofSAM requires:

  • Python: version >=3.10.8
  • PyTorch: version >=2.0.1
  • TorchVision: version >=0.15.2

Note: For optimal performance, install PyTorch and TorchVision with CUDA support. Follow the official PyTorch installation instructions to set up your environment.

Installing RoofSAM

You can install RoofSAM directly from GitHub:

pip install git+https://github.com/tritolol/RoofSAM

Or, to install locally:

git clone git@github.com:tritolol/RoofSAM.git
cd RoofSAM
pip install -e .

To include all optional dependencies (useful if you plan to build the dataset without Docker), run:

cd RoofSAM
pip install -e ".[all]"

Note: Ensure that wget, unzip, and ogr2ogr (from GDAL) are installed and available in your system PATH.
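A quick way to verify those tools are available is a small helper like the one below (an illustrative sketch, not part of the repository):

```python
import shutil

def missing_tools(required=("wget", "unzip", "ogr2ogr")):
    """Return the subset of required command-line tools not found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]

missing = missing_tools()
if missing:
    print("Please install:", ", ".join(missing))
else:
    print("All required tools found.")
```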

Building the Dataset

The experiments in the paper used a proprietary dataset from Wuppertal, Germany. To facilitate reproducibility, we provide a script that builds a comparable dataset from aerial imagery available through the public North Rhine-Westphalia geodata portal. This imagery is available at resolutions up to 10 cm/pixel, matching the quality and resolution used in the paper.

The dataset creation script:

  • Downloads building cadastre data.
  • Filters for roof categories.
  • Queries the Web Coverage Service (WCS) for aerial images at specified locations.

Since the script relies on some uncommon system libraries (e.g., GDAL), we provide a Docker image to simplify the setup.

Building the Dataset with Docker

  1. Create a Dataset Directory:

    mkdir dataset
  2. Choose One of the Following Methods:

    Method 1: Use the Pre-built Docker Hub Image

    Run the container with the pre-built image tritolol/roofsam-dataset:

    docker run --rm --mount type=bind,src=./dataset,dst=/dataset tritolol/roofsam-dataset /venv/bin/python /app/roofsam_build_alkis_roof_dataset_wcs.py --output-dir /dataset

    Method 2: Build the Docker Image Locally

    Build the Docker image:

    docker build -t dataset_builder tools/build_alkis_dataset
    docker run --rm --mount type=bind,src=./dataset,dst=/dataset dataset_builder /venv/bin/python /app/roofsam_build_alkis_roof_dataset_wcs.py --output-dir /dataset

For additional configuration options, you can view the help message:

docker run --rm dataset_builder /venv/bin/python /app/roofsam_build_alkis_roof_dataset_wcs.py --help

Model Checkpoints

RoofSAM comprises two key components:

  1. SAM Image Encoder: The encoder weights can be downloaded from the SAM repository. These weights are automatically downloaded when running the precomputation script.
  2. Adapted Mask Decoder: Pre-trained weights for the mask decoder are available in the checkpoints folder. For example, the checkpoint decoder_wuppertal_0.2.pt was trained on data from Wuppertal with a ground sampling resolution of 0.2 m/pixel and 4 sampling points.
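Going by the example filename, the checkpoints appear to follow a decoder_&lt;region&gt;_&lt;resolution&gt;.pt convention; a small parser sketch under that assumption:

```python
import re

def parse_checkpoint_name(filename):
    """Extract region and ground sampling resolution (m/pixel) from a
    checkpoint named like 'decoder_wuppertal_0.2.pt'. The naming
    convention itself is an assumption inferred from the one example."""
    match = re.fullmatch(r"decoder_([a-z]+)_([0-9.]+)\.pt", filename)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {filename}")
    region, resolution = match.groups()
    return region, float(resolution)

print(parse_checkpoint_name("decoder_wuppertal_0.2.pt"))  # ('wuppertal', 0.2)
```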

Tools

The repository provides several command-line tools located in the tools/ directory. These scripts are installed to your PATH during setup and can be configured via command-line arguments (use -h for help).

  • Embedding Precomputation:

    roofsam_precompute_embeddings_cuda.py

    Description: Precompute image embeddings using the SAM image encoder across one or multiple CUDA devices. These embeddings are required for both training and testing.

  • Training:

    roofsam_train.py

    Description: Train the RoofSAM model using the provided dataset. Requires precomputed image embeddings.

  • Testing:

    roofsam_test.py

    Description: Evaluate a trained model. Also requires precomputed image embeddings.
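The multi-GPU precomputation presumably shards images across the available CUDA devices; a generic round-robin sharding sketch (pure illustration, not the script's actual logic):

```python
def shard_round_robin(items, num_devices):
    """Split a list of work items across num_devices in round-robin order,
    so each CUDA device gets a near-equal share to embed."""
    shards = [[] for _ in range(num_devices)]
    for i, item in enumerate(items):
        shards[i % num_devices].append(item)
    return shards

print(shard_round_robin(["img0", "img1", "img2", "img3", "img4"], 2))
# [['img0', 'img2', 'img4'], ['img1', 'img3']]
```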

To see all available options for a tool, run it with -h, for example:

roofsam_train.py -h

License

The repository is licensed under the Apache 2.0 license.

Citing RoofSAM

@INPROCEEDINGS{10942786,
  author={Bauer, Adrian and Krabbe, Jan-Christoph and Kollek, Kevin and Velten, Jörg and Kummert, Anton},
  booktitle={2024 58th Asilomar Conference on Signals, Systems, and Computers}, 
  title={RoofSam: Adapting the Segment Anything Model to Rooftop Classification in Aerial Images}, 
  year={2024},
  volume={},
  number={},
  pages={350-353},
  keywords={Image segmentation;Adaptation models;Visualization;Shape;Foundation models;Computational modeling;Urban planning;Training data;Data processing;Geospatial analysis;roof shape classification;aerial imagery analysis;semantic segmentation;geospatial data processing;foundation model adaptation;urban planning},
  doi={10.1109/IEEECONF60004.2024.10942786}}

Acknowledgements

SAM (Segment Anything)

@article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}
