ASILOMAR Conference on Signals, Systems, and Computers 2024 Paper
RoofSAM adapts the powerful Segment Anything Model (SAM) for the specialized task of classifying rooftop shapes in aerial imagery. By leveraging predefined building footprint polygons, RoofSAM uses point sampling to guide an adapted mask decoder, which then produces a point-wise roof shape class distribution. Final roof classifications are determined by majority voting over the sampled points.
The accompanying paper presents experiments on point sampling strategies and investigates the effect of varying the number of sampling points per roof instance.
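For intuition, here is a minimal, illustrative sketch of the point-sampling and majority-voting idea described above. It is not the repository's code: the `decoder` callable stands in for RoofSAM's adapted mask decoder, and the footprint handling is simplified.

```python
import random
from collections import Counter

from shapely.geometry import Point, Polygon


def sample_points_in_footprint(footprint: Polygon, n_points: int) -> list[Point]:
    """Rejection-sample n_points uniformly inside a building footprint polygon."""
    minx, miny, maxx, maxy = footprint.bounds
    points = []
    while len(points) < n_points:
        candidate = Point(random.uniform(minx, maxx), random.uniform(miny, maxy))
        if footprint.contains(candidate):
            points.append(candidate)
    return points


def classify_roof(footprint: Polygon, decoder, n_points: int = 4) -> str:
    """Prompt the decoder at each sampled point, then majority-vote the labels.

    `decoder` is a stand-in: it is assumed to map a point prompt (x, y) to a
    roof-shape class label.
    """
    points = sample_points_in_footprint(footprint, n_points)
    votes = [decoder(point.x, point.y) for point in points]
    return Counter(votes).most_common(1)[0][0]
```

With a small, even number of points ties are possible; this sketch simply keeps whichever tied class `Counter` returns first.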
RoofSAM requires:
- Python: version >=3.10.8
- PyTorch: version >=2.0.1
- TorchVision: version >=0.15.2
Note: For optimal performance, install PyTorch and TorchVision with CUDA support. Follow the official PyTorch installation instructions to set up your environment.
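Once PyTorch and TorchVision are installed, a quick check that the CUDA build is active (plain PyTorch, nothing RoofSAM-specific):

```python
import torch
import torchvision

# Both version strings should meet the minimums above, and CUDA should report
# as available if you installed the GPU builds.
print(torch.__version__, torchvision.__version__, torch.cuda.is_available())
```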
You can install RoofSAM directly from GitHub:
pip install git+https://github.com/tritolol/RoofSAM

Or, to install locally:
git clone git@github.com:tritolol/RoofSAM.git
cd RoofSAM
pip install -e .

To include all optional dependencies (useful if you plan to build the dataset without Docker), run:
cd RoofSAM
pip install -e .[all]

Note: Ensure that wget, unzip, and ogr2ogr (from GDAL) are installed and available in your system PATH.
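As a quick sanity check after installation (assuming the package is importable as `roofsam`; adjust the module name if it differs):

```python
# Assumption: the installed package exposes a top-level `roofsam` module.
import roofsam

print(roofsam.__file__)
```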
The experiments in the paper used a proprietary dataset from Wuppertal, Germany. To facilitate reproducibility, we provide a script that builds a comparable dataset using aerial imagery from the publicly available North Rhine-Westphalia geodata portal. The portal offers imagery at resolutions up to 10 cm/pixel, matching the quality and resolution used in the paper.
The dataset creation script (a rough sketch of the WCS request follows this list):
- Downloads building cadastre data.
- Filters for roof categories.
- Queries the Web Coverage Service (WCS) for aerial images at specified locations.
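For illustration, a WCS 2.0 GetCoverage request of the kind the script issues can be sketched as follows. The endpoint URL, coverage id, and axis labels below are placeholders, not values taken from the script; consult the script or the geoportal documentation for the real ones.

```python
import requests

# Placeholder endpoint and coverage id -- the actual NRW WCS endpoint and coverage
# name used by the dataset script may differ.
WCS_URL = "https://www.example.org/nrw-dop-wcs"
COVERAGE_ID = "nw_dop"


def fetch_patch(easting: float, northing: float, size_m: float = 50.0) -> bytes:
    """Request an aerial image patch around a location via a WCS 2.0 GetCoverage call."""
    half = size_m / 2
    params = [
        ("SERVICE", "WCS"),
        ("VERSION", "2.0.1"),
        ("REQUEST", "GetCoverage"),
        ("COVERAGEID", COVERAGE_ID),
        # Subsets are given in the coverage's native CRS; axis labels ("x"/"y")
        # are an assumption and depend on the service.
        ("SUBSET", f"x({easting - half},{easting + half})"),
        ("SUBSET", f"y({northing - half},{northing + half})"),
        ("FORMAT", "image/tiff"),
    ]
    response = requests.get(WCS_URL, params=params, timeout=60)
    response.raise_for_status()
    return response.content
```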
Since the script relies on some uncommon system libraries (e.g., GDAL), we provide a Docker image to simplify the setup.
- Create a Dataset Directory:

  mkdir dataset

- Choose One of the Following Methods:

  Run the container with the pre-built image tritolol/roofsam-dataset:

  docker run --rm --mount type=bind,src=./dataset,dst=/dataset tritolol/roofsam-dataset /venv/bin/python /app/roofsam_build_alkis_roof_dataset_wcs.py --output-dir /dataset

  Or build the Docker image yourself and run it:

  docker build -t dataset_builder tools/build_alkis_dataset
  docker run --rm --mount type=bind,src=./dataset,dst=/dataset dataset_builder /venv/bin/python /app/roofsam_build_alkis_roof_dataset_wcs.py --output-dir /dataset
For additional configuration options, you can view the help message:
docker run --rm dataset_builder /venv/bin/python /app/roofsam_build_alkis_roof_dataset_wcs.py --help

RoofSAM comprises two key components:
- SAM Image Encoder: The encoder weights can be downloaded from the SAM repository. These weights are automatically downloaded when running the precomputation script.
- Adapted Mask Decoder: Pre-trained weights for the mask decoder are available in the checkpoints folder. For example, the checkpoint decoder_wuppertal_0.2.pt was trained using data from Wuppertal with a ground sampling resolution of 0.2m/pixel and 4 sampling points.
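The decoder checkpoint is a regular PyTorch file, so you can inspect it before training or testing. A minimal sketch, assuming it loads with plain torch.load:

```python
import torch

# Assumption: the checkpoint is loadable with plain torch.load (no custom unpickling).
state = torch.load("checkpoints/decoder_wuppertal_0.2.pt", map_location="cpu")

# If it is a state dict, list a few parameter names and shapes.
if isinstance(state, dict):
    for name, value in list(state.items())[:5]:
        shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
        print(name, shape)
```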
The repository provides several command-line tools located in the tools/ directory. These scripts are installed to your PATH during setup and can be configured via command-line arguments (use -h for help).
- Embedding Precomputation: roofsam_precompute_embeddings_cuda.py
  Precompute image embeddings using the SAM image encoder across one or multiple CUDA devices. These embeddings are required for both training and testing (a rough sketch of this step follows the list).
- Training: roofsam_train.py
  Train the RoofSAM model using the provided dataset. Requires precomputed image embeddings.
- Testing: roofsam_test.py
  Evaluate a trained model. Also requires precomputed image embeddings.
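To make the precomputation step concrete, here is a rough sketch of how a single SAM image embedding is obtained with the upstream segment_anything package. The repository's roofsam_precompute_embeddings_cuda.py handles the dataset layout, batching, and multi-GPU distribution on top of this, and the ViT-H variant is used here purely as an example, so treat the snippet as an illustration rather than the script's actual implementation.

```python
import numpy as np
import torch
from segment_anything import SamPredictor, sam_model_registry

device = "cuda" if torch.cuda.is_available() else "cpu"

# Official SAM ViT-H weights file name; whether RoofSAM uses this variant is an assumption.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth").to(device)
predictor = SamPredictor(sam)

image = np.zeros((1024, 1024, 3), dtype=np.uint8)  # stand-in for an RGB aerial patch
predictor.set_image(image)                          # runs the SAM image encoder
embedding = predictor.get_image_embedding()         # tensor of shape (1, 256, 64, 64)
torch.save(embedding.cpu(), "embedding.pt")
```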
Usage example to see all available options for a tool:
roofsam_train.py -h

The repository is licensed under the Apache 2.0 license.
@INPROCEEDINGS{10942786,
author={Bauer, Adrian and Krabbe, Jan-Christoph and Kollek, Kevin and Velten, Jörg and Kummert, Anton},
booktitle={2024 58th Asilomar Conference on Signals, Systems, and Computers},
title={RoofSam: Adapting the Segment Anything Model to Rooftop Classification in Aerial Images},
year={2024},
volume={},
number={},
pages={350-353},
keywords={Image segmentation;Adaptation models;Visualization;Shape;Foundation models;Computational modeling;Urban planning;Training data;Data processing;Geospatial analysis;roof shape classification;aerial imagery analysis;semantic segmentation;geospatial data processing;foundation model adaptation;urban planning},
doi={10.1109/IEEECONF60004.2024.10942786}}

SAM (Segment Anything):
@article{kirillov2023segany,
title={Segment Anything},
author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
journal={arXiv:2304.02643},
year={2023}
}