added dataset and updated readme/requirements
isaaccorley committed Aug 7, 2021
1 parent c1d9c13 commit 1f1b75a
Showing 8 changed files with 109 additions and 6 deletions.
44 changes: 41 additions & 3 deletions README.md
@@ -1,6 +1,6 @@
# PyTorch Remote Sensing (torchrs)

(WIP) PyTorch implementation of popular datasets and models in remote sensing tasks (Change Detection, Image Super Resolution, Land Cover Classification/Segmentation, Image-to-Image Translation, Image Captioning, etc.) for various Optical (Sentinel-2, Landsat, etc.) and Synthetic Aperture Radar (SAR) (Sentinel-1) sensors.
(WIP) PyTorch implementation of popular datasets and models in remote sensing tasks (Change Detection, Image Super Resolution, Land Cover Classification/Segmentation, Image Captioning, Audio-visual Recognition, etc.) for various Optical (Sentinel-2, Landsat, etc.) and Synthetic Aperture Radar (SAR) (Sentinel-1) sensors.

## Installation

@@ -28,7 +28,8 @@ pip install 'git+https://github.com/isaaccorley/torchrs.git#egg=torch-rs[train]'

* [PROBA-V Multi-Image Super Resolution](https://github.com/isaaccorley/torchrs#proba-v-super-resolution)
* [ETCI 2021 Flood Detection](https://github.com/isaaccorley/torchrs#etci-2021-flood-detection)
* [FAIR1M Fine-grained Object Recognition](https://github.com/isaaccorley/torchrs#fair1m---fine-grained-object-recognition)
* [FAIR1M - Fine-grained Object Recognition](https://github.com/isaaccorley/torchrs#fair1m---fine-grained-object-recognition)
* [ADVANCE - Audiovisual Aerial Scene Recognition](https://github.com/isaaccorley/torchrs#advance---audiovisual-aerial-scene-recognition)
* [OSCD - Onera Satellite Change Detection](https://github.com/isaaccorley/torchrs#onera-satellite-change-detection-oscd)
* [S2Looking - Satellite Side-Looking Change Detection](https://github.com/isaaccorley/torchrs#satellite-side-looking-s2looking-change-detection)
* [LEVIR-CD+ - LEVIR Change Detection+](https://github.com/isaaccorley/torchrs#levir-change-detection-levir-cd)
@@ -110,7 +111,7 @@ x: dict(

<img src="./assets/fair1m.jpg" width="550px"></img>

The [FAIR1M](https://rcdaudt.github.io/oscd/) dataset, proposed in ["FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery", Sun et al.](https://arxiv.org/abs/2103.05569), is a fine-grained object recognition/detection dataset of 15,000 high-resolution (0.3-0.8m) RGB images taken by the [Gaofen (GF)](https://earth.esa.int/web/eoportal/satellite-missions/g/gaofen-1) satellites and extracted from [Google Earth](https://earth.google.com/web/). The dataset contains rotated bounding boxes for objects of 5 categories (ships, vehicles, airplanes, courts, and roads) and 37 sub-categories. This dataset is a part of the [ISPRS Benchmark on Object Detection in High-Resolution Satellite Images](http://gaofen-challenge.com/benchmark). Note that so far only a portion of the training dataset has been released for the challenge (1,732/15,000 images).
The [FAIR1M](http://gaofen-challenge.com/) dataset, proposed in ["FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery", Sun et al.](https://arxiv.org/abs/2103.05569), is a fine-grained object recognition/detection dataset of 15,000 high-resolution (0.3-0.8m) RGB images taken by the [Gaofen (GF)](https://earth.esa.int/web/eoportal/satellite-missions/g/gaofen-1) satellites and extracted from [Google Earth](https://earth.google.com/web/). The dataset contains rotated bounding boxes for objects of 5 categories (ships, vehicles, airplanes, courts, and roads) and 37 sub-categories. This dataset is a part of the [ISPRS Benchmark on Object Detection in High-Resolution Satellite Images](http://gaofen-challenge.com/benchmark). Note that so far only a portion of the training dataset has been released for the challenge (1,732/15,000 images).

The dataset can be downloaded (8.7GB) using `scripts/download_fair1m.sh` and instantiated below:

@@ -137,6 +138,43 @@ where N is the number of objects in the image
"""
```

### ADVANCE - Audiovisual Aerial Scene Recognition

<img src="./assets/advance.png" width="700px"></img>

The [AuDio Visual Aerial sceNe reCognition datasEt (ADVANCE)](https://akchen.github.io/ADVANCE-DATASET/), proposed in ["Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition", Hu et al.](https://arxiv.org/abs/2005.08449), consists of 5,075 pairs of geotagged audio recordings and 512x512 RGB images extracted from [FreeSound](https://freesound.org/browse/geotags/?c_lat=24&c_lon=20&z=2) and [Google Earth](https://earth.google.com/web/), respectively. The images are labeled with 13 scene categories using [OpenStreetMap](https://www.openstreetmap.org/#map=5/38.007/-95.844).

The dataset can be downloaded (4.5GB) using `scripts/download_advance.sh` and instantiated below:

```python
import torchvision.transforms as T
from torchrs.datasets import ADVANCE

image_transform = T.Compose([T.ToTensor()])
audio_transform = T.Compose([])

dataset = ADVANCE(
    root="path/to/dataset/",
    image_transform=image_transform,
    audio_transform=audio_transform,
)

x = dataset[0]
"""
x: dict(
    image:  (3, 512, 512)
    audio:  (1, 220500)
    cls:    int
)
"""

dataset.classes
"""
['airport', 'beach', 'bridge', 'farmland', 'forest', 'grassland', 'harbour', 'lake',
'orchard', 'residential', 'sparse shrub land', 'sports land', 'train station']
"""
```
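
The `cls` entry above is documented as an integer. As a minimal sketch, assuming labels are indexed by their position in the alphabetically sorted `dataset.classes` list, a class name could be mapped to its index as follows. The helper `build_class_index` is hypothetical and not part of torchrs.

```python
from typing import Dict, List

# Class names as listed in the `dataset.classes` output above.
CLASSES: List[str] = [
    "airport", "beach", "bridge", "farmland", "forest", "grassland", "harbour",
    "lake", "orchard", "residential", "sparse shrub land", "sports land",
    "train station",
]


def build_class_index(classes: List[str]) -> Dict[str, int]:
    """Map each class name to its position in the sorted class list (hypothetical helper)."""
    return {name: idx for idx, name in enumerate(sorted(classes))}


class_to_idx = build_class_index(CLASSES)
print(class_to_idx["airport"])        # 0
print(class_to_idx["train station"])  # 12
```

Sorting before enumerating keeps the index stable regardless of the order in which class folders are discovered on disk.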

### Onera Satellite Change Detection (OSCD)

<img src="./assets/oscd.png" width="750px"></img>
Binary file added assets/advance.png
1 change: 1 addition & 0 deletions requirements.txt
@@ -1,5 +1,6 @@
torch
torchvision
torchaudio
einops
numpy
pillow
7 changes: 7 additions & 0 deletions scripts/download_advance.sh
@@ -0,0 +1,7 @@
mkdir -p .data/advance
wget --no-check-certificate "https://zenodo.org/record/3828124/files/ADVANCE_vision.zip?download=1" -O ADVANCE_vision.zip
wget --no-check-certificate "https://zenodo.org/record/3828124/files/ADVANCE_sound.zip?download=1" -O ADVANCE_sound.zip
unzip ADVANCE_vision.zip -d .data/advance/
rm ADVANCE_vision.zip
unzip ADVANCE_sound.zip -d .data/advance/
rm ADVANCE_sound.zip
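
The download steps in this script could also be sketched in Python using only the standard library. This is an illustrative equivalent, not part of the commit: `extract_and_remove` and `download_advance` are hypothetical names, the URLs are copied from the script, and unlike `wget --no-check-certificate` this version leaves certificate verification enabled.

```python
import os
import urllib.request
import zipfile

# URLs copied from scripts/download_advance.sh
ADVANCE_URLS = {
    "ADVANCE_vision.zip": "https://zenodo.org/record/3828124/files/ADVANCE_vision.zip?download=1",
    "ADVANCE_sound.zip": "https://zenodo.org/record/3828124/files/ADVANCE_sound.zip?download=1",
}


def extract_and_remove(archive: str, dest: str) -> None:
    """Unzip `archive` into `dest`, then delete the archive (mirrors `unzip` + `rm`)."""
    os.makedirs(dest, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    os.remove(archive)


def download_advance(dest: str = ".data/advance") -> None:
    """Hypothetical helper: fetch both archives and extract them into `dest`."""
    for filename, url in ADVANCE_URLS.items():
        urllib.request.urlretrieve(url, filename)
        extract_and_remove(filename, dest)
```

Calling `download_advance()` would fetch both archives into the working directory and unpack them under `.data/advance`, the default `root` of the dataset class.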
3 changes: 2 additions & 1 deletion torchrs/datasets/__init__.py
@@ -11,10 +11,11 @@
from .sydney_captions import SydneyCaptions
from .ucm_captions import UCMCaptions
from .s2mtcp import S2MTCP
from .advance import ADVANCE


__all__ = [
"PROBAV", "ETCI2021", "RSVQALR", "RSVQAxBEN", "EuroSATRGB", "EuroSATMS",
"RESISC45", "RSICD", "OSCD", "S2Looking", "LEVIRCDPlus", "FAIR1M",
"SydneyCaptions", "UCMCaptions", "S2MTCP"
"SydneyCaptions", "UCMCaptions", "S2MTCP", "ADVANCE"
]
55 changes: 55 additions & 0 deletions torchrs/datasets/advance.py
@@ -0,0 +1,55 @@
import os
from glob import glob
from typing import List, Dict

import torch
import torchaudio
import numpy as np
import torchvision.transforms as T
from PIL import Image


class ADVANCE(torch.utils.data.Dataset):
    """ AuDio Visual Aerial sceNe reCognition datasEt (ADVANCE) from
    'Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition', Hu et al. (2020)
    https://arxiv.org/abs/2005.08449
    'We create an annotated dataset consisting of 5075 geotagged aerial image-sound pairs
    involving 13 scene classes. This dataset covers a large variety of scenes from across
    the world'
    """
    def __init__(
        self,
        root: str = ".data/advance",
        image_transform: T.Compose = T.Compose([T.ToTensor()]),
        audio_transform: T.Compose = T.Compose([]),
    ):
        self.root = root
        self.image_transform = image_transform
        self.audio_transform = audio_transform
        self.files = self.load_files(root)
        self.classes = sorted(set(f["cls"] for f in self.files))

    @staticmethod
    def load_files(root: str) -> List[Dict]:
        images = sorted(glob(os.path.join(root, "vision", "**", "*.jpg")))
        wavs = sorted(glob(os.path.join(root, "sound", "**", "*.wav")))
        labels = [image.split(os.sep)[-2] for image in images]
        files = [dict(image=image, audio=wav, cls=label) for image, wav, label in zip(images, wavs, labels)]
        return files

    def __len__(self) -> int:
        return len(self.files)

    def __getitem__(self, idx: int) -> Dict:
        """ Returns a dict containing image, audio, and class label
        image: (3, 512, 512)
        audio: (1, 220500)
        cls: int
        """
        files = self.files[idx]
        image = np.array(Image.open(files["image"]).convert("RGB"))
        audio, fs = torchaudio.load(files["audio"])
        image = self.image_transform(image)
        audio = self.audio_transform(audio)
        return dict(image=image, audio=audio, cls=self.classes.index(files["cls"]))  # class name -> int index
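
`load_files` pairs images and recordings by zipping two sorted lists, which silently misaligns pairs if either side has a missing or extra file. A more defensive sketch is shown below, assuming each image and its recording share a class folder and a file stem (an assumption about the archive layout that this commit does not confirm). `pair_by_stem` is a hypothetical name, not part of torchrs.

```python
import os
from glob import glob
from typing import Dict, List, Tuple


def pair_by_stem(root: str) -> List[Dict[str, str]]:
    """Hypothetical alternative to load_files: match image and audio by (class, stem)."""
    images = glob(os.path.join(root, "vision", "*", "*.jpg"))
    wavs = glob(os.path.join(root, "sound", "*", "*.wav"))

    def key(path: str) -> Tuple[str, str]:
        # (class folder name, filename without extension)
        cls = os.path.basename(os.path.dirname(path))
        stem = os.path.splitext(os.path.basename(path))[0]
        return cls, stem

    audio_by_key = {key(w): w for w in wavs}
    pairs = []
    for image in sorted(images):
        k = key(image)
        if k in audio_by_key:  # skip images that have no matching recording
            pairs.append(dict(image=image, audio=audio_by_key[k], cls=k[0]))
    return pairs
```

With this approach an unmatched file is dropped rather than shifting every later pair by one position.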
2 changes: 1 addition & 1 deletion torchrs/datasets/s2mtcp.py
@@ -32,7 +32,7 @@ def __init__(
        self.files = self.load_files(self.root)

    @staticmethod
    def load_files(root: str) -> List[Dict]:
    def load_files(root: str) -> List[Dict]:
        files = glob(os.path.join(root, "*.npy"))
        files = [os.path.basename(f).split("_")[0] for f in files]
        files = sorted(set(files), key=int)
3 changes: 2 additions & 1 deletion torchrs/transforms.py
@@ -33,7 +33,8 @@ def __call__(self, x: np.ndarray) -> torch.Tensor:
        if x.dtype == "uint16":
            x = x.astype("int32")

        x = torch.from_numpy(x)
        if isinstance(x, np.ndarray):
            x = torch.from_numpy(x)

        if x.ndim == 2:
            if self.permute_dims: