Help regarding Chipped Datasets #2063

mthsdiniz-usp · 2024-02-21T13:54:55Z

mthsdiniz-usp
Feb 21, 2024

I'm currently working with a chipped dataset of Sentinel-2 data: 256x256 chips with 17 bands.

The images have shape [17, 256, 256] and the labels have shape [256, 256].

My goal is to build a semantic segmentation model that distinguishes between background and foreground. The foreground class is heavily underrepresented.

Whenever I train on this dataset, the model does not seem to pick up on the foreground class: train_loss and val_loss are NaN, and foreground_precision and foreground_recall are 0.

Is this code right for working with chipped datasets?

import albumentations as A
import torch
from rastervision.core.data import ClassConfig
from rastervision.pytorch_learner import (
    SemanticSegmentationImageDataConfig,
    SemanticSegmentationImageDataset,
    SemanticSegmentationLearner,
    SemanticSegmentationLearnerConfig,
    SolverConfig,
)

class_config = ClassConfig(
    names=['background', 'foreground'],
    colors=['lightgray', 'darkred'],
    null_class='background')

# The chips are already 256x256, so the resize leaves them unchanged.
train_ds = SemanticSegmentationImageDataset(
    img_dir='../rasters/train-images/',
    label_dir='../rasters/train-labels/',
    transform=A.Resize(256, 256),
)

val_ds = SemanticSegmentationImageDataset(
    img_dir='../rasters/test-images/',
    label_dir='../rasters/test-labels/',
    transform=A.Resize(256, 256),
)

model = torch.hub.load(
    'AdeelH/pytorch-fpn:0.3',
    'make_fpn_resnet',
    name='resnet18',
    fpn_type='panoptic',
    num_classes=len(class_config),
    fpn_channels=128,
    in_channels=17,
    out_size=(256, 256),
    pretrained=True)


data_cfg = SemanticSegmentationImageDataConfig(
    class_names=class_config.names,
    class_colors=class_config.colors,
    num_workers=10, # increase to use multi-processing
    img_channels=17
)

solver_cfg = SolverConfig(
    batch_sz=8,
    lr=0.1,
    class_loss_weights=[1., 100.]
)

learner_cfg = SemanticSegmentationLearnerConfig(data=data_cfg, solver=solver_cfg)

learner = SemanticSegmentationLearner(
    cfg=learner_cfg,
    output_dir='./argentina/',
    model=model,
    train_ds=train_ds,
    valid_ds=val_ds,
)

AdeelH · 2024-02-21T14:29:19Z

AdeelH
Feb 21, 2024
Maintainer

NaN loss usually means that the learning rate is too high. Try lr=1e-4 and see if that helps.
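To illustrate how an overly large learning rate can turn a loss into NaN, here is a minimal sketch using plain gradient descent on a toy quadratic loss. It is not Raster Vision code: `run_gd` and the loss L(w) = w² are made up for this illustration, and a real network can diverge in less obvious ways.

```python
import math


def run_gd(lr: float, steps: int = 2000, w0: float = 1.0) -> float:
    """Run plain gradient descent on the toy loss L(w) = w * w.

    Returns the final loss. `run_gd` is a hypothetical helper for
    illustration only; it is not part of Raster Vision or PyTorch.
    """
    w = w0
    for _ in range(steps):
        grad = 2.0 * w       # dL/dw
        w = w - lr * grad    # gradient descent update
    return w * w


# With lr=1.5 each step maps w to -2w, so |w| doubles until it overflows
# to infinity; the next update computes inf - inf, which is NaN.
loss_high = run_gd(lr=1.5)

# With lr=0.1 each step maps w to 0.8w, so the loss shrinks towards 0.
loss_low = run_gd(lr=0.1)

print(math.isnan(loss_high), loss_low)
```

Once a NaN appears it propagates through every later update, which is why the reported loss stays NaN for the rest of training.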

mthsdiniz-usp · 2024-02-21T14:56:05Z

mthsdiniz-usp
Feb 21, 2024
Author

I tried lr=1e-4 and lr=1e-6, but neither worked.

However, when I use a single image and label with SemanticSegmentationRandomWindowGeoDataset, the model does seem to learn to separate the classes.

from rastervision.core.data import ClassConfig
from rastervision.core.data.utils import make_ss_scene
import albumentations as A
import torch

from rastervision.pytorch_learner import (
    SemanticSegmentationRandomWindowGeoDataset,
    SemanticSegmentationSlidingWindowGeoDataset,
    SemanticSegmentationVisualizer,
    SemanticSegmentationGeoDataConfig,
    SolverConfig,
    SemanticSegmentationLearnerConfig,
    SemanticSegmentationLearner)


train_image_uri  = '../rasters/train-images/S23W064-245_2022-01-01_12.tif'
train_label_uri = '../rasters/train-labels/S23W064-245_2022-01-01_12.tif'

val_image_uri  = '../rasters/test-images/S23W064-246_2022-01-01_2.tif'
val_label_uri = '../rasters/test-labels/S23W064-246_2022-01-01_2.tif'

class_config = ClassConfig(
    names=['background', 'foreground'],
    colors=['lightgray', 'darkred'],
    null_class='background')


data_augmentation_transform = A.Compose([
    A.Flip(),
    A.ShiftScaleRotate(),
    A.OneOf([
        A.HueSaturationValue(hue_shift_limit=10),
        A.RGBShift(),
        A.ToGray(),
        A.ToSepia(),
        A.RandomBrightness(),
        A.RandomGamma(),
    ]),
    A.CoarseDropout(max_height=32, max_width=32, max_holes=5)
])

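train_ds = SemanticSegmentationRandomWindowGeoDataset.from_uris(
    class_config=class_config,
    image_uri=train_image_uri,
    label_raster_uri=train_label_uri,
    size_lims=(150, 200),
    out_size=256,
    max_windows=256)

# val_ds was used below but never defined in the original post. A
# sliding-window dataset over the validation chip is assumed here;
# size and stride are assumptions chosen to match the 256x256 chip.
val_ds = SemanticSegmentationSlidingWindowGeoDataset.from_uris(
    class_config=class_config,
    image_uri=val_image_uri,
    label_raster_uri=val_label_uri,
    size=256,
    stride=256)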

model = torch.hub.load(
    'AdeelH/pytorch-fpn:0.3',
    'make_fpn_resnet',
    name='resnet18',
    fpn_type='panoptic',
    num_classes=len(class_config),
    fpn_channels=128,
    in_channels=3,
    out_size=(256, 256),
    pretrained=True)



data_cfg = SemanticSegmentationGeoDataConfig(
    class_names=class_config.names,
    class_colors=class_config.colors,
    num_workers=10, # increase to use multi-processing
)



solver_cfg = SolverConfig(
    batch_sz=8,
    lr=3e-2,
    class_loss_weights=[1., 10.]
)


learner_cfg = SemanticSegmentationLearnerConfig(data=data_cfg, solver=solver_cfg)

learner = SemanticSegmentationLearner(
    cfg=learner_cfg,
    output_dir='./train-2/',
    model=model,
    train_ds=train_ds,
    valid_ds=val_ds,
)

Could using SemanticSegmentationImageDataset on the chipped dataset be what is causing the model to miss the foreground class?

AdeelH Feb 21, 2024
Maintainer

When using the GeoDataset on a single chip, you're essentially overfitting to that one chip, so it makes sense that the model starts to learn something.
I think this is a pretty significant class imbalance, so it is not surprising that the model struggles. It probably won't work, but it would be interesting to see whether bumping up the foreground loss weight to 1,000 or even 10,000 helps.
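To make the effect of the foreground loss weight concrete, here is a small sketch in plain Python. It assumes the class weights act as per-class multipliers on the per-pixel cross-entropy; it does not call Raster Vision, and the helper `foreground_loss_share`, the pixel counts, and the probabilities are all made up for illustration.

```python
import math


def foreground_loss_share(p_correct, labels, class_weights):
    """Fraction of the total weighted cross-entropy due to foreground pixels.

    p_correct[i] is the predicted probability of pixel i's true class,
    labels[i] is 0 (background) or 1 (foreground), and class_weights[c]
    multiplies the loss of every pixel whose true class is c.
    Hypothetical helper, for illustration only.
    """
    per_pixel = [class_weights[y] * -math.log(p)
                 for p, y in zip(p_correct, labels)]
    total = sum(per_pixel)
    foreground = sum(loss for loss, y in zip(per_pixel, labels) if y == 1)
    return foreground / total


# Hypothetical chip: 99 background pixels predicted well, and a single
# foreground pixel that the model misses.
labels = [0] * 99 + [1]
p_correct = [0.9] * 99 + [0.1]

share_unweighted = foreground_loss_share(p_correct, labels, [1.0, 1.0])
share_weighted = foreground_loss_share(p_correct, labels, [1.0, 100.0])

print(f"unweighted: {share_unweighted:.2f}, weighted: {share_weighted:.2f}")
```

Without weighting, the many easy background pixels account for most of the loss even though the one foreground pixel is badly wrong; a large foreground weight reverses that balance.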
Given the sparsity of foreground instances it might make more sense to pose this as an object detection problem rather than a semantic segmentation problem.
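If the existing label masks are to be reused for object detection, one possible starting point is to turn each connected foreground blob into a bounding box. The sketch below does this in plain Python with a flood fill. `mask_to_boxes` is a hypothetical helper, not a Raster Vision function, and it assumes blobs that touch each other should count as one object.

```python
from collections import deque


def mask_to_boxes(mask):
    """Return one (xmin, ymin, xmax, ymax) box per 4-connected blob of 1s.

    `mask` is a 2D list of 0/1 values. Coordinates are pixel indices and
    the max values are inclusive. Hypothetical helper for illustration.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill this blob while tracking its extent.
            queue = deque([(r, c)])
            seen[r][c] = True
            rmin = rmax = r
            cmin = cmax = c
            while queue:
                y, x = queue.popleft()
                rmin, rmax = min(rmin, y), max(rmax, y)
                cmin, cmax = min(cmin, x), max(cmax, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((cmin, rmin, cmax, rmax))
    return boxes


# Toy 4x5 mask with one 2x2 blob and one isolated pixel.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
]
boxes = mask_to_boxes(mask)
print(boxes)
```

Boxes are returned in row-major scan order of each blob's first pixel. For real 256x256 chips, a library routine for connected-component labelling would likely be faster than this pure-Python loop.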

mthsdiniz-usp · 2024-02-22T14:49:59Z

mthsdiniz-usp
Feb 22, 2024
Author

Hey AdeelH, that makes sense! I tried bumping up the loss weight, but even then precision and recall for the silo bags class did not improve.

I'm now trying to create a COCO dataset so I can try the object detection approach. Do you have an example of this with Raster Vision?

AdeelH Feb 22, 2024
Maintainer

For the specification of the format, see here or here.

For usage with RV, see this tutorial.
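The links in the reply above did not survive, so as a rough orientation only, here is a minimal sketch of a COCO-style detection annotation built in plain Python. It assumes the COCO format mentioned in the question; the helper `make_coco`, the file name, and the box are made up, and whether Raster Vision expects exactly this layout should be checked against the tutorial the reply refers to.

```python
import json


def make_coco(image_name, width, height, boxes, category_name="foreground"):
    """Build a minimal COCO-style detection dict for a single image.

    `boxes` holds (xmin, ymin, xmax, ymax) pixel boxes with exclusive max
    edges. COCO stores each box as [x, y, width, height].
    Hypothetical helper for illustration only.
    """
    annotations = []
    for ann_id, (xmin, ymin, xmax, ymax) in enumerate(boxes, start=1):
        box_w, box_h = xmax - xmin, ymax - ymin
        annotations.append({
            "id": ann_id,
            "image_id": 1,
            "category_id": 1,
            "bbox": [xmin, ymin, box_w, box_h],
            "area": box_w * box_h,
            "iscrowd": 0,
        })
    return {
        "images": [{"id": 1, "file_name": image_name,
                    "width": width, "height": height}],
        "annotations": annotations,
        "categories": [{"id": 1, "name": category_name}],
    }


# Hypothetical chip name and box, for illustration only.
coco = make_coco("chip_0001.tif", 256, 256, [(10, 20, 50, 80)])
coco_json = json.dumps(coco, indent=2)
print(coco_json)
```

A full dataset would list every chip under `images`, each with its own `id`, and point each annotation at its chip through `image_id`.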

Answer selected by mthsdiniz-usp
