Dense Prediction API Design, Including Segmentation and Fully Convolutional Networks #6538

Closed
@ahundt

This issue aims to develop an API design for dense prediction tasks such as segmentation, including Fully Convolutional Networks (FCN). It grew out of the discussion at #5228 (comment). The goal is to ensure Keras incorporates best practices by default for this kind of problem. Community input, volunteers, and implementations are very welcome. Preprocessing layers can be discussed in #6655.

Motivating Tasks and Datasets

Reference Materials

Feature Requests

These are ideas rather than a finalized proposal, so input is welcome!

  • Input data: support one or more images as input, plus supplemental data (e.g. image + vector)
  • Augmentation of Input Data and Dense Labels
    • Example: Both image and label must be zoomed & translated equally in Pascal VOC
  • Input image dimensions should be able to vary
    • Ideally by height, width & number of channels
  • "2D" loss function support, such as single-label and multi-label results for each pixel in an image
  • class_weight support for dense labels
    • Example: Single class weight value for each class in an image segmentation task such as in Pascal VOC 2012.
  • Sparse to Dense Prediction weight transfer
  • Automatic Sparse to Dense Model conversion (advanced)
    • configuration at each downsampling stage
    • remove pooling layers and apply an equivalent atrous dilation in the next convolution layer
    • add an upsampling layer for each downsampling stage
  • SegmentationTop Layer?
    • Sigmoid single class predictions
    • Spatial Softmax argmax multi class predictions
    • Multi Label Predictions (sigmoid?)
  • "Upsample" Layer?
    • like "Activation" layer, where reasonable upsampling approaches can be defined with a simple string parameter
  • Example implementation with training & testing on MS COCO & Pascal VOC 2012 + extended Berkeley labels
    • (advanced) pretrain on COCO, then train on Pascal VOC
  • COCO pycocotools json format dataset support used by several datasets
    • supports multi-label segmentation, keypoint data, image descriptions, and more
  • TFRecord dataset support (probably TensorFlow only, and possibly only in the TensorFlow implementation of Keras)
  • flow_from_directory & Segmentation Data Generator
    • Keras-FCN
    • Single class label support
    • Multi class label support
  • mean Intersection over Union (mIOU) utility, as in Keras-FCN
  • Image and label masks
  • Proper palette handling for png based labels
  • sparse label format for multi-label data?
  • debugging utilities
    • save predictions to file
  • Iterative training of partial networks at varying strides, as described in the FCN paper (advanced, may not be necessary as per Keras-FCN performance)

Existing Keras Utilities with compatible license

Questions

  • Is something as clear as "30 seconds to Keras" possible for segmentation?
  • Is anything above missing, redundant, or out of date compared to the state of the art?
  • Should the current ImageDataGenerator be extended or is a separate class like Keras-FCN's SegDataGenerator clearer?
  • Should there be a guide of some sort?
  • What will make for useful training progress and debugging data? (sparse mIOU?, something else?)
  • What is needed to handle large datasets quickly and efficiently? (should this be out of scope?)
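
On the question of useful training progress data, a mean IoU metric can be sketched in a few lines of NumPy via a confusion matrix. This is only an illustration under stated assumptions: labels are integer class ids in `[0, num_classes)`, `ignore_label` is a hypothetical parameter for void pixels, and classes absent from both prediction and ground truth are skipped when averaging. It is not the Keras-FCN implementation.

```python
import numpy as np


def mean_iou(y_true, y_pred, num_classes, ignore_label=None):
    """Mean Intersection over Union from integer label maps of equal shape."""
    y_true = np.asarray(y_true).ravel().astype(np.int64)
    y_pred = np.asarray(y_pred).ravel().astype(np.int64)
    if ignore_label is not None:
        keep = y_true != ignore_label  # drop void pixels from the score
        y_true, y_pred = y_true[keep], y_pred[keep]

    # Confusion matrix: rows are true classes, columns are predicted classes.
    conf = np.bincount(num_classes * y_true + y_pred,
                       minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes that never appear
    if not present.any():
        return float("nan")
    return float((intersection[present] / union[present]).mean())
```

For example, `mean_iou([[0, 0], [1, 1]], [[0, 1], [1, 1]], num_classes=2)` averages an IoU of 1/2 for class 0 and 2/3 for class 1. To report a dataset-level score, the confusion matrices would be summed over all batches before computing IoU, rather than averaging per-batch values.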
