# Dense Prediction API Design, Including Segmentation and Fully Convolutional Networks

This issue aims to develop an API design for dense prediction tasks such as segmentation, including Fully Convolutional Networks (FCNs). It is based on the discussion at https://github.com/fchollet/keras/pull/5228#issuecomment-299611150. The goal is to ensure Keras incorporates best practices by default for this sort of problem. Community input, volunteers, and implementations are very welcome. Preprocessing layers can be discussed in #6655.


## Motivating Tasks and Datasets

  - [Pascal VOC 2012 Single Label Segmentation](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6)
  - [MSCOCO Multi Label Segmentation](http://mscoco.org/explore/?id=180169)
  - Unambiguously refer to a particular person or object in an image with [refcoco](https://github.com/lichengunc/refer)
  - Reinforcement Learning with [OpenAI Gym](https://gym.openai.com/)
  - [mscoco.org/external](http://mscoco.org/external/) has additional examples
     
## Reference Materials

  - [Fully Convolutional Networks for Semantic Segmentation](https://arxiv.org/abs/1605.06211)
  - [U-Net](https://arxiv.org/abs/1505.04597)
  - [Multi-Scale Context Aggregation by Dilated Convolutions](https://arxiv.org/abs/1511.07122)
  - [ResNet](http://arxiv.org/abs/1512.03385) and [Resnetv2](https://arxiv.org/abs/1603.05027)
  - [Wider or Deeper: Revisiting the ResNet Model for Visual Recognition](https://arxiv.org/abs/1611.10080)
  - [Fully Convolutional DenseNets](https://arxiv.org/abs/1611.09326)
  - Daniil's blog (highly detailed, but for TensorFlow)
      - [FCN for Image Segmentation](http://warmspringwinds.github.io/tensorflow/tf-slim/2017/01/23/fully-convolutional-networks-(fcns)-for-image-segmentation/)
      - [Image Segmentation with Tensorflow using CNNs and Conditional Random Fields](http://warmspringwinds.github.io/tensorflow/tf-slim/2016/12/18/image-segmentation-with-tensorflow-using-cnns-and-conditional-random-fields/)
      - [Upsampling and Image Segmentation with Tensorflow and TF-Slim](http://warmspringwinds.github.io/tensorflow/tf-slim/2016/11/22/upsampling-and-image-segmentation-with-tensorflow-and-tf-slim/)
      - [tf-image-segmentation](https://github.com/warmspringwinds/tf-image-segmentation) companion repository (tf only)
      

## Feature Requests

These are ideas rather than a finalized proposal so input is welcome!

 - Input data: support one or more images as input, plus supplemental data (e.g. image + vector)
 - Augmentation of Input Data and Dense Labels
    - Example: Both image and label must be zoomed & translated equally in Pascal VOC
 - Input image dimensions should be able to vary
    - Ideally by height, width & number of channels
 - Loss function "2D" support, such as single and multi label results for each pixel in an image
 - [class_weight](https://keras.io/models/sequential/) support for dense labels
    - Example: Single class weight value for each class in an image segmentation task such as in Pascal VOC 2012.
 - Sparse to Dense Prediction weight transfer
    - [Conversion of ImageNet weights from pre-trained models](https://github.com/tensorflow/models/tree/master/slim#pre-trained-models) for segmentation tasks
    - [Keras-FCN example](https://github.com/aurora95/Keras-FCN/blob/master/utils/transfer_FCN.py)
    - [Locking of batch normalization layers](https://github.com/tensorflow/tensorflow/issues/1122), often used during transfer process
 - Automatic Sparse to Dense Model conversion (advanced)
   - configuration at each downsampling stage
   - remove pooling layers and apply an equivalent atrous dilation in the next convolution layer
   - add an upsampling layer for each downsampling stage
 - SegmentationTop Layer?
   - Sigmoid single class predictions
   - Spatial Softmax argmax multi class predictions
   - Multi Label Predictions (sigmoid?)
 - "Upsample" Layer?
   - like "Activation" layer, where reasonable upsampling approaches can be defined with a simple string parameter
 - Example implementation training & testing on [MSCOCO](https://github.com/farizrahman4u/keras-contrib/pull/81) & [Pascal VOC 2012 + extended berkeley labels](https://github.com/farizrahman4u/keras-contrib/pull/80)
    - (advanced) pretrain on COCO, then fine-tune on Pascal VOC
 - Support for the COCO [pycocotools](https://github.com/pdollar/coco/tree/master/PythonAPI/pycocotools) JSON dataset format, which is [used by several datasets](http://mscoco.org/external/)
    - supports multi-label segmentation, keypoint data, image descriptions, and more
 - TFRecord dataset support (probably TensorFlow only, maybe only in the TensorFlow implementation of Keras)
 - flow_from_directory & Segmentation Data Generator
   - [Keras-FCN](https://github.com/ahundt/Keras-FCN/blob/master/utils/SegDataGenerator.py)
   - Single class label support
   - Multi class label support
 - Mean Intersection over Union (mIOU) utility, as in [Keras-FCN](https://github.com/ahundt/Keras-FCN/blob/master/evaluate.py)
 - Image and label masks
 - Proper [palette handling for png based labels](https://github.com/nicolov/segmentation_keras/issues/14)
 - sparse label format for multi-label data?
 - debugging utilities
     - save predictions to file
 - Iterative training of partial networks at varying strides, as described in the FCN paper (advanced, may not be necessary as per Keras-FCN performance)
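
To make the "augment image and label equally" item concrete, here is a minimal NumPy sketch. `paired_random_crop_flip` is a hypothetical helper name, not an existing Keras function, and it only covers translation (via a random crop) and horizontal flipping. The key idea is that the random parameters are drawn once and applied to both arrays.

```python
import numpy as np


def paired_random_crop_flip(image, label, crop_hw, rng):
    """Apply one random crop and horizontal flip to both image and label.

    image: (H, W, C) array, label: (H, W) array of class ids.
    Hypothetical helper for illustration only.
    """
    h, w = label.shape[:2]
    ch, cw = crop_hw
    if ch > h or cw > w:
        raise ValueError("crop size must not exceed the image size")
    # Draw the random parameters once so both arrays get the same transform.
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    flip = bool(rng.integers(0, 2))
    image = image[top:top + ch, left:left + cw]
    label = label[top:top + ch, left:left + cw]
    if flip:
        # Reverse the width axis of both arrays.
        image = image[:, ::-1]
        label = label[:, ::-1]
    return image, label


rng = np.random.default_rng(0)
image = np.random.rand(6, 8, 3)
label = np.zeros((6, 8), dtype=np.int64)
image_aug, label_aug = paired_random_crop_flip(image, label, (4, 5), rng)
print(image_aug.shape, label_aug.shape)  # (4, 5, 3) (4, 5)
```

Zoom is left out here because it needs interpolation; labels would have to use nearest-neighbor resampling so that no new class ids are invented.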
 
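As a rough illustration of the "2D" loss, `class_weight`, and label mask items above, the following NumPy sketch computes a class-weighted cross-entropy over every pixel and skips pixels marked with an ignore label. The function name, the `ignore_label=255` default, and the choice to normalize by the sum of weights are all assumptions for this example, not a proposed final API.

```python
import numpy as np


def weighted_pixel_crossentropy(probs, labels, class_weights, ignore_label=255):
    """Class-weighted cross-entropy averaged over non-ignored pixels.

    probs: (H, W, C) per-pixel class probabilities (e.g. softmax output).
    labels: (H, W) integer class ids; ignore_label marks masked pixels.
    class_weights: length-C sequence with one weight per class.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=np.int64)
    class_weights = np.asarray(class_weights, dtype=float)
    valid = labels != ignore_label
    # Replace ignored ids with 0 so indexing stays in range; they get weight 0.
    safe_labels = np.where(valid, labels, 0)
    # Probability assigned to the true class at each pixel.
    p_true = np.take_along_axis(probs, safe_labels[..., None], axis=-1)[..., 0]
    weights = class_weights[safe_labels] * valid
    loss = -np.log(np.clip(p_true, 1e-7, 1.0)) * weights
    # Assumed normalization: divide by the total weight of the valid pixels.
    return float(loss.sum() / max(weights.sum(), 1e-7))


probs = np.full((2, 2, 3), 1.0 / 3.0)   # uniform predictions over 3 classes
labels = np.array([[0, 1], [2, 255]])   # last pixel is masked out
print(weighted_pixel_crossentropy(probs, labels, [1.0, 2.0, 3.0]))
```

With uniform predictions the result equals `log(3)` regardless of the weights, which makes for a quick sanity check.
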
## Existing Keras Utilities with compatible license

 - [keras-contrib](https://github.com/farizrahman4u/keras-contrib) has:
     - [DensenetFCN](https://github.com/farizrahman4u/keras-contrib/blob/master/keras_contrib/applications/densenet.py) implementation
     - [MSCOCO](https://github.com/farizrahman4u/keras-contrib/pull/81)
     - [Pascal VOC 2012 + extended berkeley labels](https://github.com/farizrahman4u/keras-contrib/pull/80)
     - a couple of upsampling approaches
     - https://github.com/farizrahman4u/keras-contrib/issues/47 incorporating coco + voc 2012
 - [Keras-FCN](https://github.com/aurora95/Keras-FCN)
    - I've been working on this one; it is the current basis for the design suggestions
 - [segmentation_keras](https://github.com/nicolov/segmentation_keras)
     - includes an example using Caffe weight conversion utilities
     - fairly clean
 - [enet-keras](https://github.com/PavlosMelissinos/enet-keras)
     - includes work towards mscoco support
 - https://github.com/azavea/raster-vision/ 
    - Is the Apache v2 license compatible? I think so, now that Keras is in TensorFlow
 -  https://github.com/JihongJu/keras-fcn

## Questions

 - Is something as clear as ["30 seconds to Keras"](https://keras.io/#getting-started-30-seconds-to-keras) possible for segmentation?
 - Is anything above missing, redundant, or out of date compared to the state of the art?
 - Should the current ImageDataGenerator be extended or is a separate class like [Keras-FCN's SegDataGenerator](https://github.com/ahundt/Keras-FCN/blob/master/utils/SegDataGenerator.py) clearer?
 - Should there be a guide of some sort?
 - What will make for useful training progress and debugging data? (sparse mIOU?, something else?)
 - What is needed to handle large datasets quickly and efficiently? (should this be out of scope?)
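
On the training-progress question, mIOU is one candidate metric. Below is a minimal NumPy sketch that computes it from a confusion matrix. `mean_iou` is a hypothetical helper written for this issue, not the Keras-FCN `evaluate.py` implementation; it assumes integer class ids in `[0, num_classes)` and an ignore label of 255.

```python
import numpy as np


def mean_iou(y_true, y_pred, num_classes, ignore_label=255):
    """Mean Intersection over Union across classes that appear at all.

    y_true, y_pred: integer arrays of the same shape holding class ids.
    Pixels where y_true == ignore_label are excluded.
    """
    y_true = np.asarray(y_true, dtype=np.int64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.int64).ravel()
    valid = y_true != ignore_label
    # Confusion matrix: rows are true classes, columns are predictions.
    idx = num_classes * y_true[valid] + y_pred[valid]
    conf = np.bincount(idx, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    # Skip classes absent from both labels and predictions.
    present = union > 0
    if not present.any():
        return float("nan")
    return float((intersection[present] / union[present]).mean())


y_true = np.array([[0, 0], [1, 1]])
y_pred = np.array([[0, 1], [1, 1]])
print(mean_iou(y_true, y_pred, num_classes=2))  # (1/2 + 2/3) / 2 ≈ 0.5833
```

Accumulating the confusion matrix over a whole validation set, rather than averaging per-image scores, would be the more usual way to report the number.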

Dense Prediction API Design, Including Segmentation and Fully Convolutional Networks #6538
