TensorFlow implementation of Fully Convolutional Networks for Semantic Segmentation (FCN-8 in particular), based on the code written by shekkizh and modified so it can be used with ease for any given task.
The model can be applied to the MIT Scene Parsing Challenge dataset straight away after cloning this repo. For a pretrained model check out this link; it is not very accurate, but it is the only one I could obtain given my limited computing resources.
- numpy
- scipy
- opencv (both 2.4.x and 3.x should work)
- Tested only with `tensorflow 1.1.0` and `python 2.7.12` on Ubuntu 16.04. I tried to make this `python 3` compatible, but I haven't checked yet whether it works.
As frequently pointed out in the issue tracker of FCN.tensorflow, there are some discrepancies between the Caffe and the TensorFlow implementations. Here are the main ones and how I handled them:
- Conv6 padding: in the original implementation there was no padding, which shrinks the tensor down to `[batch_size, 1, 1, 4096]`. That works well when the input images are 224x224, but for any other resolution it breaks the deconvolution phase. Since the padding used here does no harm and the results are still acceptable, I decided to leave it as it is (see the sketch after this list).
- Average pooling or max pooling: I just stuck to the original implementation, using max pooling.
- Final layer of VGG: I just stuck to the original implementation, using the ReLU'd layer.
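To make the first point concrete, here is a hedged TensorFlow 1.x sketch (variable names are illustrative, not the repo's actual ones). In the FCN conversion of VGG, fc6 becomes a 7x7 convolution; on the 7x7 feature map produced by a 224x224 input, `VALID` padding collapses the output to 1x1, while `SAME` padding preserves the spatial size and lets the upsampling path handle other resolutions.

```python
import tensorflow as tf

# pool5: output of the VGG encoder, e.g. 7x7x512 for a 224x224 input.
pool5 = tf.placeholder(tf.float32, [None, None, None, 512])
W6 = tf.get_variable('W6', shape=[7, 7, 512, 4096])

# Caffe-style, no padding: collapses to [batch, 1, 1, 4096], but only for 224x224 inputs.
conv6_valid = tf.nn.conv2d(pool5, W6, strides=[1, 1, 1, 1], padding='VALID')

# Padded version kept in this repo: spatial dimensions are preserved,
# so the deconvolution phase works for any input resolution.
conv6_same = tf.nn.conv2d(pool5, W6, strides=[1, 1, 1, 1], padding='SAME')
```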
The original implementation used VGG16, while shekkizh used VGG19. I leave the choice to you.
`example.py` should be self-explanatory for basic usage. Note that a trained model can also be run on arbitrarily sized images, which are padded accordingly to avoid information loss during pooling.
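The VGG encoder downsamples by a factor of 32 (five 2x2 poolings), so one way to support arbitrary sizes is to pad each input up to the next multiple of 32. The helper below is a minimal numpy sketch of that idea, not necessarily how this repo performs the padding.

```python
import numpy as np

def pad_to_multiple(image, multiple=32):
    """Zero-pad an HxWxC image so that H and W become multiples of `multiple`."""
    h, w = image.shape[:2]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    return np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode='constant')

# Example: a 500x375 RGB image becomes 512x384.
padded = pad_to_multiple(np.zeros((500, 375, 3), dtype=np.uint8))
print(padded.shape)  # (512, 384, 3)
```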
Things you can set while setting up the network and training phase:
- Number of classes
- Validation Set (if you want to keep track of its loss)
- Learning Rate
- Keep Probability (1 - Dropout) for some layers (illustrated right after this list)
- Training loss summary frequency
- Validation loss summary frequency
- Model saving frequency
- Maximum number of steps
- Choice of VGG19 or VGG16 (and possibly others, if implemented) as the encoder.
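One option worth spelling out: the keep probability is exactly what TensorFlow's dropout op consumes, so a keep probability of 0.85 corresponds to roughly 15% dropout on the affected layers. A minimal TF 1.x illustration (not the repo's code):

```python
import tensorflow as tf

# keep_prob is fed at run time: e.g. 0.85 while training, 1.0 at inference.
keep_prob = tf.placeholder(tf.float32)
fc = tf.random_normal([1, 4096])
# Zeroes each unit with probability 1 - keep_prob and scales the rest by 1/keep_prob.
fc_dropped = tf.nn.dropout(fc, keep_prob)

with tf.Session() as sess:
    sess.run(fc_dropped, feed_dict={keep_prob: 0.85})  # training
    sess.run(fc_dropped, feed_dict={keep_prob: 1.0})   # inference: dropout disabled
```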
Things you can tweak in the code (see the sketch after the list):
- Optimizer - default is Adam.
- Loss function
- Number of models to keep saved during training.
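For those tweaks, here is a hedged TF 1.x sketch with standalone stand-ins (not the repo's actual graph) showing where the loss, the optimizer and the checkpoint count typically live:

```python
import tensorflow as tf

W = tf.Variable(tf.random_normal([512, 151]))   # stand-in trainable weights
features = tf.random_normal([4, 512])           # stand-in features
logits = tf.matmul(features, W)
labels = tf.zeros([4], dtype=tf.int32)          # stand-in annotations

# Loss function: per-pixel sparse softmax cross-entropy is the usual choice.
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))

# Optimizer: Adam by default; momentum SGD would be a drop-in replacement.
train_op = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss)
# train_op = tf.train.MomentumOptimizer(1e-3, momentum=0.9).minimize(loss)

# Number of models to keep saved during training.
saver = tf.train.Saver(max_to_keep=10)
```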
`get_ade_dataset.sh` is a simple script to download, verify and extract the whole dataset. It also changes the file/directory structure to make it more coherent and compatible with the `ADE_Dataset` class.
In `dataset_reader` there are two classes: `BatchDataset`, which is meant to be an abstract class, and `ADE_Dataset`, which is an example of how to specialize `BatchDataset` and is ready to be used for training. Basic usage of a subclass:
```python
dt = MyDataset(*args, **kwargs)
images, annotations, weights, names = dt.next_batch()
```
where `images`, `annotations` and `weights` are numpy arrays of shape `[batch_size, height, width, channels]` (3 channels for images and 1 for both annotations and weights). In `ADE_Dataset` the `weights` are not used, but for other tasks with different datasets they might be useful.
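To make the expected shapes concrete, here is a standalone mock of what a single `next_batch()` call returns according to the description above (values and dtypes are illustrative guesses):

```python
import numpy as np

batch_size, height, width = 2, 256, 256

images      = np.zeros((batch_size, height, width, 3), dtype=np.float32)  # RGB images
annotations = np.zeros((batch_size, height, width, 1), dtype=np.int32)    # per-pixel class ids
weights     = np.ones((batch_size, height, width, 1), dtype=np.float32)   # per-pixel weights
names       = ['element_%d' % i for i in range(batch_size)]               # element identifiers
```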
When subclassing `BatchDataset` there are a few things to keep in mind (a small sketch follows the list):
- the argument `names` is required to identify the dataset's elements, but it does not necessarily have to be passed by the user to the subclass, unless you want to be able to specify a subset of the dataset (as I did with `ADE_Dataset` when creating the validation set)
- the argument `image_op` is often required to perform cropping and resizing when handling batch sizes greater than 1, unless you have a homogeneous dataset
- `image_size` has to be specified only if `batch_size` is greater than 1
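As an example of the kind of `image_op` meant above, assuming it is simply a callable applied to each image (the exact signature `BatchDataset` expects is defined in `dataset_reader`):

```python
import cv2

# Hypothetical image_op: assumes BatchDataset applies it to each image so
# that all elements of a batch end up with the same size.
def crop_and_resize(image, size=256):
    h, w = image.shape[:2]
    side = min(h, w)
    # Central square crop followed by a resize to a fixed size.
    top, left = (h - side) // 2, (w - side) // 2
    cropped = image[top:top + side, left:left + side]
    return cv2.resize(cropped, (size, size), interpolation=cv2.INTER_LINEAR)
```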