This is the initial repo for our Deep Learning project.
This page is no longer updated. For more information and results, go to FCN-GoogLeNet.
This is a TensorFlow implementation of Fully Convolutional Networks for Semantic Segmentation (CVPR 2015).
To do things a bit differently, we take GoogLeNet (Inception v3) as the base network.
There is also an existing TensorFlow implementation: FCN.tensorflow.
This project is mostly based on these previous works.
Uploaded the modified inception_v3_fcn (generates the model), based on slim/nets/inception_v3.py.
Uploaded the modified inception_FCN (training and visualization script), based on tensorflow.FCN/FCN.py and slim/train_image_classifier.py.
Uploaded inception_utils.py from slim/nets because I think it is needed.
More will be uploaded later if needed.
Possible work for the next update:
clean up inception_v3_fcn: reuse slim/nets/inception_v3.py as much as possible and separate out the upsampling part
minor modifications to inception_FCN: untested, so I don't know whether it will work
Added .gitignore to ignore the data, model, and log folders.
Added a .sh script noting which parameters to use when running inception_FCN.
inception_FCN is our training/visualization script.
When run, it can take parameters to specify the mode, scope, and so on.
Changed the default values of those parameters so it runs without specifying them.
Used the whole inception_v3 network for the model, without spatial squeeze
(as written, it is already fully convolutional).
Bug: the shape of the upsampled output is a problem.
Added a file that contains the shape of each layer, for checking.
Selected an arbitrary layer for the skip connection.
Bug: never actually trained it, but everything before that point should be OK.
Tried to train the full network but failed. The current understanding is that saver.restore cannot handle missing variables. For the old code to work:
change trainable_scope to None (to train everything)
change logs_dir and train_dir to logs/all (pushed one level down)
change checkpoint_path to logs
These changes were based on train_image_classifier from slim; the model is now restored using init_fn, which ignores the missing variables. Ran short tests with a small dataset and saw no more errors. Fingers crossed for PDC.
TODO: these are actually flags; figure out how to pass them with the script. The current solution is to change their default values.
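As a sketch of the intended fix (the script name and flag names here are placeholders; the real flags are whatever inception_FCN defines), the .sh script could pass the values explicitly instead of relying on changed defaults:

```shell
# Hypothetical invocation; flag names are placeholders for the flags
# actually defined in inception_FCN.
python inception_FCN.py \
  --mode=train \
  --trainable_scope="" \
  --logs_dir=logs/all \
  --checkpoint_path=logs
```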
Here is the presentation (slides) given by the authors of the original paper:
http://techtalks.tv/talks/fully-convolutional-networks-for-semantic-segmentation/61606/
- Step 1: reinterpret fully connected layers as conv layers with 1x1 output (no weight changes).
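A minimal sketch of Step 1 (pure Python, no TensorFlow; shapes and weights are made up): a fully connected layer over a C-channel vector is the same computation as a 1x1 convolution with the same weights, which can then slide over any spatial size.

```python
# A fully connected layer with weights W (C_OUT x C_IN) applied to one
# C_IN-vector is identical to a 1x1 convolution with those weights.
C_IN, C_OUT = 3, 2
W = [[0.1 * (i + j) for j in range(C_IN)] for i in range(C_OUT)]

def fc(x):
    """Fully connected layer on a single C_IN-vector."""
    return [sum(W[o][i] * x[i] for i in range(C_IN)) for o in range(C_OUT)]

def conv1x1(fmap):
    """The same weights applied as a 1x1 conv over an H x W x C_IN map."""
    return [[fc(px) for px in row] for row in fmap]

# On a 1x1 input the two are literally the same computation...
x = [1.0, 2.0, 3.0]
assert conv1x1([[x]])[0][0] == fc(x)

# ...and the conv version also accepts a larger map (here 2x2),
# which is what makes the network fully convolutional.
out = conv1x1([[x, x], [x, x]])
print(len(out), len(out[0]), len(out[0][0]))  # 2 2 2
```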
- Step 2: add a conv layer at the very end to upsample.
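This is probably also relevant to the upsampling-shape bug above. A sketch of the output-size arithmetic for a transposed convolution ("deconvolution"), following TensorFlow's conv2d_transpose conventions; the layer sizes below are made up:

```python
# Output size of a transposed conv layer, the usual FCN upsampling op.
# SAME padding:  out = stride * in
# VALID padding: out = stride * (in - 1) + kernel
def deconv_out_size(in_size, kernel, stride, padding="SAME"):
    if padding == "SAME":
        return stride * in_size
    return stride * (in_size - 1) + kernel  # VALID

# FCN-32s style: a 32x upsample of a 7x7 coarse map back to 224x224.
print(deconv_out_size(7, kernel=64, stride=32, padding="SAME"))  # 224
```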
- Step 3: put a pixelwise loss at the end.
Along the way we have a stack of features: layers closer to the input have higher resolution and are shallow and local (the "where"), while layers closer to the output have lower resolution and are deep and global (the "what").
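A pixelwise loss is just ordinary softmax cross-entropy applied at every spatial location, then averaged; a stdlib-only sketch with made-up logits and labels:

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def pixelwise_cross_entropy(logit_map, label_map):
    """logit_map: H x W x num_classes, label_map: H x W of class indices.
    Returns the mean per-pixel cross-entropy."""
    total, n = 0.0, 0
    for logits_row, labels_row in zip(logit_map, label_map):
        for logits, label in zip(logits_row, labels_row):
            total -= math.log(softmax(logits)[label])
            n += 1
    return total / n

# Toy 2x2 prediction with 2 classes.
logit_map = [[[2.0, 0.1], [0.3, 1.5]],
             [[1.0, 1.0], [0.0, 3.0]]]
label_map = [[0, 1], [0, 1]]
print(round(pixelwise_cross_entropy(logit_map, label_map), 4))
```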
- Step 4: skip connections to fuse layers: interpolate and sum.
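The skip step in miniature (pure Python; nearest-neighbor upsampling stands in for the learned/bilinear interpolation, and the score maps are made up): bring the coarse, deep prediction up to the size of a shallower layer, then sum elementwise.

```python
def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2-D score map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in (0, 1)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                   # repeat each row
    return out

def fuse(coarse, fine):
    """Upsample the coarse prediction and add the fine-layer scores."""
    up = upsample2x(coarse)
    assert len(up) == len(fine) and len(up[0]) == len(fine[0])
    return [[a + b for a, b in zip(ur, fr)] for ur, fr in zip(up, fine)]

coarse = [[1.0, 2.0],
          [3.0, 4.0]]                  # 2x2 deep prediction
fine = [[0.1] * 4 for _ in range(4)]   # 4x4 shallow-layer scores
print(fuse(coarse, fine))
```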
- Step 5: fine-tune on a per-pixel dataset (PASCAL).
I stopped at 8:30 in the video
http://cs231n.github.io/convolutional-networks/#convert
http://stackoverflow.com/questions/38536202/how-to-use-inception-v3-as-a-convolutional-network
http://stackoverflow.com/questions/38565497/tensorflow-transfer-learning-implementation-semantic-segmentation
(I never thought that this could be such a huge project. CIFAR-10 deceived me.)
The first thing we need to decide is: MS COCO or PASCAL?
This guy's blog and his TensorFlow Image Segmentation repo can be useful.
Blog posts worth mentioning (some of these can also be found at the end of his project README):
Convert Classification Network to FCN
Another implementation, called Train DeepLab. They were not using TensorFlow, but they did PASCAL with fewer classes. Could be useful, and could be developed as an extra feature; I don't know if that makes sense.
There is a Python API for MS COCO; its functionality is unknown.
This TensorFlow Annex thing claims to do the conversion with no validation...
This Show and Tell example used MS COCO and did some conversion, but they include "captions" as "labels". That's not our use case.