The project is explained here
This repository implements three architectures: VGG-16 with an FCN-8 module, VGG-16 with an FCN-4 module, and U-Net.
The two VGG-16 models use pre-trained weights from the SSD300 model implemented here. Although SSD300 is designed for object detection, its feature extractor can be reused for another task that involves similar classes. The related article (link at the top of this readme) explains the implementation and compares training with and without transfer learning. It also describes how to parse the raw data to train segmentation models.
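The idea of reusing a feature extractor can be sketched independently of any framework. The snippet below is a minimal illustration, not the code used in this repository: it treats each model as a plain dictionary mapping layer names to NumPy arrays and copies only the layers whose name and shape match. The function name `transfer_matching_weights` and the layer names (`conv1_1`, `det_head`, `seg_head`) are hypothetical.

```python
import numpy as np


def transfer_matching_weights(source, target):
    """Copy weights from `source` into `target` where name and shape agree.

    Both arguments are dicts of layer name -> np.ndarray (an assumption made
    for this sketch; a real framework exposes weights through its own API).
    Returns the list of layer names that were copied.
    """
    copied = []
    for name, weights in source.items():
        # Task-specific layers (e.g. detection heads) have no counterpart
        # in the segmentation model, so they are skipped.
        if name in target and target[name].shape == weights.shape:
            target[name] = weights.copy()
            copied.append(name)
    return copied


# Hypothetical layer names, for illustration only.
detector = {
    "conv1_1": np.ones((3, 3, 3, 64)),   # shared VGG-16 style layer
    "det_head": np.ones((3, 3, 64, 4)),  # detection-only layer
}
segmenter = {
    "conv1_1": np.zeros((3, 3, 3, 64)),   # will receive pre-trained weights
    "seg_head": np.zeros((1, 1, 64, 5)),  # stays randomly/zero initialized
}
copied_layers = transfer_matching_weights(detector, segmenter)
```

Only the shared convolutional layer is transferred here; the segmentation head keeps its initial values and is learned during training.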
- FCN-8 architecture and some visualizations:
- FCN-4 architecture and some visualizations:
- U-Net architecture and a visualization: paper
- Training: the notebooks UNET/FCN4/8_training.ipynb show how to train the U-Net, VGG-16 + FCN-4, and VGG-16 + FCN-8 models.
- Testing: the notebook infer_on_videos.ipynb shows how to run inference with the VGG-16 + FCN-4 segmentation model on a single image or on a video.
The script in the utils/ folder creates the visualizations.
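As a rough idea of what such a visualization involves, the sketch below blends a predicted class mask onto an RGB image with NumPy. It is not the script from utils/; the function name `overlay_mask`, the palette, and the convention that class 0 is background are all assumptions made for the example.

```python
import numpy as np


def overlay_mask(image, mask, palette, alpha=0.5):
    """Blend a per-pixel class mask onto an RGB image.

    image:   (H, W, 3) uint8 array
    mask:    (H, W) integer array of class ids
    palette: (num_classes, 3) uint8 array, one RGB color per class
    alpha:   opacity of the mask color, between 0 and 1
    Class 0 is assumed to be background and is left unchanged.
    """
    colors = palette[mask]  # look up one color per pixel -> (H, W, 3)
    blended = (1.0 - alpha) * image.astype(np.float32) \
        + alpha * colors.astype(np.float32)
    out = image.copy()
    foreground = mask > 0
    out[foreground] = np.rint(blended[foreground]).astype(np.uint8)
    return out


# Tiny example: a black 2x2 image with a single pixel of class 1.
image = np.zeros((2, 2, 3), dtype=np.uint8)
mask = np.array([[0, 1], [0, 0]])
palette = np.array([[0, 0, 0], [200, 0, 0]], dtype=np.uint8)
result = overlay_mask(image, mask, palette, alpha=0.5)
```

For a video, the same function could be applied frame by frame to the model's per-frame predictions.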