
Commit 3bf8a59

committed
doc updated
1 parent 6cdf189 commit 3bf8a59


10 files changed: +550 -156 lines changed


doc/chapters/challenges/challenges.tex

+3-3
@@ -3,17 +3,17 @@
33
\chapter{Practical Challenges}\label{chp:practical_challenges}
44

55
\section{Training data}
6-
When it comes to data sets that can be used when training the neural networks, there are two options. Either one uses an already available data set, either paid or free, or a new data set is created. Currently, machine learning and especially deep learning are popular topics. As a result of this, the availability of free and publicly available data sets has increased, especially in the area of image processing like segmentation, facial recognition or object detection. Free data sets consisting of aerial imagery are \cite{VolodymyrMnih.2013}, \cite{spacenet}, \cite{isprs-vaihingen}, \cite{isprs-potsdam}, \cite{Helber.20170831}, \cite{deepsat}.
6+
When it comes to data sets that can be used to train the neural networks, there are two options: either an already available data set is used, paid or free, or a new data set is created. Currently, machine learning and especially deep learning are popular topics. As a result, the availability of free and publicly available data sets has increased, especially in areas of image processing such as segmentation, facial recognition or object detection. Free data sets consisting of aerial imagery include \cite{VolodymyrMnih.2013}, \cite{spacenet}, \cite{isprs-vaihingen}, \cite{isprs-potsdam}, \cite{Helber.20170831}, \cite{deepsat}.
77

88
Despite these available data sets, we decided to make our own, consisting solely of open data, that is Microsoft Bing for the imagery and OpenStreetMap for the vector data. Due to this, a tool named Airtiler \cite{airtiler} was developed, which is described in detail in \autoref{chp:theoretical_and_experimental_results}.
99

10-
It can be assumed, that in the future, more and more swiss cantons will made high resolution orthophotos publicly available. At the time of this writing, especially the canton of Zurich takes a pioneering role and makes several of their data sources publicly and freely available\footnote{https://geolion.zh.ch/ (15.06.18)}. However, at the time of this writing, it was not an option to use these images for this work, because it would lead to a rather small dataset.
10+
It can be assumed that, in the future, more and more Swiss cantons will make high-resolution orthophotos publicly available. At the time of this writing, the canton of Zurich in particular takes a pioneering role and makes several of its data sources publicly and freely available\footnote{https://geolion.zh.ch/ (15.06.18)}. However, these images were not an option for this work, because they would lead to a rather small data set.
1111

1212
\section{Prediction accuracy}
1313
\subsection{Class probability}
1414
After training the neural network for the first time, the results were not quite as expected. Even though buildings were predicted as buildings in most cases, other classes, like tennis courts, were predicted as buildings as well. Due to this, the network was retrained with the additional, incorrectly predicted classes like tennis courts. However, instead of correctly distinguishing between buildings and tennis courts, the overall prediction accuracy got worse.
1515
This might be because the network now has to solve a more complex task, deciding which class an object belongs to instead of making a simple yes-no decision. Additionally, the training data is highly imbalanced, as there are far more samples of buildings than of tennis courts.
16-
As a result of this, a solution could be to train the network several times seperatly, to get multiple models, each trained for a specific class. Another solution could be to weight the loss according to the relative amount of the specific class according to the size of the whole dataset.
16+
As a result, a solution could be to train the network several times separately, to get multiple models, each trained for a specific class. Another solution could be to weight the loss according to the relative frequency of each class in the whole data set.
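The class-weighting idea above can be sketched in plain Python by weighting each class inversely to its frequency. The helper name and the exact weighting scheme are illustrative assumptions, not the thesis code:

```python
from collections import Counter

def class_weights(labels):
    """Return one weight per class, inversely proportional to its
    frequency, so that rare classes (e.g. tennis courts) contribute
    as much to the loss as frequent ones (e.g. buildings)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}

# Imbalanced toy data set: many buildings, few tennis courts
labels = ["building"] * 8 + ["tennis_court"] * 2
weights = class_weights(labels)
print(weights)  # {'building': 0.625, 'tennis_court': 2.5}
```

Such weights could then be fed into a weighted cross-entropy loss during training.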
1717

1818
\subsection{Outline}
1919
\autoref{fig:challenges:small_predictions} shows that the predictions are in most cases a bit too small when compared to the corresponding orthophoto. This might be the result of slightly misaligned masks, since the masks and the images are generated separately.

doc/chapters/management_summary/management_summary.tex

+11
@@ -2,3 +2,14 @@
22

33
\chapter{Management Summary}
44

5+
\section*{Introduction}
6+
tbd
7+
8+
\section*{Goals}
9+
tbd
10+
11+
\section*{Methods}
12+
tbd
13+
14+
\section*{Results}
15+
tbd

doc/chapters/neural_networks/neural_networks.tex

+2-2
@@ -4,10 +4,10 @@ \chapter{Image Segmentation with Convolutional Neural Networks (CNN)}\label{chp:
44
\section{Introduction}
55
With the increasing computational power that comes with recent graphics cards, increasingly complex neural networks can be applied to increasingly challenging tasks. Especially in the area of image processing, deep learning is gaining popularity, not only due to the great availability of data sets but also because companies recognize the amount of knowledge and information that can be retrieved with such technologies.
66

7-
The following sections are a brief introduction into image segmentation using deep learning.
7+
The following sections are a brief introduction to image segmentation using convolutional neural networks.
88

99
\subsection{Object detection and segmentation}
10-
Object detection exists since long before deep learning was so popular as it is now. In object detection, the goal is to determine whether an object of a specified class (for example 'car') is visible on a image.
10+
Object detection has existed since long before deep learning became as popular as it is now. In object detection, the goal is to determine whether an object of a specified class (for example 'car') is present in an image. Another variant is object detection with additional classification, where the goal is to find all objects in an image together with their class and a probability that each object actually belongs to the predicted class. \autoref{fig:neural_networks:object_detection} shows an example of object detection and classification.
1111

1212
\begin{figure}[H]
1313
\centering

doc/chapters/practical_results/practical_results.tex

+31
@@ -41,3 +41,34 @@ \section{QGIS Plugin}
4141
\caption{Changes have attributes showing the predicted class and the type of change (added, deleted, changed)}
4242
\label{fig:plugin:change_attributes}
4343
\end{figure}
44+
45+
\section{Prediction Accuracy}
46+
Normally, the accuracy of predictions of objects on orthophotos is measured using \textit{Intersection over Union} (IoU), also called the \textit{Jaccard coefficient} \cite{Liu.2011}, which is a measure of similarity between objects. Its calculation is shown in \autoref{fig:results:iou}.
47+
48+
\begin{figure}[H]
49+
\centering
50+
\includegraphics[width=0.6\linewidth]{chapters/practical_results/images/iou_equation.png}
51+
\caption{The calculation of Intersection over Union (IoU)\\Source: https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/ (23.06.2018)}
52+
\label{fig:results:iou}
53+
\end{figure}
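For reference, the IoU shown in \autoref{fig:results:iou} can also be written as an equation, with $A$ the predicted area and $B$ the ground truth:

```latex
\begin{equation}
IoU(A, B) = \dfrac{|A \cap B|}{|A \cup B|}
\end{equation}
```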
54+
55+
However, due to its non-differentiability, the IoU can not directly be used as a loss function during the training of the neural network. Despite that, there are options for using the IoU during training, as shown in \cite{Bebis.2016}, \cite{Yu.20160804}.
56+
57+
Since the goal of this thesis is not to get the most accurate predictions but to reduce the false positives and false negatives as much as possible, it matters less whether a prediction is extremely accurate than whether all objects of the corresponding classes are found. Due to this, we introduce a new metric called \textbf{Hit rate}, which simply counts whether an object was found (hit) or not. It can be calculated as shown below.
58+
59+
\begin{equation}
60+
Precision = \dfrac{|TP|}{|TP| + |FP|}
61+
\end{equation}
62+
and
63+
\begin{equation}
64+
Recall = \dfrac{|TP|}{|TP| + |FN|}
65+
\end{equation}
66+
where:
67+
\begin{itemize}[label=]
68+
\item $TP$: True positive prediction
69+
\item $FP$: False positive prediction
70+
\item $FN$: False negative prediction
71+
\end{itemize}
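The two equations above translate directly into code. A minimal Python sketch (the counts are hypothetical, for illustration only):

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from counts of true positive,
    false positive and false negative predictions."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical counts from counting hits and misses on a test batch
precision, recall = precision_recall(tp=90, fp=10, fn=10)
print(f"Precision: {precision:.2%}, Recall: {recall:.2%}")
# Precision: 90.00%, Recall: 90.00%
```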
72+
73+
Finally, according to these metrics, our predictions have a \textbf{Precision of 95.33\%} and a \textbf{Recall of 88.96\%}. These values have been evaluated using a randomly selected batch of 150 images from the test data set.
74+

doc/chapters/theoretical_and_experimental_results/theoretical_and_experimental_results.tex

+2-5
@@ -5,7 +5,7 @@ \section{Training data}
55
\subsection{Airtiler - A data set generation tool}
66
For the training, we wanted to use publicly and freely available data, not only because highly resolved orthophotos are quite expensive, but also to make it possible for others to reproduce the results.
77

8-
As a result of this, OpenStreetMap was chosen for the vector data and Microsoft Bing Maps for the imagery. A dataset consisting of satellite imagery and images for the ground truths can be created using the Python module Airtiler \cite{airtiler}. This tool has been developed by the author during this master thesis. It allows to configure one or more bounding boxes together with several other options like zoom level and OpenStreetMap attributes.
8+
As a result of this, OpenStreetMap was chosen for the vector data and Microsoft Bing Maps for the imagery. A data set consisting of satellite imagery and images for the ground truths can be created using the Python module Airtiler \cite{airtiler}. This tool has been developed by the author during this master thesis. It allows configuring one or more bounding boxes together with several other options like zoom level and OpenStreetMap attributes.
99

1010
\autoref{lst:results:airtiler_config} shows a sample configuration as it is being used by Airtiler.
1111

@@ -87,7 +87,7 @@ \subsection{Airtiler - A data set generation tool}
8787
\end{figure}
8888

8989
\subsection{Publicly available data sets}
90-
Furthermore, there are several different datasets publicly available: \cite{VolodymyrMnih.2013}, \cite{spacenet}, \cite{isprs-vaihingen}, \cite{isprs-potsdam}, \cite{Helber.20170831}, \cite{deepsat}.
90+
Furthermore, there are several different data sets publicly available: \cite{VolodymyrMnih.2013}, \cite{spacenet}, \cite{isprs-vaihingen}, \cite{isprs-potsdam}, \cite{Helber.20170831}, \cite{deepsat}.
9191

9292
\section{Mapping Challenge}
9393
At the time of this writing, the platform crowdAI hosted a challenge called Mapping Challenge \cite{mappingchallenge}, which was about detecting buildings from satellite imagery. In order to gain additional knowledge regarding the performance of Mask R-CNN, we decided to participate in the challenge.
@@ -130,6 +130,3 @@ \section{Microsoft COCO Annotation Format}
130130
\caption{The corresponding ground truth}
131131
\label{fig:results:buildings_with_holes_gt}
132132
\end{figure}
133-
134-
\section{Building detection}
135-
tbd

doc/doku.pdf

19.9 KB
Binary file not shown.

doc/frontmatter/acknowledgments.tex

+1-1
@@ -4,7 +4,7 @@
44
\chapter*{Acknowledgments}
55

66
\begin{description}
7-
\item[Prof. Stefan Keller] tbd
7+
\item[Prof. Stefan Keller] for his creativity, his visions and his support not only throughout this thesis, but also in the projects before.
88
\item[My beloved wife Nadine] for supporting me whenever needed by listening, with thoughts, ideas and sometimes by doing a bit of additional housework.
99
\item[My family] for always trying to understand what my thesis is actually about.
1010
\end{description}

doc/frontmatter/bibliography.bib

+45-15
@@ -9,7 +9,7 @@ @misc{airtiler
99
author = {{Martin Boos}},
1010
title = {Airtiler},
1111
url = {https://github.com/mnboos/airtiler},
12-
urldate = {15.05.2018}
12+
urldate = {2018-05-15}
1313
}
1414

1515

@@ -34,17 +34,27 @@ @proceedings{Banissi.2003
3434
}
3535

3636

37+
@proceedings{Bebis.2016,
38+
abstract = {We consider the problem of learning deep neural networks~(DNNs) for object category segmentation, where the goal is to label each pixel in an image as being part of a given object (foreground) or not (background). Deep neural networks are usually trained with simple loss functions (e.g., softmax loss). These loss functions are appropriate for standard classification problems where the performance is measured by the overall classification accuracy. For object category segmentation, the two classes (foreground and background) are very imbalanced. The intersection-over-union (IoU) is usually used to measure the performance of any object category segmentation method. In this paper, we propose an approach for directly optimizing this IoU measure in deep neural networks. Our experimental results on two object category segmentation datasets demonstrate that our approach outperforms DNNs trained with standard softmax loss.},
39+
year = {2016},
40+
title = {Optimizing Intersection-Over-Union in Deep Neural Networks for Image Segmentation: Advances in Visual Computing},
41+
publisher = {{Springer International Publishing}},
42+
isbn = {978-3-319-50835-1},
43+
editor = {Bebis, George and Boyle, Richard and Parvin, Bahram and Koracin, Darko and Porikli, Fatih and Skaff, Sandra and Entezari, Alireza and Min, Jianyuan and Iwai, Daisuke and Sadagic, Amela and Scheidegger, Carlos and Isenberg, Tobias and Rahman, Md Atiqur and Wang, Yang}
44+
}
45+
46+
3747
@misc{cocoformat,
3848
title = {COCO Data Format: Common Objects in Context},
3949
url = {http://cocodataset.org/#format-data},
40-
urldate = {15.05.2018}
50+
urldate = {2018-05-15}
4151
}
4252

4353

4454
@misc{deepsat,
4555
title = {DeepSAT},
4656
url = {http://csc.lsu.edu/~saikat/deepsat/},
47-
urldate = {15.05.2018}
57+
urldate = {2018-05-15}
4858
}
4959

5060

@@ -98,7 +108,7 @@ @article{Everingham.2010
98108

99109
@misc{FpOhleyer.26.03.2018,
100110
author = {{Fp Ohleyer}},
101-
year = {26.03.2018},
111+
year = {2018-03-26},
102112
title = {Segmentation},
103113
url = {https://project.inria.fr/aerialimagelabeling/files/2018/01/fp_ohleyer_compressed.pdf}
104114
}
@@ -156,23 +166,23 @@ @misc{Helber.20170831
156166
@misc{iou,
157167
author = {{Adrian Rosebrock}},
158168
title = {Intersection over Union (IoU)},
url = {https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/},
159-
urldate = {19.05.2018}
169+
urldate = {2018-05-19}
160170
}
161171

162172

163173
@misc{isprs-potsdam,
164174
author = {{International Society for Photogrammetry and Remote Sensing}},
165175
title = {Potsdam},
166176
url = {http://www2.isprs.org/commissions/comm3/wg4/2d-sem-label-potsdam.html},
167-
urldate = {15.05.2018}
177+
urldate = {2018-05-15}
168178
}
169179

170180

171181
@misc{isprs-vaihingen,
172182
author = {{International Society for Photogrammetry and Remote Sensing}},
173183
title = {Vaihingen},
174184
url = {http://www2.isprs.org/commissions/comm3/wg4/2d-sem-label-vaihingen.html},
175-
urldate = {15.05.2018}
185+
urldate = {2018-05-15}
176186
}
177187

178188

@@ -216,6 +226,17 @@ @misc{Lin.20170419
216226
}
217227

218228

229+
@book{Liu.2011,
230+
author = {Liu, Bing},
231+
year = {2011},
232+
title = {Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data},
233+
address = {Berlin / Heidelberg},
234+
edition = {2},
235+
publisher = {Springer-Verlag},
236+
isbn = {9783642194597}
237+
}
238+
239+
219240
@inproceedings{Maple.2003,
220241
author = {Maple, C.},
221242
title = {Geometric design and space planning using the marching squares and marching cube algorithms},
@@ -240,22 +261,22 @@ @misc{mappingchallenge
240261
@misc{matterport_maskrcnn,
241262
title = {Mask R-CNN},
242263
url = {https://github.com/matterport/Mask_RCNN},
243-
urldate = {18.01.2018}
264+
urldate = {2018-01-18}
244265
}
245266

246267

247268
@misc{mopublic,
248-
year = {16.06.2017},
269+
year = {2017-06-16},
249270
title = {MOpublic},
250271
url = {https://www.cadastre.ch/content/cadastre-internet/de/manual-av/service/mopublic/_jcr_content/contentPar/tabs_copy_copy/items/dokumente/tabPar/downloadlist/downloadItems/102_1472647336994.download/Weisungen-MOpublic-de.pdf},
251272
keywords = {Amtliche Vermessung;MOPublic;Service {\&} Produkte AV},
252-
urldate = {22.06.2018}
273+
urldate = {2018-06-22}
253274
}
254275

255276

256277
@misc{Ohleyer.26.03.2018,
257278
author = {Ohleyer, Fp},
258-
year = {26.03.2018},
279+
year = {2018-03-26},
259280
title = {Segmentation},
260281
url = {https://project.inria.fr/aerialimagelabeling/files/2018/01/fp_ohleyer_compressed.pdf}
261282
}
@@ -328,10 +349,10 @@ @article{Song.2016
328349

329350

330351
@misc{spacenet,
331-
year = {30.04.2018},
352+
year = {2018-04-30},
332353
title = {SpaceNet on Amazon Web Services (AWS)},
333354
url = {https://spacenetchallenge.github.io/datasets/datasetHomePage.html},
334-
urldate = {15.05.2018}
355+
urldate = {2018-05-15}
335356
}
336357

337358

@@ -352,7 +373,7 @@ @article{Srivastava.2014
352373
@misc{tmsspec,
353374
title = {Tile Map Service Specification},
354375
url = {https://wiki.osgeo.org/wiki/Tile_Map_Service_Specification},
355-
urldate = {12.06.2018}
376+
urldate = {2018-06-12}
356377
}
357378

358379

@@ -361,11 +382,20 @@ @phdthesis{VolodymyrMnih.2013
361382
year = {2013},
362383
title = {Machine Learning for Aerial Image Labeling},
363384
url = {https://www.cs.toronto.edu/~vmnih/data/},
364-
urldate = {06.01.2018},
385+
urldate = {2018-01-06},
365386
type = {Dissertation}
366387
}
367388

368389

390+
@misc{Yu.20160804,
391+
author = {Yu, Jiahui and Jiang, Yuning and Wang, Zhangyang and Cao, Zhimin and Huang, Thomas},
392+
year = {2016},
393+
title = {UnitBox: An Advanced Object Detection Network},
394+
url = {http://arxiv.org/pdf/1608.01471},
395+
doi = {10.1145/2964284.2967274}
396+
}
397+
398+
369399
@article{Zhang.2006,
370400
author = {Zhang, K. and Yan, J. and Chen, S.-C.},
371401
year = {2006},
