
Commit 3bf8a59

committed
doc updated
1 parent 6cdf189 commit 3bf8a59


10 files changed: +550 -156 lines changed


doc/chapters/challenges/challenges.tex

+3-3
@@ -3,17 +3,17 @@
33
\chapter{Practical Challenges}\label{chp:practical_challenges}
44

55
\section{Training data}
6-
When it comes to data sets that can be used when training the neural networks, there are two options. Either one uses an already available data set, either paid or free, or a new data set is created. Currently, machine learning and especially deep learning are popular topics. As a result of this, the availability of free and publicly available data sets has increased, especially in the area of image processing like segmentation, facial recognition or object detection. Free data sets consisting of aerial imagery are \cite{VolodymyrMnih.2013}, \cite{spacenet}, \cite{isprs-vaihingen}, \cite{isprs-potsdam}, \cite{Helber.20170831}, \cite{deepsat}.
6+
When it comes to data sets that can be used to train the neural networks, there are two options: either an already available data set is used, paid or free, or a new data set is created. Currently, machine learning and especially deep learning are popular topics. As a result, the availability of free and publicly available data sets has increased, especially in areas of image processing such as segmentation, facial recognition or object detection. Free data sets consisting of aerial imagery include \cite{VolodymyrMnih.2013}, \cite{spacenet}, \cite{isprs-vaihingen}, \cite{isprs-potsdam}, \cite{Helber.20170831}, \cite{deepsat}.
77

88
Despite these available data sets, we decided to make our own, consisting solely of open data, that is Microsoft Bing for the imagery and OpenStreetMap for the vector data. Due to this, a tool named Airtiler \cite{airtiler} was developed, which is described in detail in \autoref{chp:theoretical_and_experimental_results}.
99

10-
It can be assumed, that in the future, more and more swiss cantons will made high resolution orthophotos publicly available. At the time of this writing, especially the canton of Zurich takes a pioneering role and makes several of their data sources publicly and freely available\footnote{https://geolion.zh.ch/ (15.06.18)}. However, at the time of this writing, it was not an option to use these images for this work, because it would lead to a rather small dataset.
10+
It can be assumed that, in the future, more and more Swiss cantons will make high-resolution orthophotos publicly available. At the time of this writing, the canton of Zurich in particular takes a pioneering role and makes several of its data sources publicly and freely available\footnote{https://geolion.zh.ch/ (15.06.18)}. However, these images were not an option for this work, because they would lead to a rather small data set.
1111

1212
\section{Prediction accuracy}
1313
\subsection{Class probability}
1414
After training the neural network for the first time, the results were not quite as expected. Even though buildings were predicted as buildings in most cases, other classes, like tennis courts, were predicted as buildings as well. Due to this, the network was retrained with the additional, incorrectly predicted classes like tennis courts. However, instead of correctly distinguishing between buildings and tennis courts, the overall prediction accuracy got worse.
1515
This might be because the network now has to solve a more complex task, deciding which class an object belongs to instead of making a simple yes-no decision. Additionally, the training data is highly imbalanced, as there are far more samples of buildings than of tennis courts.
16-
As a result of this, a solution could be to train the network several times seperatly, to get multiple models, each trained for a specific class. Another solution could be to weight the loss according to the relative amount of the specific class according to the size of the whole dataset.
16+
As a result, a solution could be to train the network several times separately, to get multiple models, each trained for a specific class. Another solution could be to weight the loss according to the relative frequency of each class in the whole data set.
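The class-weighting idea above can be sketched in plain Python by weighting each class inversely to its frequency. The helper name and the exact weighting scheme are illustrative assumptions, not the thesis code:

```python
from collections import Counter

def class_weights(labels):
    """Return one weight per class, inversely proportional to its
    frequency, so that rare classes (e.g. tennis courts) contribute
    as much to the loss as frequent ones (e.g. buildings)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}

# Imbalanced toy data set: many buildings, few tennis courts
labels = ["building"] * 8 + ["tennis_court"] * 2
weights = class_weights(labels)
print(weights)  # {'building': 0.625, 'tennis_court': 2.5}
```

Such weights could then be fed into a weighted cross-entropy loss during training.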
1717

1818
\subsection{Outline}
1919
\autoref{fig:challenges:small_predictions} shows that the predictions are in most cases a bit too small when compared to the corresponding orthophoto. This might be the result of slightly misaligned masks, since the masks and the images are generated separately.

doc/chapters/management_summary/management_summary.tex

+11
@@ -2,3 +2,14 @@
22

33
\chapter{Management Summary}
44

5+
\section*{Introduction}
6+
tbd
7+
8+
\section*{Goals}
9+
tbd
10+
11+
\section*{Methods}
12+
tbd
13+
14+
\section*{Results}
15+
tbd

doc/chapters/neural_networks/neural_networks.tex

+2-2
@@ -4,10 +4,10 @@ \chapter{Image Segmentation with Convolutional Neural Networks (CNN)}\label{chp:
44
\section{Introduction}
55
With the increasing computational power that comes with recent graphics cards, increasingly complex neural networks can be applied to increasingly challenging tasks. Especially in the area of image processing, deep learning is gaining popularity, not only due to the great availability of data sets but also because companies recognize the amount of knowledge and information that can be retrieved with such technologies.
66

7-
The following sections are a brief introduction into image segmentation using deep learning.
7+
The following sections are a brief introduction to image segmentation using convolutional neural networks.
88

99
\subsection{Object detection and segmentation}
10-
Object detection exists since long before deep learning was so popular as it is now. In object detection, the goal is to determine whether an object of a specified class (for example 'car') is visible on a image.
10+
Object detection has existed since long before deep learning became as popular as it is now. In object detection, the goal is to determine whether an object of a specified class (for example 'car') is present in an image. Another variant is object detection with additional classification, where the goal is to find all objects in an image together with their class and a probability that each object actually belongs to the predicted class. \autoref{fig:neural_networks:object_detection} shows an example of object detection and classification.
1111

1212
\begin{figure}[H]
1313
\centering

doc/chapters/practical_results/practical_results.tex

+31
@@ -41,3 +41,34 @@ \section{QGIS Plugin}
4141
\caption{Changes have attributes showing the predicted class and the type of change (added, deleted, changed)}
4242
\label{fig:plugin:change_attributes}
4343
\end{figure}
44+
45+
\section{Prediction Accuracy}
46+
Normally, the accuracy of predictions of objects on orthophotos is measured using \textit{Intersection over Union} (IoU), also called the \textit{Jaccard coefficient} \cite{Liu.2011}, which is a measure of similarity between objects. Its calculation is shown in \autoref{fig:results:iou}.
47+
48+
\begin{figure}[H]
49+
\centering
50+
\includegraphics[width=0.6\linewidth]{chapters/practical_results/images/iou_equation.png}
51+
\caption{The calculation of Intersection over Union (IoU)\\Source: https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/ (23.06.2018)}
52+
\label{fig:results:iou}
53+
\end{figure}
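For reference, the IoU shown in \autoref{fig:results:iou} can also be written as an equation, with $A$ the predicted area and $B$ the ground truth:

```latex
\begin{equation}
IoU(A, B) = \dfrac{|A \cap B|}{|A \cup B|}
\end{equation}
```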
54+
55+
However, due to its non-differentiability, the IoU can not directly be used as a loss function during the training of the neural network. Despite that, there are options for using the IoU during training, as shown in \cite{Bebis.2016}, \cite{Yu.20160804}.
56+
57+
Since the goal of this thesis is not to get the most accurate predictions but to reduce the false positives and false negatives as much as possible, it matters less whether a prediction is extremely accurate than whether all objects of the corresponding classes are found. Due to this, we introduce a new metric called \textbf{Hit rate}, which simply counts whether an object was found (hit) or not. It can be calculated as shown below.
58+
59+
\begin{equation}
60+
Precision = \dfrac{|TP|}{|TP| + |FP|}
61+
\end{equation}
62+
and
63+
\begin{equation}
64+
Recall = \dfrac{|TP|}{|TP| + |FN|}
65+
\end{equation}
66+
where:
67+
\begin{itemize}[label=]
68+
\item $TP$: True positive prediction
69+
\item $FP$: False positive prediction
70+
\item $FN$: False negative prediction
71+
\end{itemize}
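The two equations above translate directly into code. A minimal Python sketch (the counts are hypothetical, for illustration only):

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from counts of true positive,
    false positive and false negative predictions."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical counts from counting hits and misses on a test batch
precision, recall = precision_recall(tp=90, fp=10, fn=10)
print(f"Precision: {precision:.2%}, Recall: {recall:.2%}")
# Precision: 90.00%, Recall: 90.00%
```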
72+
73+
Finally, according to these metrics, our predictions have a \textbf{Precision of 95.33\%} and a \textbf{Recall of 88.96\%}. These values have been evaluated using a randomly selected batch of 150 images from the test data set.
74+

doc/chapters/theoretical_and_experimental_results/theoretical_and_experimental_results.tex

+2-5
@@ -5,7 +5,7 @@ \section{Training data}
55
\subsection{Airtiler - A data set generation tool}
66
For the training, we wanted to use publicly and freely available data, not only because highly resolved orthophotos are quite expensive, but also to make it possible for others to reproduce the results.
77

8-
As a result of this, OpenStreetMap was chosen for the vector data and Microsoft Bing Maps for the imagery. A dataset consisting of satellite imagery and images for the ground truths can be created using the Python module Airtiler \cite{airtiler}. This tool has been developed by the author during this master thesis. It allows to configure one or more bounding boxes together with several other options like zoom level and OpenStreetMap attributes.
8+
As a result of this, OpenStreetMap was chosen for the vector data and Microsoft Bing Maps for the imagery. A data set consisting of satellite imagery and images for the ground truths can be created using the Python module Airtiler \cite{airtiler}. This tool has been developed by the author during this master thesis. It allows configuring one or more bounding boxes together with several other options like zoom level and OpenStreetMap attributes.
99

1010
\autoref{lst:results:airtiler_config} shows a sample configuration as it is being used by Airtiler.
1111

@@ -87,7 +87,7 @@ \subsection{Airtiler - A data set generation tool}
8787
\end{figure}
8888

8989
\subsection{Publicly available data sets}
90-
Furthermore, there are several different datasets publicly available: \cite{VolodymyrMnih.2013}, \cite{spacenet}, \cite{isprs-vaihingen}, \cite{isprs-potsdam}, \cite{Helber.20170831}, \cite{deepsat}.
90+
Furthermore, there are several different data sets publicly available: \cite{VolodymyrMnih.2013}, \cite{spacenet}, \cite{isprs-vaihingen}, \cite{isprs-potsdam}, \cite{Helber.20170831}, \cite{deepsat}.
9191

9292
\section{Mapping Challenge}
9393
At the time of this writing, the platform crowdAI hosted a challenge called Mapping Challenge \cite{mappingchallenge}, which was about detecting buildings from satellite imagery. In order to gain additional knowledge regarding the performance of Mask R-CNN, we decided to participate in the challenge.
@@ -130,6 +130,3 @@ \section{Microsoft COCO Annotation Format}
130130
\caption{The corresponding ground truth}
131131
\label{fig:results:buildings_with_holes_gt}
132132
\end{figure}
133-
134-
\section{Building detection}
135-
tbd

doc/doku.pdf

19.9 KB
Binary file not shown.

doc/frontmatter/acknowledgments.tex

+1-1
@@ -4,7 +4,7 @@
44
\chapter*{Acknowledgments}
55

66
\begin{description}
7-
\item[Prof. Stefan Keller] tbd
7+
\item[Prof. Stefan Keller] for his creativity, his visions and his support not only throughout this thesis, but also in the projects before.
88
\item[My beloved wife Nadine] for supporting me whenever needed by listening, with thoughts, ideas and sometimes by doing a bit of additional housework.
99
\item[My family] for always trying to understand what my thesis is actually about.
1010
\end{description}

doc/frontmatter/bibliography.bib

+45-15
@@ -9,7 +9,7 @@ @misc{airtiler
99
author = {{Martin Boos}},
1010
title = {Airtiler},
1111
url = {https://github.com/mnboos/airtiler},
12-
urldate = {15.05.2018}
12+
urldate = {2018-05-15}
1313
}
1414

1515

@@ -34,17 +34,27 @@ @proceedings{Banissi.2003
3434
}
3535

3636

37+
@proceedings{Bebis.2016,
38+
abstract = {We consider the problem of learning deep neural networks~(DNNs) for object category segmentation, where the goal is to label each pixel in an image as being part of a given object (foreground) or not (background). Deep neural networks are usually trained with simple loss functions (e.g., softmax loss). These loss functions are appropriate for standard classification problems where the performance is measured by the overall classification accuracy. For object category segmentation, the two classes (foreground and background) are very imbalanced. The intersection-over-union (IoU) is usually used to measure the performance of any object category segmentation method. In this paper, we propose an approach for directly optimizing this IoU measure in deep neural networks. Our experimental results on two object category segmentation datasets demonstrate that our approach outperforms DNNs trained with standard softmax loss.},
39+
year = {2016},
40+
title = {Optimizing Intersection-Over-Union in Deep Neural Networks for Image Segmentation: Advances in Visual Computing},
41+
publisher = {{Springer International Publishing}},
42+
isbn = {978-3-319-50835-1},
43+
editor = {Bebis, George and Boyle, Richard and Parvin, Bahram and Koracin, Darko and Porikli, Fatih and Skaff, Sandra and Entezari, Alireza and Min, Jianyuan and Iwai, Daisuke and Sadagic, Amela and Scheidegger, Carlos and Isenberg, Tobias and Rahman, Md Atiqur and Wang, Yang}
44+
}
45+
46+
3747
@misc{cocoformat,
3848
title = {COCO Data Format: Common Objects in Context},
3949
url = {http://cocodataset.org/#format-data},
40-
urldate = {15.05.2018}
50+
urldate = {2018-05-15}
4151
}
4252

4353

4454
@misc{deepsat,
4555
title = {DeepSAT},
4656
url = {http://csc.lsu.edu/~saikat/deepsat/},
47-
urldate = {15.05.2018}
57+
urldate = {2018-05-15}
4858
}
4959

5060

@@ -98,7 +108,7 @@ @article{Everingham.2010
98108

99109
@misc{FpOhleyer.26.03.2018,
100110
author = {{Fp Ohleyer}},
101-
year = {26.03.2018},
111+
year = {2018-03-26},
102112
title = {Segmentation},
103113
url = {https://project.inria.fr/aerialimagelabeling/files/2018/01/fp_ohleyer_compressed.pdf}
104114
}
@@ -156,23 +166,23 @@ @misc{Helber.20170831
156166
@misc{iou,
157167
author = {{Adrian Rosebrock}},
158168
title = {Intersection over Union (IoU)},
url = {https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/},
159-
urldate = {19.05.2018}
169+
urldate = {2018-05-19}
160170
}
161171

162172

163173
@misc{isprs-potsdam,
164174
author = {{International Society for Photogrammetry and Remote Sensing}},
165175
title = {Potsdam},
166176
url = {http://www2.isprs.org/commissions/comm3/wg4/2d-sem-label-potsdam.html},
167-
urldate = {15.05.2018}
177+
urldate = {2018-05-15}
168178
}
169179

170180

171181
@misc{isprs-vaihingen,
172182
author = {{International Society for Photogrammetry and Remote Sensing}},
173183
title = {Vaihingen},
174184
url = {http://www2.isprs.org/commissions/comm3/wg4/2d-sem-label-vaihingen.html},
175-
urldate = {15.05.2018}
185+
urldate = {2018-05-15}
176186
}
177187

178188

@@ -216,6 +226,17 @@ @misc{Lin.20170419
216226
}
217227

218228

229+
@book{Liu.2011,
230+
author = {Liu, Bing},
231+
year = {2011},
232+
title = {Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data},
233+
address = {Berlin / Heidelberg},
234+
edition = {2},
235+
publisher = {Springer-Verlag},
236+
isbn = {9783642194597}
237+
}
238+
239+
219240
@inproceedings{Maple.2003,
220241
author = {Maple, C.},
221242
title = {Geometric design and space planning using the marching squares and marching cube algorithms},
@@ -240,22 +261,22 @@ @misc{mappingchallenge
240261
@misc{matterport_maskrcnn,
241262
title = {Mask R-CNN},
242263
url = {https://github.com/matterport/Mask_RCNN},
243-
urldate = {18.01.2018}
264+
urldate = {2018-01-18}
244265
}
245266

246267

247268
@misc{mopublic,
248-
year = {16.06.2017},
269+
year = {2017-06-16},
249270
title = {MOpublic},
250271
url = {https://www.cadastre.ch/content/cadastre-internet/de/manual-av/service/mopublic/_jcr_content/contentPar/tabs_copy_copy/items/dokumente/tabPar/downloadlist/downloadItems/102_1472647336994.download/Weisungen-MOpublic-de.pdf},
251272
keywords = {Amtliche Vermessung;MOPublic;Service {\&} Produkte AV},
252-
urldate = {22.06.2018}
273+
urldate = {2018-06-22}
253274
}
254275

255276

256277
@misc{Ohleyer.26.03.2018,
257278
author = {Ohleyer, Fp},
258-
year = {26.03.2018},
279+
year = {2018-03-26},
259280
title = {Segmentation},
260281
url = {https://project.inria.fr/aerialimagelabeling/files/2018/01/fp_ohleyer_compressed.pdf}
261282
}
@@ -328,10 +349,10 @@ @article{Song.2016
328349

329350

330351
@misc{spacenet,
331-
year = {30.04.2018},
352+
year = {2018-04-30},
332353
title = {SpaceNet on Amazon Web Services (AWS)},
333354
url = {https://spacenetchallenge.github.io/datasets/datasetHomePage.html},
334-
urldate = {15.05.2018}
355+
urldate = {2018-05-15}
335356
}
336357

337358

@@ -352,7 +373,7 @@ @article{Srivastava.2014
352373
@misc{tmsspec,
353374
title = {Tile Map Service Specification},
354375
url = {https://wiki.osgeo.org/wiki/Tile_Map_Service_Specification},
355-
urldate = {12.06.2018}
376+
urldate = {2018-06-12}
356377
}
357378

358379

@@ -361,11 +382,20 @@ @phdthesis{VolodymyrMnih.2013
361382
year = {2013},
362383
title = {Machine Learning for Aerial Image Labeling},
363384
url = {https://www.cs.toronto.edu/~vmnih/data/},
364-
urldate = {06.01.2018},
385+
urldate = {2018-01-06},
365386
type = {Dissertation}
366387
}
367388

368389

390+
@misc{Yu.20160804,
391+
author = {Yu, Jiahui and Jiang, Yuning and Wang, Zhangyang and Cao, Zhimin and Huang, Thomas},
392+
year = {2016},
393+
title = {UnitBox: An Advanced Object Detection Network},
394+
url = {http://arxiv.org/pdf/1608.01471},
395+
doi = {10.1145/2964284.2967274}
396+
}
397+
398+
369399
@article{Zhang.2006,
370400
author = {Zhang, K. and Yan, J. and Chen, S.-C.},
371401
year = {2006},
