Finished introduction (raw version)
Signed-off-by: Jim Martens <github@2martens.de>
regression and classification. Regression deals with any case
where the goal for the network is to come close to an ideal
function that connects all data points. Classification, however,
describes tasks where the network is supposed to identify the
class of any given input. In this thesis, I will work with both.

\subsection*{Object Detection in Open Set Conditions}

\begin{figure}
  \centering
  \includegraphics[scale=1.0]{open-set}
  \caption{Open set problem: the test set contains classes that
    were not present during training time.
    Icons in this image have been taken from the COCO data set
    website (\url{https://cocodataset.org/\#explore}) and were
    vectorized afterwards. Resembles figure 1 of Miller et al.~\cite{Miller2018}.}
  \label{fig:open-set}
\end{figure}

More specifically, I will look at object detection under open set
conditions (see figure \ref{fig:open-set}). In non-technical terms,
this effectively describes the kind of situation you encounter with
CCTV cameras or robots outside of a laboratory. Both use cameras
that record images. Subsequently, a neural network analyses the image
of the network as a false positive.

This goes back to the need for automatic explanation. Such a system
should by itself recognize that the given object is unknown and
hence mark any classification result of the network as meaningless.
Technically, there are two slightly different approaches that deal
with this type of task: model uncertainty and novelty detection.

Model uncertainty can be measured with dropout sampling.
low this signifies a low uncertainty. An unknown object is more
likely to cause high uncertainty, which allows for an identification
of false positive cases.

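The idea above can be sketched in a few lines of Python. This is a minimal illustration, not the SSD pipeline of Miller et al.: \texttt{noisy\_forward} is a toy stand-in for a network with dropout kept active at test time, and the number of forward passes is arbitrary.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (natural log) of a probability vector."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def dropout_uncertainty(forward, x, num_passes=10):
    """Run several stochastic forward passes (dropout active),
    average the softmax outputs, and return the averaged
    distribution plus its entropy as an uncertainty measure."""
    probs = np.stack([forward(x) for _ in range(num_passes)])
    mean_probs = probs.mean(axis=0)
    return mean_probs, entropy(mean_probs)

# Toy stand-in for a dropout network: a noisy softmax over 3 classes.
rng = np.random.default_rng(0)
def noisy_forward(x):
    logits = np.array([2.0, 0.5, 0.1]) + rng.normal(0.0, 0.3, 3)
    e = np.exp(logits - logits.max())
    return e / e.sum()

mean_probs, h = dropout_uncertainty(noisy_forward, None, num_passes=20)
# Low entropy signals a confident (known) prediction; for 3 classes
# the maximum possible entropy is log(3).
```

A known object tends to produce consistent, peaked distributions across passes (low entropy), while an unknown object produces scattered ones (high entropy).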
Novelty detection is another approach to solve the task.
In the realm of neural networks it is usually done with the help of
auto-encoders that solve a regression task of finding an
identity function that reconstructs the given
input~\cite{Pimentel2014}. Auto-encoders have
internally at least two components: an encoder, and a decoder or
generator. The job of the encoder is to find an encoding that
compresses the input as well as possible while simultaneously
that reconstructs the input as accurately as possible. During
training these auto-encoders learn to reproduce a certain group
of object classes. The actual novelty detection takes place
during testing: given an image, and the output and loss of the
auto-encoder, a novelty score is calculated. For some novelty
detection approaches the reconstruction loss is exactly the novelty
score; others consider more factors. A low novelty
score signals a known object. The opposite is true for a high
novelty score.

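The reconstruction-loss variant of the novelty score can be illustrated with a deliberately simple sketch. The linear projection below is only a stand-in for a trained encoder/decoder pair, not an architecture from the cited works; all data and names are illustrative.

```python
import numpy as np

def novelty_score(x, reconstruct):
    """Reconstruction error as novelty score: mean squared
    difference between the input and its reconstruction."""
    x_hat = reconstruct(x)
    return float(np.mean((x - x_hat) ** 2))

# Illustrative "auto-encoder": project onto the top principal
# component of the known data and back (encode + decode in one step).
rng = np.random.default_rng(1)
known = rng.normal(0.0, 1.0, (200, 2)) * np.array([3.0, 0.1])
mean = known.mean(axis=0)
_, _, vt = np.linalg.svd(known - mean)
direction = vt[0]  # learned one-dimensional "code" direction

def reconstruct(x):
    centered = x - mean
    return mean + np.dot(centered, direction) * direction

known_sample = np.array([2.5, 0.05])  # resembles the training data
novel_sample = np.array([0.0, 4.0])   # far off the learned manifold

low = novelty_score(known_sample, reconstruct)
high = novelty_score(novel_sample, reconstruct)
# A low score signals a known object; a high score signals novelty.
```

Inputs similar to the training distribution reconstruct well (low score); inputs from unseen classes do not (high score), which is exactly the decision rule described above.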
\subsection*{Research Question}

Auto-encoders work well for data sets like MNIST~\cite{Deng2012}
but perform poorly on challenging real-world data sets
like MS COCO~\cite{Lin2014}. Therefore, a comparison between
model uncertainty and novelty detection is considered out of
scope for this thesis.

Miller et al.~\cite{Miller2018} used an SSD pre-trained on COCO
without further fine-tuning on the SceneNet RGB-D data
set~\cite{McCormac2017} and reported good results regarding
open set error for an SSD variant with dropout sampling and entropy
thresholding.
If their results are generalizable, it should be possible to replicate
the relative difference between the variants on the COCO data set.
This leads to the following hypothesis: \emph{Dropout sampling
delivers better object detection performance under open set
conditions compared to object detection without it.}

For the purpose of this thesis, I will use the vanilla SSD as the
baseline to compare against. In particular, vanilla SSD uses
a per-class confidence threshold of 0.01, an IOU threshold of 0.45
for the non-maximum suppression, and a top k value of 200.
The effect of an entropy threshold is measured against this vanilla
SSD by applying entropy thresholds from 0.1 to 2.4 (limits taken from
Miller et al.). Dropout sampling is compared to vanilla SSD, both
with and without entropy thresholding. The number of forward
passes is varied to identify their impact.

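The thresholding step can be sketched as follows. The example detections, the box format, and the chosen entropy threshold of 1.0 are illustrative; only the confidence threshold of 0.01 and the entropy range 0.1 to 2.4 come from the setup above, and non-maximum suppression is omitted.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (natural log) of a probability vector."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def filter_detections(detections, conf_threshold=0.01, entropy_threshold=1.0):
    """Keep detections whose best class score passes the confidence
    threshold and whose class distribution has entropy at or below
    the entropy threshold. Each detection is (box, class_probs).
    Note: the real SSD applies the confidence threshold per class;
    checking only the maximum is a simplification."""
    kept = []
    for box, probs in detections:
        probs = np.asarray(probs, dtype=float)
        if probs.max() < conf_threshold:
            continue  # too weak to keep at all
        if entropy(probs) > entropy_threshold:
            continue  # too uncertain: likely an open set error
        kept.append((box, probs))
    return kept

detections = [
    ((10, 10, 50, 50), [0.90, 0.05, 0.05]),  # peaked: low entropy
    ((20, 20, 60, 60), [0.40, 0.35, 0.25]),  # near-uniform: high entropy
]
kept = filter_detections(detections, entropy_threshold=1.0)
```

Sweeping \texttt{entropy\_threshold} over the 0.1 to 2.4 range then traces out how aggressively uncertain detections are discarded.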
\paragraph{Hypothesis} Dropout sampling
delivers better object detection performance under open set
conditions compared to object detection without it.

\paragraph{Contribution}
The contribution of this thesis is a comparison between dropout
of both for object detection under open set conditions using
the SSD network for object detection and the SceneNet RGB-D data set
with MS COCO classes.

\subsection*{Reader's guide}

First, chapter \ref{chap:background} presents related works and
provides the background for dropout sampling, a.k.a.\ Bayesian SSD.
Afterwards, chapter \ref{chap:methods} explains how the Bayesian SSD
works and provides details about the software and source code design.
Chapter \ref{chap:experiments-results} presents the data sets,
the experimental setup, and the results. This is followed by
chapters \ref{chap:discussion} and \ref{chap:closing}, focusing on
the discussion and closing respectively.

Therefore, the contribution is found in chapters \ref{chap:methods},
\ref{chap:experiments-results}, and \ref{chap:discussion}.

\chapter{Background}
\label{chap:background}

This chapter will begin with an overview of previous works
in the field of this thesis. Afterwards, the theoretical foundations
of the work of Miller et al.~\cite{Miller2018} and auto-encoders will
the novelty test. Nonetheless, it could be the better method.

\chapter{Methods}
\label{chap:methods}

This chapter starts with the design of the source code; the
source code is much more than a means to an end. The thesis
uses two data sets: MS COCO and SceneNet RGB-D; a section
detection is out of the question under these circumstances.

\chapter{Experimental Setup and Results}
\label{chap:experiments-results}

\section{Data sets}

\section{Experimental Setup}
\chapter{Discussion}
\label{chap:discussion}

To recap, the hypothesis is repeated here.

\begin{description}
was used.

\chapter{Closing}
\label{chap:closing}