Finished introduction (raw version)

Signed-off-by: Jim Martens <github@2martens.de>
Jim Martens 2019-08-13 12:24:56 +02:00
parent c9415c34fe
commit 23fce70d84
1 changed file with 72 additions and 30 deletions

body.tex

@@ -37,13 +37,24 @@ regression and classification. Regression deals with any case
where the goal for the network is to come close to an ideal
function that connects all data points. Classification, however,
describes tasks where the network is supposed to identify the
class of any given input. In this thesis, I will work with both.
\subsection*{Object Detection in Open Set Conditions}
\begin{figure}
\centering
\includegraphics[scale=1.0]{open-set}
\caption{Open set problem: the test set contains classes that
were not present during training.
Icons in this image have been taken from the COCO data set
website (\url{https://cocodataset.org/\#explore}) and were
vectorized afterwards. The figure resembles figure 1 of Miller et al.~\cite{Miller2018}.}
\label{fig:open-set}
\end{figure}
More specifically, I will look at object detection under open set
conditions (see figure \ref{fig:open-set}).
In non-technical words, this effectively describes
the kind of situation you encounter with CCTV cameras or robots
outside of a laboratory. Both use cameras that record
images. Subsequently, a neural network analyses the image
@@ -64,7 +75,7 @@ of the network as false positive.
This goes back to the need for automatic explanation. Such a system
should by itself recognize that the given object is unknown and
hence mark any classification result of the network as meaningless.
Technically, there are two slightly different approaches that deal
with this type of task: model uncertainty and novelty detection.
Model uncertainty can be measured with dropout sampling.
@@ -80,11 +91,10 @@ low this signifies a low uncertainty. An unknown object is more
likely to cause high uncertainty, which allows for an identification
of false positive cases.
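To illustrate the idea, the following sketch shows how dropout
sampling could be realized. It is a minimal sketch, not the
implementation used in this thesis: the \texttt{model} callable is a
hypothetical interface that keeps its dropout layers active at
inference time and returns per-class softmax scores as a NumPy array.
\begin{verbatim}
import numpy as np

def dropout_sampling(model, image, num_passes=10):
    """Estimate model uncertainty from repeated stochastic passes."""
    # Each pass drops a different random subset of units,
    # so the softmax outputs differ from pass to pass.
    scores = np.stack([model(image) for _ in range(num_passes)])
    mean_scores = scores.mean(axis=0)
    # High variance across passes signals high model uncertainty;
    # objects unseen during training tend to produce exactly that.
    uncertainty = scores.var(axis=0).mean()
    return mean_scores, uncertainty
\end{verbatim}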
Novelty detection is another approach to solve the task.
In the realm of neural networks it is usually done with the help of
auto-encoders that solve a regression task of finding an
identity function that reconstructs the given input~\cite{Pimentel2014}. Auto-encoders have
internally at least two components: an encoder and a decoder
(also called a generator). The job of the encoder is to find an
encoding that compresses the input as well as possible while simultaneously
@@ -94,35 +104,44 @@ that reconstructs the input as accurately as possible. During
training, these auto-encoders learn to reproduce a certain group
of object classes. The actual novelty detection takes place
during testing: Given an image, and the output and loss of the
auto-encoder, a novelty score is calculated. For some novelty
detection approaches the reconstruction loss is exactly the
novelty score; others consider additional factors. A low novelty
score signals a known object. The opposite is true for a high
novelty score.
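To make this concrete, the following sketch implements a minimal
auto-encoder whose reconstruction loss serves directly as the novelty
score. The PyTorch interface and the layer sizes are illustrative
assumptions, not the model evaluated later in this thesis.
\begin{verbatim}
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """The encoder compresses the input, the decoder reconstructs it."""
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, latent_dim), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, input_dim), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

def novelty_score(model, x):
    # Known inputs reconstruct well (low loss), novel inputs do not.
    with torch.no_grad():
        reconstruction = model(x)
    return nn.functional.mse_loss(reconstruction, x).item()
\end{verbatim}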
\subsection*{Research Question}
Both presented approaches describe one way to solve the aforementioned
problem of explanation. They can be differentiated by measuring
their performance: the best theoretical idea is useless if it does
not perform well. Miller et al. have shown
some success in using dropout sampling. However, the many forward
passes during testing for every image seem computationally expensive.
In comparison, a single run through a trained auto-encoder seems
intuitively to be faster.
Auto-encoders work well for data sets like MNIST~\cite{Deng2012}
but perform poorly on challenging real-world data sets
like MS COCO~\cite{Lin2014}. Therefore, a comparison between
model uncertainty and novelty detection is considered out of
scope for this thesis.
Miller et al.~\cite{Miller2018} used an SSD~\cite{Liu2016} pre-trained on COCO
without further fine-tuning on the SceneNet RGB-D data
set~\cite{McCormac2017} and reported good results regarding
open set error for an SSD variant with dropout sampling and entropy
thresholding.
If their results are generalizable, it should be possible to replicate
the relative difference between the variants on the COCO data set.
This leads to the hypothesis stated below.
For the purpose of this thesis, I will use the vanilla SSD as
baseline to compare against. In particular, vanilla SSD uses
a per-class confidence threshold of 0.01, an IOU threshold of 0.45
for the non-maximum suppression, and a top~$k$ value of 200.
The effect of an entropy threshold is measured against this vanilla
SSD by applying entropy thresholds from 0.1 to 2.4 (limits taken from
Miller et al.). Dropout sampling is compared to vanilla SSD, both
with and without entropy thresholding. The number of forward
passes is varied to identify its impact.
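For illustration, entropy thresholding can be sketched as follows.
The detection data structure and the step size of the threshold range
are assumptions; only the limits of 0.1 and 2.4 are taken from the
setup described above.
\begin{verbatim}
import numpy as np

def entropy(softmax_scores):
    """Shannon entropy of a per-class softmax distribution."""
    p = np.clip(softmax_scores, 1e-12, 1.0)
    return -np.sum(p * np.log(p))

def filter_detections(detections, threshold):
    """Keep detections whose class distribution is decisive enough.

    `detections` is assumed to be a list of (box, scores) pairs.
    """
    return [(box, scores) for box, scores in detections
            if entropy(scores) <= threshold]

# Limits from the setup above; the step size is an assumption.
thresholds = np.arange(0.1, 2.5, 0.1)
\end{verbatim}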
\paragraph{Hypothesis} Dropout sampling
delivers better object detection performance under open set
conditions compared to object detection without it.
\paragraph{Contribution}
The contribution of this thesis is a comparison between dropout
@@ -131,8 +150,24 @@ of both for object detection under open set conditions using
the SSD network for object detection and the SceneNet RGB-D data set
with MS COCO classes.
\subsection*{Reader's Guide}
First, chapter \ref{chap:background} presents related works and
provides the background for dropout sampling, also known as Bayesian SSD.
Afterwards, chapter \ref{chap:methods} explains how the Bayesian SSD
works, and provides details about the software and source code design.
Chapter \ref{chap:experiments-results} presents the data sets,
the experimental setup, and the results. This is followed by
chapters \ref{chap:discussion} and \ref{chap:closing}, focusing on
the discussion and closing, respectively.
Therefore, the contribution is found in chapters \ref{chap:methods},
\ref{chap:experiments-results}, and \ref{chap:discussion}.
\chapter{Background}
\label{chap:background}
This chapter will begin with an overview of previous works
in the field of this thesis. Afterwards, the theoretical foundations
of the work of Miller et al.~\cite{Miller2018} and auto-encoders will
@@ -582,6 +617,8 @@ the novelty test. Nonetheless, it could be the better method.
\chapter{Methods}
\label{chap:methods}
This chapter starts with the design of the source code; the
source code is so much more than a means to an end. The thesis
uses two data sets: MS COCO and SceneNet RGB-D; a section
@@ -752,6 +789,8 @@ detection is out of the question under these circumstances.
\chapter{Experimental Setup and Results}
\label{chap:experiments-results}
\section{Data sets}
\section{Experimental Setup}
@@ -760,6 +799,8 @@ detection is out of the question under these circumstances.
\chapter{Discussion}
\label{chap:discussion}
To recap, the hypothesis is restated here.
\begin{description}
@@ -786,3 +827,4 @@ was used.
\chapter{Closing}
\label{chap:closing}